Setting Up a Hadoop 2.6.0 Cluster Environment
Guiding questions
1. What preparation is needed before installing Hadoop?
2. How do you verify that the Hadoop installation works?
3. How do you run the wordcount example?
I. Environment
1. Machines: one physical machine and one virtual machine.
2. Linux version: [spark@S1PA11 ~]$ cat /etc/issue
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
3. JDK: [spark@S1PA11 ~]$ java -version
java version "1.6.0_27"
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
4. Cluster nodes: two, S1PA11 (master) and S1PA222 (slave).
II. Preparation
1. Install the Java JDK.
2. Set up passwordless SSH between the nodes (a sketch follows this list).
3. Download the Hadoop release.
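A minimal sketch of step 2's passwordless SSH setup, assuming the spark account on both hosts and the default key location (generate a key on the master, copy the public key out, then test the login):

[spark@S1PA11 ~]$ ssh-keygen -t rsa -P ''          # generate an RSA key pair with an empty passphrase
[spark@S1PA11 ~]$ ssh-copy-id spark@S1PA222        # append the public key to the slave's ~/.ssh/authorized_keys
[spark@S1PA11 ~]$ ssh-copy-id spark@S1PA11         # the master also needs to SSH to itself for the start scripts
[spark@S1PA11 ~]$ ssh S1PA222                      # should now log in without a password prompt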
III. Installing Hadoop
Start from the downloaded hadoop-2.6.0.tar.gz archive.
1. Extract it: tar -xzvf hadoop-2.6.0.tar.gz
2. Move it to the target directory: [spark@S1PA11 software]$ mv hadoop-2.6.0 ~/opt/
3. Change into the Hadoop directory: [spark@S1PA11 opt]$ cd hadoop-2.6.0/
[spark@S1PA11 hadoop-2.6.0]$ ls
bin dfs etc include input lib libexec LICENSE.txt logs NOTICE.txt README.txt sbin share tmp
Before configuring anything, create the following directories on the local file system: ~/hadoop/tmp, ~/dfs/data and ~/dfs/name. Seven configuration files are involved, all under etc/hadoop inside the Hadoop installation; they can be edited with gedit or any other text editor.
4. Go into the Hadoop configuration directory and edit the following files (a combined sketch of all seven follows this list):
4.1 hadoop-env.sh --> set JAVA_HOME
4.2 yarn-env.sh --> set JAVA_HOME
4.3 slaves --> add the slave node(s)
4.4 core-site.xml --> core Hadoop settings (HDFS on port 9000, hadoop.tmp.dir at file:/home/spark/opt/hadoop-2.6.0/tmp)
4.5 hdfs-site.xml --> HDFS settings (NameNode/DataNode addresses and directory locations)
4.6 mapred-site.xml --> MapReduce settings (use the YARN framework; JobHistory server address and web address)
4.7 yarn-site.xml --> YARN settings
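A combined sketch of the seven files from steps 4.1-4.7, assuming the installation lives in /home/spark/opt/hadoop-2.6.0 and the hostnames used in this article. The JDK path, the secondary NameNode port (9001) and the replication factor are placeholders to adapt to your own environment; the JobHistory addresses shown are the Hadoop 2.x defaults.

hadoop-env.sh and yarn-env.sh (4.1, 4.2):
export JAVA_HOME=/usr/java/jdk1.6.0_27    # replace with your actual JDK install path

slaves (4.3), one slave hostname per line:
S1PA222

core-site.xml (4.4):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://S1PA11:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/spark/opt/hadoop-2.6.0/tmp</value>
  </property>
</configuration>

hdfs-site.xml (4.5):
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>S1PA11:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/spark/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/spark/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

mapred-site.xml (4.6, created by copying mapred-site.xml.template):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>S1PA11:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>S1PA11:19888</value>
  </property>
</configuration>

yarn-site.xml (4.7):
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>S1PA11:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>S1PA11:8088</value>
  </property>
</configuration>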
5. Copy the configured Hadoop directory to the slave machine (see the scp sketch below).
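One way to do step 5, assuming the same ~/opt layout and the spark account on the slave:

[spark@S1PA11 ~]$ scp -r ~/opt/hadoop-2.6.0 spark@S1PA222:~/opt/    # copy the whole configured installation to the slave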
IV. Verification
1. Format the NameNode:
2. Start HDFS:
3. Stop HDFS:
4. Start YARN:
5. Stop YARN:
6. Check the cluster status (commands for steps 1-6 are sketched below, followed by the report this step prints):
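The commands behind steps 1-6, as a sketch using the standard Hadoop 2.x scripts (run from the hadoop-2.6.0 directory on the master); the cluster report that follows is what step 6 prints:

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hdfs namenode -format     # 1. format the NameNode (first start only)
[spark@S1PA11 hadoop-2.6.0]$ ./sbin/start-dfs.sh             # 2. start HDFS (NameNode, DataNodes, SecondaryNameNode)
[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-dfs.sh              # 3. stop HDFS
[spark@S1PA11 hadoop-2.6.0]$ ./sbin/start-yarn.sh            # 4. start YARN (ResourceManager, NodeManagers)
[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-yarn.sh             # 5. stop YARN
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hdfs dfsadmin -report     # 6. print the cluster status report shown below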
-------------------------------------------------
Live datanodes (1):
Name: 10.126.45.56:50010 (S1PA222)
Hostname: S1PA209
Decommission Status : Normal
Configured Capacity: 52101857280 (48.52 GB)
DFS Used: 823296 (804 KB)
Non DFS Used: 6352347136 (5.92 GB)
DFS Remaining: 45748686848 (42.61 GB)
DFS Used%: 0.00%
DFS Remaining%: 87.81%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Jan 05 16:44:50 CST 2015
7. View HDFS in the browser: http://10.58.44.47:50070/
8. View the ResourceManager: http://10.58.44.47:8088/
9. Run the wordcount example
9.1 Create a local input directory: [spark@S1PA11 hadoop-2.6.0]$ mkdir input
9.2 Create f1 and f2 under input and write some content into them (one possible way is sketched below):
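For example (any text with a few repeated words will do):

[spark@S1PA11 hadoop-2.6.0]$ echo "Hello world bye jj" > input/f1
[spark@S1PA11 hadoop-2.6.0]$ echo "Hello Hadoop bye Hadoop" > input/f2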
[spark@S1PA11 hadoop-2.6.0]$ cat input/f1
Hello world bye jj
[spark@S1PA11 hadoop-2.6.0]$ cat input/f2
Hello Hadoop bye Hadoop
9.3 Create the /tmp/input directory in HDFS
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -mkdir /tmp
15/01/05 16:53:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -mkdir /tmp/input
15/01/05 16:54:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
9.4 Copy f1 and f2 into the HDFS /tmp/input directory
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -put input/ /tmp
15/01/05 16:56:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
9.5 Check that f1 and f2 are present in HDFS
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -ls /tmp/input/
15/01/05 16:57:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 3 spark supergroup 20 2015-01-04 19:09 /tmp/input/f1
-rw-r--r-- 3 spark supergroup 25 2015-01-04 19:09 /tmp/input/f2
9.6 Run the wordcount program
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /tmp/input /output
15/01/05 17:00:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/05 17:00:09 INFO client.RMProxy: Connecting to ResourceManager at S1PA11/10.58.44.47:8032
15/01/05 17:00:11 INFO input.FileInputFormat: Total input paths to process : 2
15/01/05 17:00:11 INFO mapreduce.JobSubmitter: number of splits:2
15/01/05 17:00:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1420447392452_0001
15/01/05 17:00:12 INFO impl.YarnClientImpl: Submitted application application_1420447392452_0001
15/01/05 17:00:12 INFO mapreduce.Job: The url to track the job: http://S1PA11:8088/proxy/application_1420447392452_0001/
15/01/05 17:00:12 INFO mapreduce.Job: Running job: job_1420447392452_0001
9.7 View the result
[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -cat /output/part-r-00000
15/01/05 17:06:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
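With the two input files shown above, the single reducer writes its result to /output/part-r-00000, and the counts would be expected to come out as:

Hadoop	2
Hello	2
bye	2
jj	1
world	1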