小丸子 Learns Hadoop: Deploying a Hadoop Cluster

Posted by wxjzqym on 2015-12-22
0. Cluster plan
Hostname   IP address     Installed software   Running processes
hadoop1    10.1.245.72    hadoop, zookeeper    namenode, zkfc, resourcemanager, journalnode
hadoop2    10.1.245.73    hadoop, zookeeper    namenode, zkfc, resourcemanager, datanode, nodemanager, journalnode
hadoop3    10.1.245.74    hadoop, zookeeper    datanode, nodemanager, journalnode


1. Install the JDK
mkdir -p /opt/freeware/
cd /opt/freeware/
tar xvf jdk1.7.0_79.tgz 
vi /etc/profile
export JAVA_HOME=/opt/freeware/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH

source /etc/profile
java -version
java version "1.7.0_79"


2. Create the user and configure the hosts file
groupadd -g 511 hadoop
useradd -u 511 -g hadoop hdpusr01
echo oracle|passwd --stdin hdpusr01

vi /etc/hosts        
10.1.245.72              hadoop1                                                         
10.1.245.73              hadoop2                                                                   
10.1.245.74              hadoop3                


3. Set up passwordless SSH for the hadoop user
3.1 Generate the RSA key pair
su - hdpusr01
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
chmod 600 .ssh/authorized_keys

3.2 Build the authorized_keys file — run on hadoop1
[hdpusr01@hadoop1 ~]$ ssh hadoop2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 
[hdpusr01@hadoop1 ~]$ ssh hadoop3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 

[hdpusr01@hadoop1 ~]$ scp ~/.ssh/authorized_keys hadoop2:/home/hdpusr01/.ssh/authorized_keys
[hdpusr01@hadoop1 ~]$ scp ~/.ssh/authorized_keys hadoop3:/home/hdpusr01/.ssh/authorized_keys

3.3 Verify the SSH trust
ssh hadoop1 date
ssh hadoop2 date
ssh hadoop3 date
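The three checks above can be rolled into one loop, run from each of the three nodes in turn; a minimal sketch (BatchMode makes ssh fail immediately instead of prompting for a password, so a broken trust shows up clearly):

```shell
# Run as hdpusr01 on each of hadoop1..hadoop3: every hop must print a date
# without a password prompt, otherwise the trust setup is incomplete.
for host in hadoop1 hadoop2 hadoop3; do
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date \
    || echo "passwordless ssh to $host FAILED"
done
```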


4. Install the Hadoop cluster
4.1 Install the ZooKeeper cluster
For detailed steps, see the companion post 《小丸子學ZooKeeper系列之——部署ZooKeeper叢集》 (deploying a ZooKeeper cluster):
http://blog.itpub.net/20801486/viewspace-1866925/

4.2 Install the Hadoop cluster
4.2.1 Extract the software — on hadoop1
[hdpusr01@hadoop1 ~]$ tar xvf /opt/freeware/hadoop-2.6.0.tgz
[hdpusr01@hadoop1 ~]$ chown -R hdpusr01:hadoop hadoop-2.6.0/
[hdpusr01@hadoop1 ~]$ mv hadoop-2.6.0/ hadoop

4.2.2 Configure Hadoop — on hadoop1
4.2.2.1 Edit core-site.xml
[hdpusr01@hadoop1 ~]$ vi hadoop/etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hdfscls1</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hdpusr01/hadoop/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:29181,hadoop2:29181,hadoop3:29181</value>
    </property>
</configuration>

4.2.2.2 Edit hdfs-site.xml
[hdpusr01@hadoop1 ~]$ vi hadoop/etc/hadoop/hdfs-site.xml 
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>hdfscls1</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.hdfscls1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hdfscls1.nn1</name>
        <value>hadoop1:8920</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hdfscls1.nn1</name>
        <value>hadoop1:59070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hdfscls1.nn2</name>
        <value>hadoop2:8920</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hdfscls1.nn2</name>
        <value>hadoop2:59070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/hdfscls1</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hdpusr01/hadoop/dfs/jn</value>
    </property>
    <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
    </property>
    <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8480</value>
    </property>
    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:59010</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:59075</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:59020</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.hdfscls1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hdpusr01/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hdpusr01/hadoop/dfs/dn/data01</value>
    </property>
</configuration>

4.2.2.3 Edit mapred-site.xml
[hdpusr01@hadoop1 ~]$ vi hadoop/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

4.2.2.4 Edit yarn-site.xml
[hdpusr01@hadoop1 ~]$ vi hadoop/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>rmcls1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop1:29181,hadoop2:29181,hadoop3:29181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

4.2.2.5 Edit the slaves file
[hdpusr01@hadoop1 ~]$ vi hadoop/etc/hadoop/slaves 
hadoop2
hadoop3

4.2.2.6 Create the Hadoop working directories
[hdpusr01@hadoop1 ~]$ mkdir -p hadoop/{tmp,dfs/jn,dfs/dn/data01}

4.2.3 Copy the hadoop directory to the other nodes and extract it
[hdpusr01@hadoop1 ~]$ tar cvf hdp.tar hadoop/
[hdpusr01@hadoop1 ~]$ scp hdp.tar hadoop2:/home/hdpusr01
[hdpusr01@hadoop1 ~]$ scp hdp.tar hadoop3:/home/hdpusr01

[hdpusr01@hadoop2 ~]$ tar xvf hdp.tar 
[hdpusr01@hadoop3 ~]$ tar xvf hdp.tar 

4.2.4 Check the ZooKeeper cluster status
[hdpusr01@hadoop1 ~]$ zookeeper/bin/zkServer.sh status
Mode: follower

[hdpusr01@hadoop2 ~]$ zookeeper/bin/zkServer.sh status
Mode: leader

[hdpusr01@hadoop3 ~]$ zookeeper/bin/zkServer.sh status
Mode: follower

4.2.5 Start the JournalNodes — on hadoop1 through hadoop3
[hdpusr01@hadoop1 ~]$ hadoop/sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hdpusr01/hadoop/logs/hadoop-hdpusr01-journalnode-hadoop1.out
[hdpusr01@hadoop1 ~]$ jps
1766 JournalNode
Note: if startup succeeded, the jps command shows a JournalNode process.

4.2.6 Format HDFS — on hadoop1
[hdpusr01@hadoop1 ~]$ hadoop/bin/hdfs namenode -format
15/12/21 17:55:54 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1012038946-10.1.245.72-1450691754279
15/12/21 17:55:54 INFO common.Storage: Storage directory /home/hdpusr01/hadoop/tmp/dfs/name has been successfully formatted.
15/12/21 17:55:54 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
Note: the log output shows that the HDFS format succeeded; the generated files are under /home/hdpusr01/hadoop/tmp/dfs/name.

4.2.7 Copy the NameNode tmp directory to hadoop2 — on hadoop1
[hdpusr01@hadoop1 ~]$ scp -r hadoop/tmp/* hadoop2:/home/hdpusr01/hadoop/tmp

4.2.8 Format ZK — on hadoop1
[hdpusr01@hadoop1 ~]$ hadoop/bin/hdfs zkfc -formatZK
15/12/21 18:00:56 INFO zookeeper.ClientCnxn: Session establishment complete on server hadoop1/10.1.245.72:29181, sessionid = 0x151c3bc8d780000, negotiated timeout = 80000
15/12/21 18:00:56 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/hdfscls1 in ZK.
Note: the log output shows that Hadoop connected to ZooKeeper and created the znode /hadoop-ha/hdfscls1 in ZK. Log in to ZK to verify:

[hdpusr01@hadoop1 ~]$ zookeeper/bin/zkCli.sh -server hadoop1:29181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: hadoop1:29181(CONNECTED) 2] get /hadoop-ha/hdfscls1
cZxid = 0x100000003
ctime = Mon Dec 21 18:00:56 CST 2015
mZxid = 0x100000003
mtime = Mon Dec 21 18:00:56 CST 2015
pZxid = 0x100000003
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0

4.2.9 Start HDFS — on hadoop1
[hdpusr01@hadoop1 ~]$ hadoop/sbin/start-dfs.sh
2015-12-21 18:26:03,341 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-12-21 18:26:03,997 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
Note: HDFS startup logged a warning that the native-hadoop library could not be loaded; the warning remained even after setting the relevant environment variables and restarting HDFS.
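The warning is harmless: Hadoop simply falls back to its built-in Java implementations. The commonly suggested workaround is to point the JVM at the native directory bundled with the distribution; a sketch for ~/.bash_profile or hadoop-env.sh, assuming the /home/hdpusr01/hadoop layout used in this post (on some platforms the bundled 32-bit library must be replaced with one compiled for the local architecture, in which case no setting helps):

```shell
# Assumption: Hadoop was unpacked to /home/hdpusr01/hadoop as in this post.
export HADOOP_HOME=/home/hdpusr01/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
```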
[hdpusr01@hadoop1 ~]$ jps
16966 NameNode
17161 JournalNode
17333 DFSZKFailoverController

[hdpusr01@hadoop2 ~]$ jps
8907 DFSZKFailoverController
8784 JournalNode
8697 DataNode
8626 NameNode

[hdpusr01@hadoop3 ~]$ jps
3339 DataNode
3426 JournalNode
Note: the jps output above shows that the NameNode processes on hadoop1-2, the DataNode processes on hadoop2-3, and the JournalNode processes on hadoop1-3 are all running.

4.2.10 Start YARN — on hadoop1
[hdpusr01@hadoop1 ~]$ hadoop/sbin/start-yarn.sh 
[hdpusr01@hadoop1 ~]$ jps
21031 ResourceManager

[hdpusr01@hadoop2 ~]$ jps
10634 NodeManager

[hdpusr01@hadoop3 ~]$ jps
3760 NodeManager
Note: the jps output above shows that the ResourceManager process on hadoop1 and the NodeManager processes on hadoop2-3 are running. The standby ResourceManager on hadoop2 must be started manually.

4.2.11 Start the ResourceManager — on hadoop2
[hdpusr01@hadoop2 ~]$ hadoop/sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hdpusr01/hadoop/logs/yarn-hdpusr01-resourcemanager-hadoop2.out
[hdpusr01@hadoop2 ~]$ jps
10860 ResourceManager
Note: at this point the Hadoop cluster deployment is complete.
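As a quick sanity check, the HA state of both NameNodes and both ResourceManagers can be queried with the built-in admin tools. nn1/nn2 and rm1/rm2 are the IDs configured in hdfs-site.xml and yarn-site.xml above; one of each pair should report "active" and the other "standby":

```shell
# One NameNode should be active and the other standby.
hadoop/bin/hdfs haadmin -getServiceState nn1
hadoop/bin/hdfs haadmin -getServiceState nn2
# Same for the two ResourceManagers.
hadoop/bin/yarn rmadmin -getServiceState rm1
hadoop/bin/yarn rmadmin -getServiceState rm2
```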

4.3 Check the cluster status — via the web UI
4.3.1 View the NameNode on hadoop1
(screenshot omitted)
4.3.2 View the NameNode and DataNode on hadoop2
(screenshots omitted)
Note: the Hadoop web pages show the NameNode and DataNode process information and everything looks normal. The Hadoop cluster setup is complete.
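A minimal end-to-end smoke test is to submit the example job bundled with the distribution; the jar path below assumes the hadoop-2.6.0 layout used in this post:

```shell
# Submit the sample pi estimator: 2 map tasks, 10 samples each.
# A successful run exercises HDFS, YARN and the MapReduce framework end to end.
hadoop/bin/hadoop jar \
  hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 10
```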

From the "ITPUB Blog", link: http://blog.itpub.net/20801486/viewspace-1877048/. Please credit the source when reprinting.
