Hadoop 2.7 in Action v1.0: Changing the File Replication Factor (dfs.replication) After Adding a DataNode

Posted by hackeruncle on 2016-03-06

1. Check the current replication setting: dfs.replication is 3, meaning each file is stored as three replicas
a. By inspecting hdfs-site.xml


[root@sht-sgmhadoopnn-01 ~]# cd /hadoop/hadoop-2.7.2/etc/hadoop
[root@sht-sgmhadoopnn-01 hadoop]# more hdfs-site.xml
<property>
        <name>dfs.replication</name>
        <value>3</value>
</property>
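
Alternatively, instead of paging through the XML, the value the local client will actually use can be printed with hdfs getconf (a quick check; on this setup it should print 3):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs getconf -confKey dfs.replication
3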
b. By checking the replication value of files already in HDFS


[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -ls /testdir
Found 7 items
-rw-r--r-- 3 root supergroup 37322672 2016-03-05 17:59 /testdir/012_HDFS.avi
-rw-r--r-- 3 root supergroup 224001146 2016-03-05 18:01 /testdir/016_Hadoop.avi
-rw-r--r-- 3 root supergroup 176633760 2016-03-05 19:11 /testdir/022.avi
-rw-r--r-- 3 root supergroup 30 2016-02-28 22:42 /testdir/1.log
-rw-r--r-- 3 root supergroup 196 2016-02-28 22:23 /testdir/full_backup.log
-rw-r--r-- 3 root supergroup 142039186 2016-03-05 17:55 /testdir/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
-rw-r--r-- 3 root supergroup 44 2016-02-28 19:40 /testdir/test.log
[root@sht-sgmhadoopnn-01 hadoop]#
### The 3 right after the -rw-r--r-- permission bits is the number of replicas the file has in HDFS
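
To query the replication factor of a single file directly, hdfs dfs -stat with the %r format specifier also works (a quick sketch; any of the files above would do):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -stat %r /testdir/1.log
3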
c. hadoop fsck / also shows, conveniently, that Average block replication is still 3; this value can be changed dynamically by hand.
Default replication factor, by contrast, only changes after editing hdfs-site.xml (e.g. setting it to 4) and restarting the whole Hadoop cluster, which is rarely practical on a production cluster.
But what actually affects the system is the Average block replication value, so it is not strictly necessary to change the default value Default replication factor.


[root@sht-sgmhadoopnn-01 hadoop]# hdfs fsck /
16/03/06 17:15:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://sht-sgmhadoopnn-01:50070/fsck?ugi=root&path=%2F
FSCK started by root (auth:SIMPLE) from /172.16.101.55 for path / at Sun Mar 06 17:15:29 CST 2016
............Status: HEALTHY
 Total size: 580151839 B
 Total dirs: 15
 Total files: 12
 Total symlinks: 0
 Total blocks (validated): 11 (avg. block size 52741076 B)
 Minimally replicated blocks: 11 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 3.0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 4
 Number of racks: 1
FSCK ended at Sun Mar 06 17:15:29 CST 2016 in 9 milliseconds
The filesystem under path '/' is HEALTHY
You have mail in /var/spool/mail/root
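
fsck can also be pointed at a single path with a few extra flags to show each block and which DataNodes hold its replicas, handy for spot-checking after a replication change (standard fsck options; output is machine-specific and omitted here):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs fsck /testdir/1.log -files -blocks -locations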
2. Change the replication factor of existing HDFS files


[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -help
-setrep [-R] [-w] <rep> <path> ... :
  Set the replication level of a file. If <path> is a directory then the command
  recursively changes the replication factor of all files under the directory tree
  rooted at <path>.

  -w  It requests that the command waits for the replication to complete. This
      can potentially take a very long time.
  -R  It is accepted for backwards compatibility. It has no effect.


[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -setrep -w 4 -R /
setrep: `-R': No such file or directory
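### Note: a "-R" placed after the replication number is parsed as a path, hence the
### "No such file or directory" error above; since -R is a no-op anyway,
### "hdfs dfs -setrep -w 4 /" is all that is needed.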
Replication 4 set: /out1/_SUCCESS
Replication 4 set: /out1/part-r-00000
Replication 4 set: /testdir/012_HDFS.avi
Replication 4 set: /testdir/016_Hadoop.avi
Replication 4 set: /testdir/022.avi
Replication 4 set: /testdir/1.log
Replication 4 set: /testdir/full_backup.log
Replication 4 set: /testdir/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
Replication 4 set: /testdir/test.log
Replication 4 set: /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002-1456659654297-root-word+count-1456659679606-1-1-SUCCEEDED-root.root-1456659662730.jhist
Replication 4 set: /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002.summary
Replication 4 set: /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002_conf.xml
Waiting for /out1/_SUCCESS ... done
Waiting for /out1/part-r-00000 .... done
Waiting for /testdir/012_HDFS.avi ... done
Waiting for /testdir/016_Hadoop.avi ... done
Waiting for /testdir/022.avi ... done
Waiting for /testdir/1.log ... done
Waiting for /testdir/full_backup.log ... done
Waiting for /testdir/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm ... done
Waiting for /testdir/test.log ... done
Waiting for /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002-1456659654297-root-word+count-1456659679606-1-1-SUCCEEDED-root.root-1456659662730.jhist ... done
Waiting for /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002.summary ... done
Waiting for /tmp/hadoop-yarn/staging/history/done_intermediate/root/job_1456590271264_0002_conf.xml ... done
[root@sht-sgmhadoopnn-01 hadoop]#

### Check the replication factor again: Average block replication is now 4
[root@sht-sgmhadoopnn-01 hadoop]# hdfs fsck /
16/03/06 17:25:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://sht-sgmhadoopnn-01:50070/fsck?ugi=root&path=%2F
FSCK started by root (auth:SIMPLE) from /172.16.101.55 for path / at Sun Mar 06 17:25:51 CST 2016
............Status: HEALTHY
 Total size: 580151839 B
 Total dirs: 15
 Total files: 12
 Total symlinks: 0
 Total blocks (validated): 11 (avg. block size 52741076 B)
 Minimally replicated blocks: 11 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 4.0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 4
 Number of racks: 1
FSCK ended at Sun Mar 06 17:25:51 CST 2016 in 6 milliseconds
The filesystem under path '/' is HEALTHY
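
setrep is not limited to whole directory trees; a single file can be changed as well, which is useful for dialing one file back down without touching the rest (an illustrative sketch with a hypothetical path /testdir/demo.log, not part of the listings in this article):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -setrep -w 2 /testdir/demo.log
Replication 2 set: /testdir/demo.log
Waiting for /testdir/demo.log ... done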
3. Test


[root@sht-sgmhadoopnn-01 hadoop]# vi /tmp/wjp.log
hello,i am
hadoop
hdfs
mapreduce
yarn
hive
zookeeper

[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -put /tmp/wjp.log /testdir

[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -ls /testdir
Found 8 items
-rw-r--r-- 4 root supergroup 37322672 2016-03-05 17:59 /testdir/012_HDFS.avi
-rw-r--r-- 4 root supergroup 224001146 2016-03-05 18:01 /testdir/016_Hadoop.avi
-rw-r--r-- 4 root supergroup 176633760 2016-03-05 19:11 /testdir/022.avi
-rw-r--r-- 4 root supergroup 30 2016-02-28 22:42 /testdir/1.log
-rw-r--r-- 4 root supergroup 196 2016-02-28 22:23 /testdir/full_backup.log
-rw-r--r-- 4 root supergroup 142039186 2016-03-05 17:55 /testdir/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
-rw-r--r-- 4 root supergroup 44 2016-02-28 19:40 /testdir/test.log
-rw-r--r-- 3 root supergroup 62 2016-03-06 17:30 /testdir/wjp.log
[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -rm /testdir/wjp.log
16/03/06 17:31:47 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://mycluster/testdir/wjp.log' to trash at: hdfs://mycluster/user/root/.Trash/Current
[root@sht-sgmhadoopnn-01 hadoop]#
### The freshly put test file wjp.log still has a replication factor of 3, so delete the test file first, then go change the dfs.replication parameter in hdfs-site.xml on the NameNode
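
As an aside, editing the config file is not the only option: because dfs.replication is honoured per write, the factor can be passed on the command line for a single put via the generic -D option (a sketch, reusing the same test file):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -D dfs.replication=4 -put /tmp/wjp.log /testdir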
4. Edit the dfs.replication parameter in hdfs-site.xml on the NameNode


[root@sht-sgmhadoopnn-01 hadoop]# vi hdfs-site.xml
<property>
        <name>dfs.replication</name>
        <value>4</value>
</property>
[root@sht-sgmhadoopnn-01 hadoop]# scp hdfs-site.xml root@sht-sgmhadoopnn-02:/hadoop/hadoop-2.7.2/etc/hadoop
### If the cluster has NameNode HA configured, the file must be kept in sync on the other, standby NameNode as well; there is no need to push it to the DataNodes
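
To confirm both NameNodes now carry the same value, a quick loop over the two hosts does the job (a minimal sketch, assuming passwordless ssh between the nodes):

[root@sht-sgmhadoopnn-01 hadoop]# for nn in sht-sgmhadoopnn-01 sht-sgmhadoopnn-02; do
>   echo "== ${nn} =="
>   ssh root@${nn} "grep -A1 dfs.replication /hadoop/hadoop-2.7.2/etc/hadoop/hdfs-site.xml"
> done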
5. Test again
## First, try without restarting anything


[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -put /tmp/wjp.log /testdir
16/03/06 17:36:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
You have mail in /var/spool/mail/root
[root@sht-sgmhadoopnn-01 hadoop]# hdfs dfs -ls /testdir
16/03/06 17:36:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 8 items
-rw-r--r-- 4 root supergroup 37322672 2016-03-05 17:59 /testdir/012_HDFS.avi
-rw-r--r-- 4 root supergroup 224001146 2016-03-05 18:01 /testdir/016_Hadoop.avi
-rw-r--r-- 4 root supergroup 176633760 2016-03-05 19:11 /testdir/022.avi
-rw-r--r-- 4 root supergroup 30 2016-02-28 22:42 /testdir/1.log
-rw-r--r-- 4 root supergroup 196 2016-02-28 22:23 /testdir/full_backup.log
-rw-r--r-- 4 root supergroup 142039186 2016-03-05 17:55 /testdir/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
-rw-r--r-- 4 root supergroup 44 2016-02-28 19:40 /testdir/test.log
-rw-r--r-- 4 root supergroup 62 2016-03-06 17:36 /testdir/wjp.log

[root@sht-sgmhadoopnn-01 hadoop]# hdfs fsck /
16/03/06 21:49:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://sht-sgmhadoopnn-01:50070/fsck?ugi=root&path=%2F
FSCK started by root (auth:SIMPLE) from /172.16.101.55 for path / at Sun Mar 06 21:49:12 CST 2016
...............Status: HEALTHY
 Total size: 580152025 B
 Total dirs: 17
 Total files: 15
 Total symlinks: 0
 Total blocks (validated): 14 (avg. block size 41439430 B)
 Minimally replicated blocks: 14 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 4.0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 4
 Number of racks: 1
FSCK ended at Sun Mar 06 21:49:12 CST 2016 in 8 milliseconds
##【What this shows】: there is no need to restart the cluster or the NameNode. dfs.replication is a client-side property that the HDFS client reads from hdfs-site.xml at write time,
which is why the new copy of wjp.log was created with 4 replicas once the file on the client node had been updated; in step 3, before the edit, the same put still produced 3 replicas even though setrep had already been run.
The Default replication factor reported by fsck stays at 3 until the NameNode is restarted, which confirms the statement above: what actually affects the system is the Average block replication value, so it is not strictly necessary to change the default Default replication factor.
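
A quick way to see which value the client will now use, with no restart involved (getconf reads the local client configuration, so after the edit it should print 4):

[root@sht-sgmhadoopnn-01 hadoop]# hdfs getconf -confKey dfs.replication
4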


Summary of commands:
hdfs fsck /
hdfs dfs -setrep -w 4 /    ### -R is a no-op and can be dropped; see the note in step 2

