Cluster Benchmarks After Installing Hadoop
[root@slave1 hadoop-0.20.2]# hadoop fs -mkdir /user/username
[root@slave1 hadoop-0.20.2]# hadoop fs -chown username /user/username
This is a good point at which to set a space quota on the directory. For example, the following gives the user directory a 1 TB limit:
[root@slave1 hadoop-0.20.2]# hadoop dfsadmin -setSpaceQuota 1t /user/username
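The quota can be verified afterwards with `hadoop fs -count -q`, which reports the quota and remaining space for the directory. As a quick sanity check on units, the `1t` argument corresponds to 2^40 bytes. A minimal sketch (the `hadoop` command is shown as a comment, since it needs a running cluster):

```shell
# Check the quota that was just set (requires a running cluster):
#   hadoop fs -count -q /user/username
# The output includes the space quota and the remaining space.
#
# The "1t" passed to -setSpaceQuota is 2^40 bytes:
echo $((1024 * 1024 * 1024 * 1024))
```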
1. Benchmarking HDFS with TestDFSIO
Hadoop ships with several benchmark programs, packaged in the test JAR file. Among them, TestDFSIO tests the I/O performance of HDFS. Since most hardware failures on a new system are hard-disk failures, running an I/O-intensive benchmark is a good way to burn in the cluster. TestDFSIO does this with a MapReduce job, which is a convenient way to read and write files in parallel: each file is read or written in a separate map task, and the output of each map is used to gather statistics on the file it just processed. The statistics are accumulated in the reduce phase to produce a summary. The following command writes 10 files of 1,000 MB each:
[root@slave1 hadoop-0.20.2]# hadoop jar hadoop-0.20.2-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
The TestDFSIO results are written to the console and also appended to a local file:
[hadoop@hadoop-namenode hadoop]$ cat TestDFSIO_results.log
----- TestDFSIO ----- : write
Date & time: Tue Jan 18 19:04:37 CST 2011
Number of files: 10
Total MBytes processed: 10000
Throughput mb/sec: 45.45867806164197
Average IO rate mb/sec: 46.181983947753906
IO rate std deviation: 5.620244800553667
Test exec time sec: 94.833
When benchmarking is finished, the -clean argument deletes all the generated files from HDFS:
[root@slave1 hadoop-0.20.2]# hadoop jar hadoop-0.20.2-test.jar TestDFSIO -clean
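For completeness, TestDFSIO also has a read mode (swap `-write` for `-read`; it must follow a write run so the files exist), and the figures in TestDFSIO_results.log are easy to extract with standard tools. A hypothetical sketch, with the log contents copied from the run shown above:

```shell
# Companion read benchmark (needs the cluster and the files from -write):
#   hadoop jar hadoop-0.20.2-test.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
#
# Pulling the throughput figure out of the results log:
cat > /tmp/TestDFSIO_results.log <<'EOF'
----- TestDFSIO ----- : write
Date & time: Tue Jan 18 19:04:37 CST 2011
Number of files: 10
Total MBytes processed: 10000
Throughput mb/sec: 45.45867806164197
Average IO rate mb/sec: 46.181983947753906
IO rate std deviation: 5.620244800553667
Test exec time sec: 94.833
EOF
awk -F': ' '/Throughput/ {print $2}' /tmp/TestDFSIO_results.log
```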
2. Benchmarking MapReduce with Sort
Hadoop comes with a MapReduce program that performs a partial sort of its input. This is useful for exercising the whole MapReduce system, because the full input dataset is transferred through the shuffle to the reducers. There are three steps: generate some random data, run the sort, then validate the results.
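On the cluster, the three steps correspond to three jobs: `randomwriter` and `sort` from the examples JAR, and `testmapredsort` from the test JAR, which checks that the sorted output is correctly ordered. The same generate/sort/validate pattern can be sketched locally, without a cluster, using standard tools:

```shell
# The three cluster jobs (need a running cluster):
#   hadoop jar hadoop-0.20.2-examples.jar randomwriter random-data
#   hadoop jar hadoop-0.20.2-examples.jar sort random-data sorted-data
#   hadoop jar hadoop-0.20.2-test.jar testmapredsort \
#       -sortInput random-data -sortOutput sorted-data
#
# Local analogy of generate / sort / validate:
seq 1 100 | shuf > /tmp/random-data.txt             # 1. generate random data
sort -n /tmp/random-data.txt > /tmp/sorted-data.txt # 2. sort it
sort -n -c /tmp/sorted-data.txt && echo "SUCCESS"   # 3. validate the order
```

testmapredsort likewise reports success or failure of the cluster sort.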
First, we generate some random data with RandomWriter. It runs a MapReduce job with 10 maps per node by default, each map writing a fixed amount of random binary data with keys and values of various lengths. (The original post states roughly 10 GB per map, but in the run below the 20 maps wrote about 21.5 GB in total, i.e. roughly 1 GB per map, which matches RandomWriter's default bytes-per-map setting.)
[hadoop@hadoop-namenode hadoop]$ bin/hadoop jar hadoop-0.20.2-examples.jar randomwriter random-data
Running 20 maps.
Job started: Tue Jan 18 19:05:21 CST 2011
11/01/18 19:05:22 INFO mapred.JobClient: Running job: job_201101181725_0009
11/01/18 19:05:23 INFO mapred.JobClient: map 0% reduce 0%
11/01/18 19:06:17 INFO mapred.JobClient: map 5% reduce 0%
11/01/18 19:06:21 INFO mapred.JobClient: map 10% reduce 0%
11/01/18 19:06:23 INFO mapred.JobClient: map 15% reduce 0%
11/01/18 19:06:24 INFO mapred.JobClient: map 20% reduce 0%
11/01/18 19:07:06 INFO mapred.JobClient: map 25% reduce 0%
11/01/18 19:07:09 INFO mapred.JobClient: map 35% reduce 0%
11/01/18 19:07:21 INFO mapred.JobClient: map 40% reduce 0%
11/01/18 19:07:57 INFO mapred.JobClient: map 45% reduce 0%
11/01/18 19:08:00 INFO mapred.JobClient: map 55% reduce 0%
11/01/18 19:08:09 INFO mapred.JobClient: map 60% reduce 0%
11/01/18 19:08:45 INFO mapred.JobClient: map 65% reduce 0%
11/01/18 19:08:51 INFO mapred.JobClient: map 70% reduce 0%
11/01/18 19:08:54 INFO mapred.JobClient: map 80% reduce 0%
11/01/18 19:09:31 INFO mapred.JobClient: map 85% reduce 0%
11/01/18 19:09:40 INFO mapred.JobClient: map 95% reduce 0%
11/01/18 19:09:43 INFO mapred.JobClient: map 100% reduce 0%
11/01/18 19:09:45 INFO mapred.JobClient: Job complete: job_201101181725_0009
11/01/18 19:09:45 INFO mapred.JobClient: Counters: 8
11/01/18 19:09:45 INFO mapred.JobClient: Job Counters
11/01/18 19:09:45 INFO mapred.JobClient: Launched map tasks=22
11/01/18 19:09:45 INFO mapred.JobClient: org.apache.hadoop.examples.RandomWriter$Counters
11/01/18 19:09:45 INFO mapred.JobClient: BYTES_WRITTEN=21474942228
11/01/18 19:09:45 INFO mapred.JobClient: RECORDS_WRITTEN=2044390
11/01/18 19:09:45 INFO mapred.JobClient: FileSystemCounters
11/01/18 19:09:45 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=21545680248
11/01/18 19:09:45 INFO mapred.JobClient: Map-Reduce Framework
11/01/18 19:09:45 INFO mapred.JobClient: Map input records=20
11/01/18 19:09:45 INFO mapred.JobClient: Spilled Records=0
11/01/18 19:09:45 INFO mapred.JobClient: Map input bytes=0
11/01/18 19:09:45 INFO mapred.JobClient: Map output records=2044390
Job ended: Tue Jan 18 19:09:45 CST 2011
The job took 263 seconds.
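A quick sanity check on the counters above: dividing BYTES_WRITTEN by the 20 maps that contributed output (22 were launched, so two were retried or speculative) confirms roughly 1 GB of data per map in this run. This is plain arithmetic, not a Hadoop command:

```shell
# BYTES_WRITTEN=21474942228 spread over 20 maps:
echo $((21474942228 / 20))   # bytes per map, just over 1 GiB (1073741824)
```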
From the ITPUB blog: http://blog.itpub.net/8183550/viewspace-684152/ — please credit the source when reposting.