1. Deploying a Single-Node Hadoop Test Environment

Posted by 豬頭強 on 2015-08-12

I'd read plenty of theory before this and it all felt like fog, so I hurried to set up a single-node Hadoop and actually run it, taking the first step in teaching myself big data~~

  1. In the open-source world I'm a tycoon: whatever I want, I have. So first you need a JDK; since money is no object, I'm using the latest Java 8, and Hadoop 2.6.0 for Hadoop.

  2. With Java installed, set the environment variables in /etc/profile so everything is on the PATH later, then unpack hadoop-2.6.0.tar.gz.
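Roughly like this; a minimal sketch assuming the JDK lives under /usr/java and Hadoop is unpacked into /home/qiang (which matches the prompts below), so adjust the paths to your own machine:

# append to /etc/profile (paths are examples; match your own layout)
export JAVA_HOME=/usr/java/jdk1.8.0_45
export HADOOP_HOME=/home/qiang/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# then reload it, sanity-check java, and unpack the tarball
source /etc/profile
java -version
tar -zxvf hadoop-2.6.0.tar.gz -C /home/qiang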

  3. Next, configure Hadoop. All of the configuration files live in etc/hadoop inside the Hadoop directory:

 (1) hadoop-env.sh: only the JAVA_HOME near the top of this script needs changing; point it at your own Java path (see the sketch after this list).

 (2) core-site.xml, mapred-site.xml, hdfs-site.xml: I'll fill these in properly later~~~ there are plenty of examples online, but make sure they match your version, or you'll run into all sorts of strange problems. A minimal set for a pseudo-distributed 2.6.0 setup is sketched below.
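Since they're still owed, here is a minimal sketch following the official Hadoop 2.6.0 single-node docs; the hostname and port are the usual defaults, adjust as needed. In hadoop-env.sh only the JAVA_HOME line changes (the JDK path is my example, use yours):

export JAVA_HOME=/usr/java/jdk1.8.0_45

core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

hdfs-site.xml (one machine, so a replication factor of 1):

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

mapred-site.xml (create it by copying mapred-site.xml.template):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

yarn-site.xml, since we'll run the pi demo on YARN later:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>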

  4. With everything configured, it's time to start it up.

  (1) Before the first start, you have to format the namenode. This only needs doing the first time you bring Hadoop up; it wipes out everything in HDFS, so handle with care~~

[qiang@localhost hadoop-2.6.0]$  bin/hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

15/08/11 08:25:43 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
.....
.....
.....
15/08/11 08:25:46 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/

  Formatting spits out a wall of messages; as long as nothing errors out, the configuration so far should be fine~~~ (The DEPRECATED warning just means bin/hdfs namenode -format is now the preferred spelling of the same command.)

  (2) To start everything you could simply run sbin/start-all.sh, but that's a bit too crude: if part of the cluster fails to come up, you won't know which part, which makes troubleshooting painful. So let's start the daemons one at a time.

Start the namenode:

[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-namenode-localhost.localdomain.out

Start the datanode:

[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-datanode-localhost.localdomain.out

Use the jps command to check that both are running:

[qiang@localhost ~]$ jps
17254 Jps
16473 NameNode
16698 DataNode

Of course you can also look at it in a web browser through the exposed port (HDFS serves its web UI on port 50070, i.e. http://localhost:50070):

Now that it's up, we have to actually use it and see whether it's all bluff, so let's push something into HDFS:

[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home
[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home/qiangweikang
[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -put README.txt /home/qiangweikang
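To double-check from the shell as well as the browser, listing the directory should show the README.txt we just put in (bin/hadoop fs -ls is the standard way):

[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -ls /home/qiangweikang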

Click Utilities → Browse the file system in the namenode UI and you'll see the stuff we just uploaded:

 So happy~~

With that done, let's start YARN as well:

[qiang@localhost hadoop-2.6.0]$ sbin/start-yarn.sh

On the web you can now see the legendary elephant...., and there's one active node (the YARN ResourceManager's web UI defaults to port 8088, i.e. http://localhost:8088).
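As another quick check (my expectation of a healthy setup, not captured output from this run), rerunning jps after start-yarn.sh should now also list a ResourceManager and a NodeManager alongside the NameNode and DataNode:

[qiang@localhost hadoop-2.6.0]$ jps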

 

Next let's run a demo to see how Hadoop actually executes a job (there are ready-made examples under share for testing). This pi computation is a fun one: it throws darts at a circle inscribed in a square, and since the fraction of darts landing inside the circle approximates pi/4, multiplying by 4 gives an estimate of pi. The first argument is the number of map operations, and the second is how many darts each map throws. Pretty fancy, who knew pi could be computed like this~~~ is this the legendary probability and statistics?
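To make the principle concrete outside of Hadoop (a local toy, not part of the setup), the same dart-throwing estimate fits in a few lines of awk:

awk 'BEGIN {
  srand(); n = 100000; hits = 0
  for (i = 0; i < n; i++) {
    x = rand(); y = rand()                       # a random dart in the unit square
    if ((x - 0.5)^2 + (y - 0.5)^2 <= 0.25) hits++  # inside the inscribed circle (r = 0.5)
  }
  print 4 * hits / n                             # fraction ~ pi/4, so scale by 4
}'

With only 2 maps × 100 darts, the Hadoop run below settles on 3.12; more samples tighten the estimate.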

[qiang@localhost hadoop-2.6.0]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 100
Number of Maps  = 2
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Starting Job
15/08/11 08:54:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/08/11 08:54:25 INFO input.FileInputFormat: Total input paths to process : 2
15/08/11 08:54:25 INFO mapreduce.JobSubmitter: number of splits:2
15/08/11 08:54:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1439308289430_0001
15/08/11 08:54:26 INFO impl.YarnClientImpl: Submitted application application_1439308289430_0001
15/08/11 08:54:26 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1439308289430_0001/
15/08/11 08:54:26 INFO mapreduce.Job: Running job: job_1439308289430_0001
15/08/11 08:54:41 INFO mapreduce.Job: Job job_1439308289430_0001 running in uber mode : false
15/08/11 08:54:41 INFO mapreduce.Job:  map 0% reduce 0%
15/08/11 08:54:51 INFO mapreduce.Job:  map 50% reduce 0%
15/08/11 08:54:52 INFO mapreduce.Job:  map 100% reduce 0%
15/08/11 08:55:04 INFO mapreduce.Job:  map 100% reduce 100%
15/08/11 08:55:05 INFO mapreduce.Job: Job job_1439308289430_0001 completed successfully
15/08/11 08:55:06 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=50
        FILE: Number of bytes written=317688
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=526
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters 
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=14463
        Total time spent by all reduces in occupied slots (ms)=10093
        Total time spent by all map tasks (ms)=14463
        Total time spent by all reduce tasks (ms)=10093
        Total vcore-seconds taken by all map tasks=14463
        Total vcore-seconds taken by all reduce tasks=10093
        Total megabyte-seconds taken by all map tasks=14810112
        Total megabyte-seconds taken by all reduce tasks=10335232
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=36
        Map output materialized bytes=56
        Input split bytes=290
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=56
        Reduce input records=4
        Reduce output records=0
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=412
        CPU time spent (ms)=4770
        Physical memory (bytes) snapshot=680353792
        Virtual memory (bytes) snapshot=6324887552
        Total committed heap usage (bytes)=501743616
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=236
    File Output Format Counters 
        Bytes Written=97
Job Finished in 42.318 seconds
Estimated value of Pi is 3.12000000000000000000

 

Finally, remember to shut YARN down~~

[qiang@localhost hadoop-2.6.0]$ sbin/stop-yarn.sh 
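
And since the namenode and datanode were started one by one, stop them the same way (the same daemon script accepts stop):

[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh stop datanode
[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh stop namenode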

 
