Hadoop Distributed Storage and Distributed Computing

Posted by heming96 on 2008-08-09

http://hadoop.apache.org/core/docs/current/cluster_setup.html

http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop2/index.html

Hands-on walkthrough:

vi /etc/hosts
192.168.1.212 web02
192.168.1.214 lvs
192.168.1.215 nq
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys lvs:~/.ssh/
scp ~/.ssh/authorized_keys nq:~/.ssh/
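Before going further, it is worth confirming that passwordless login actually works from the master to both slaves; a quick check, assuming the hostnames from /etc/hosts above:

# Each ssh should print the slave's hostname without prompting for a password
for host in lvs nq; do
    ssh $host hostname || echo "passwordless SSH to $host failed"
done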
vi conf/masters
web02
vi conf/slaves
lvs
nq
vi conf/hadoop-site.xml

<?xml version="1.0"?>
<configuration>

<!-- Put site-specific property overrides in this file. -->

<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.212:9000</value>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.1.212:9001</value>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>/backup/hadoop/name</value>
</property>

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/backup/hadoop/data</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks. If this is a comma-delimited list of directories,
  then data will be stored in all named directories, typically on different devices.
  Directories that do not exist are ignored.</description>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/backup/hadoop/tmp/</value>
</property>

</configuration>
bin/hadoop namenode -format
scp -r /backup/hadoop lvs:/backup
scp -r /backup/hadoop nq:/backup
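A quick sanity check that the tree landed on both slaves before starting any daemons (sketch):

# Confirm the Hadoop installation exists on each slave
for host in lvs nq; do ssh $host ls /backup/hadoop/bin/hadoop; done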
[root@web02 hadoop]# bin/start-all.sh
namenode running as process 25305. Stop it first.
nq-data-center: starting datanode, logging to /backup/hadoop/bin/../logs/hadoop-root-datanode-nq-data-center.out
lvs: starting datanode, logging to /backup/hadoop/bin/../logs/hadoop-root-datanode-lvs.out
web02: secondarynamenode running as process 25471. Stop it first.
jobtracker running as process 25547. Stop it first.
nq-data-center: starting tasktracker, logging to /backup/hadoop/bin/../logs/hadoop-root-tasktracker-nq-data-center.out
lvs: starting tasktracker, logging to /backup/hadoop/bin/../logs/hadoop-root-tasktracker-lvs.out
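The "running as process ... Stop it first" lines mean the namenode, secondarynamenode, and jobtracker were still up from an earlier attempt; only the datanodes and tasktrackers were newly started. For a clean restart, something like:

# Stop every daemon, start fresh, then list the Java processes;
# jps ships with the JDK and should show NameNode, JobTracker, etc.
bin/stop-all.sh
bin/start-all.sh
jps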
[root@web02 hadoop]# mkdir test-in
[root@web02 hadoop]# cd test-in
Create two text files under test-in; the WordCount example will count how many times each word appears:
[root@web02 test-in]# echo "hello world bye world" >file1.txt
[root@web02 test-in]# echo "hello hadoop goodbye hadoop" >file2.txt
[root@web02 test-in]# cd ..
[root@web02 hadoop]# bin/hadoop jar hadoop-0.16.0-examples.jar wordcount test-in test-out
The tutorial says to view the results next (cd test-out; cat part-00000), but the job had not actually run: the jar name hadoop-0.16.0-examples.jar did not match the installed version, so Hadoop could not open it:
java.io.IOException: Error opening job jar: hadoop-0.16.0-examples.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:114)
at java.util.jar.JarFile.<init>(JarFile.java:133)
at java.util.jar.JarFile.<init>(JarFile.java:70)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
... 4 more
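The ZipException simply means no file named hadoop-0.16.0-examples.jar exists here, so there was no valid jar to open. Checking the actual jar name first avoids guessing the version:

# List the examples jar that actually ships with this installation
ls hadoop-*-examples.jar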
[root@web02 hadoop]# cd test-out
-bash: cd: test-out: No such file or directory
[root@web02 hadoop]# bin/hadoop jar hadoop-*-examples.jar wordcount test-in test-out
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs://192.168.1.212:9000/user/root/test-in
at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:215)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
at org.apache.hadoop.examples.WordCount.run(WordCount.java:149)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
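The relative path test-in resolves inside HDFS under the running user's home directory, hdfs://192.168.1.212:9000/user/root/test-in, and nothing had been uploaded there yet; the test-in directory created above exists only on web02's local disk. Listing the HDFS home directory makes that visible (a sketch; dfs -ls with no argument lists /user/root):

# Show what exists under /user/root in HDFS -- at this point nothing,
# which is why the job cannot find its input
bin/hadoop dfs -ls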
[root@web02 hadoop]# bin/hadoop jar hadoop-0.16.0-examples.jar wordcount /backup/hadoop/test-in test-out
java.io.IOException: Error opening job jar: hadoop-0.16.0-examples.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:114)
at java.util.jar.JarFile.<init>(JarFile.java:133)
at java.util.jar.JarFile.<init>(JarFile.java:70)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
... 4 more
[root@web02 hadoop]# bin/hadoop jar hadoop-*-examples.jar wordcount /backup/hadoop/test-in test-out
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs://192.168.1.212:9000/backup/hadoop/test-in
at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:215)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
at org.apache.hadoop.examples.WordCount.run(WordCount.java:149)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
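Same root cause: /backup/hadoop/test-in is a local path, but job input paths are interpreted against HDFS. The input has to be uploaded first, which is what the next command does; a less confusing naming would be something like this (hypothetical alternative to the author's test-out):

# Upload the local input under a name that says what it is, then run the job
bin/hadoop dfs -put /backup/hadoop/test-in test-in
bin/hadoop jar hadoop-*-examples.jar wordcount test-in wordcount-out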
[root@web02 hadoop]# bin/hadoop dfs -put /backup/hadoop/test-in test-out
[root@web02 hadoop]# bin/hadoop jar hadoop-*-examples.jar wordcount test-out output
08/08/08 09:36:03 INFO mapred.FileInputFormat: Total input paths to process : 2
08/08/08 09:36:03 INFO mapred.JobClient: Running job: job_200808080926_0003
08/08/08 09:36:04 INFO mapred.JobClient: map 0% reduce 0%
08/08/08 09:36:09 INFO mapred.JobClient: map 100% reduce 0%
08/08/08 09:36:14 INFO mapred.JobClient: map 100% reduce 22%
08/08/08 09:36:16 INFO mapred.JobClient: map 100% reduce 100%
08/08/08 09:36:17 INFO mapred.JobClient: Job complete: job_200808080926_0003
08/08/08 09:36:17 INFO mapred.JobClient: Counters: 16
08/08/08 09:36:17 INFO mapred.JobClient: File Systems
08/08/08 09:36:17 INFO mapred.JobClient: Local bytes read=226
08/08/08 09:36:17 INFO mapred.JobClient: Local bytes written=710
08/08/08 09:36:17 INFO mapred.JobClient: HDFS bytes read=54
08/08/08 09:36:17 INFO mapred.JobClient: HDFS bytes written=41
08/08/08 09:36:17 INFO mapred.JobClient: Job Counters
08/08/08 09:36:17 INFO mapred.JobClient: Launched map tasks=3
08/08/08 09:36:17 INFO mapred.JobClient: Launched reduce tasks=1
08/08/08 09:36:17 INFO mapred.JobClient: Data-local map tasks=3
08/08/08 09:36:17 INFO mapred.JobClient: Map-Reduce Framework
08/08/08 09:36:17 INFO mapred.JobClient: Map input records=2
08/08/08 09:36:17 INFO mapred.JobClient: Map output records=8
08/08/08 09:36:17 INFO mapred.JobClient: Map input bytes=50
08/08/08 09:36:17 INFO mapred.JobClient: Map output bytes=82
08/08/08 09:36:17 INFO mapred.JobClient: Combine input records=8
08/08/08 09:36:17 INFO mapred.JobClient: Combine output records=6
08/08/08 09:36:17 INFO mapred.JobClient: Reduce input groups=5
08/08/08 09:36:17 INFO mapred.JobClient: Reduce input records=6
08/08/08 09:36:17 INFO mapred.JobClient: Reduce output records=5
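The job completed, but the transcript never shows the actual word counts. With a single reduce task they land in part-00000 under the output directory and can be read straight from HDFS (expected values derived from file1.txt and file2.txt above):

# Print the reducer output
bin/hadoop dfs -cat output/part-00000
# Expected:
# bye     1
# goodbye 1
# hadoop  2
# hello   2
# world   2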
[root@web02 hadoop]# bin/hadoop dfs -mkdir testdir
[root@web02 hadoop]# bin/hadoop dfs -rm testdir
rm: Cannot remove directory "/user/root/testdir", use -rmr instead
[root@web02 hadoop]# bin/hadoop dfs -rmr testdir
Deleted /user/root/testdir
[root@web02 hadoop]# bin/hadoop dfs -put /usr/local/heming package
This creates the /user/root/package subdirectory in HDFS.
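The reverse direction works the same way; -get copies a file or directory back out of HDFS (sketch):

# Copy the uploaded directory from HDFS back to the local filesystem
bin/hadoop dfs -get package /tmp/package-copy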

Source: ITPUB blog, http://blog.itpub.net/9614263/viewspace-1008736/
