Installing Hadoop on Linux: A Walkthrough with a WordCount Run

Posted by hzcya on 2020-11-11

The single-node environment is as follows:

Hadoop 3.1.1 installation package

JDK 1.8.0_231 installation package

CentOS Linux

1. Set up passwordless SSH login to the local machine

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

chmod 600 ~/.ssh/authorized_keys
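The three commands can be sketched end-to-end with a throwaway key under /tmp (hypothetical paths, so the real ~/.ssh files stay untouched); note that authorized_keys must not be group- or world-writable, or sshd will refuse to use it:

```shell
# Generate a throwaway RSA key pair (paths are illustrative, not the real ~/.ssh):
rm -f /tmp/demo_key /tmp/demo_key.pub /tmp/demo_authorized_keys
ssh-keygen -t rsa -P "" -f /tmp/demo_key >/dev/null
# Append the public key to a demo authorized_keys file and tighten its permissions:
cat /tmp/demo_key.pub >> /tmp/demo_authorized_keys
chmod 600 /tmp/demo_authorized_keys
ls -l /tmp/demo_key /tmp/demo_key.pub /tmp/demo_authorized_keys
```

On the real machine the files are ~/.ssh/id_rsa, ~/.ssh/id_rsa.pub, and ~/.ssh/authorized_keys, exactly as in the commands above.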

Verify that `ssh localhost` now logs in without a password.

2. Install and configure the JDK

tar -zxvf jdk-8u231-linux-x64.tar.gz

mkdir /usr/local/java

cp -r jdk1.8.0_231 /usr/local/java/

vim /etc/profile

export JAVA_HOME=/usr/local/java/jdk1.8.0_231/

export PATH=$JAVA_HOME/bin:$PATH

java -version
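After editing /etc/profile, run `source /etc/profile` (or log in again) so the variables take effect; `java -version` should then report 1.8.0_231. What makes the new JDK win is PATH ordering; a quick sketch:

```shell
# Prepending $JAVA_HOME/bin puts the new JDK first on PATH:
export JAVA_HOME=/usr/local/java/jdk1.8.0_231
export PATH=$JAVA_HOME/bin:$PATH
echo "$PATH" | cut -d: -f1   # prints /usr/local/java/jdk1.8.0_231/bin
```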

3. Extract the Hadoop installation package

tar -zxvf FusionInsight-Hadoop-3.1.1.tar.gz

Extraction produces a hadoop directory.

4. Configure the Hadoop environment variables (append to /etc/profile):

export HADOOP_HOME=/home/lhh/hive/hadoop/

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

5. Set the machine's hostname

vim /etc/hostname

vim /etc/hosts

hostname hadoop-01
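A minimal sketch of the two files, assuming the machine's address is 192.168.1.100 (a placeholder IP):

```
# /etc/hostname
hadoop-01

# /etc/hosts (192.168.1.100 stands in for the machine's real IP)
127.0.0.1      localhost
192.168.1.100  hadoop-01
```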

Reboot the server for the change to take effect.

6. Edit the relevant Hadoop configuration files

Configuration files, under ./hadoop/etc/hadoop/: hadoop-env.sh, core-site.xml, mapred-site.xml, hdfs-site.xml, yarn-site.xml

Start/stop scripts, under ./hadoop/sbin/: start-dfs.sh, stop-dfs.sh, start-yarn.sh, stop-yarn.sh

Create hadoop-env.sh with the following content:

export JAVA_HOME=/usr/local/java/jdk1.8.0_231/

Note: with this Hadoop 3.1.1 package the file has to be created by hand.

core-site.xml is configured as follows:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lhh/hive/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-01:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>

mapred-site.xml is configured as follows:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-01:9001</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

hdfs-site.xml is configured as follows:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/lhh/hive/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/lhh/hive/tmp/dfs/data</value>
  </property>
</configuration>

yarn-site.xml is configured as follows:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-01</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
  </property>
</configuration>

7. Run Hadoop

From the extracted hadoop directory, format the NameNode:

./bin/hdfs namenode -format

Start the NameNode, DataNode, and the other daemons:

./sbin/start-all.sh

./sbin/mr-jobhistory-daemon.sh start historyserver

Check that the daemon processes are running:
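After start-all.sh plus the history server, `jps` should list all six daemons. A sketch that checks a captured listing (the pids below are made up):

```shell
# A jps listing as it might appear after startup (pids are illustrative):
cat > /tmp/jps_sample.txt <<'EOF'
2241 NameNode
2389 DataNode
2601 SecondaryNameNode
2845 ResourceManager
2990 NodeManager
3120 JobHistoryServer
3201 Jps
EOF
# Verify every expected daemon appears before submitting jobs:
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer; do
  grep -q "$d\$" /tmp/jps_sample.txt && echo "$d up"
done
```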

Check the Web UI

lsof -i:9870

checks whether the port is being listened on; then open the address in a browser:
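Hadoop 3.x moved the NameNode web UI from port 50070 to 9870, so with the hostname configured earlier the addresses would be:

```
http://hadoop-01:9870    # HDFS NameNode web UI
http://hadoop-01:8088    # YARN ResourceManager web UI
```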

8. Run WordCount

1) Create a test.txt file locally
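For example (the sample words are made up; on the real machine write the file to /home/lhh/hive/test.txt to match the upload path below):

```shell
# Create a small sample input file (contents are illustrative):
printf 'hello hadoop\nhello world\nhadoop mapreduce\n' > /tmp/test.txt
cat /tmp/test.txt
```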

2) Create a directory in HDFS to hold the test file:

./bin/hdfs dfs -mkdir /test

3) Upload the local test.txt into the /test directory:

./bin/hdfs dfs -put /home/lhh/hive/test.txt /test

4) Run WordCount:

./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-302005.jar wordcount /test/test.txt /test/out

5) View the result:

./bin/hadoop fs -cat /test/out/part-r-00000
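The output file holds one `word<TAB>count` line per distinct word, sorted by word. What WordCount computes can be simulated in pure shell (sample text made up):

```shell
# Split into words, sort, count duplicates, then print word<TAB>count:
printf 'hello hadoop\nhello world\nhadoop mapreduce\n' \
  | tr ' ' '\n' | sort | uniq -c | awk '{print $2 "\t" $1}'
# hadoop	2
# hello	2
# mapreduce	1
# world	1
```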

9. Quick reference: common Hadoop 2.x/3.x ports
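The stock web-UI defaults most often consulted are, to the best of my knowledge (vendor builds such as FusionInsight may differ):

```
Service                        Hadoop 2.x   Hadoop 3.x
HDFS NameNode web UI           50070        9870
HDFS DataNode web UI           50075        9864
SecondaryNameNode web UI       50090        9868
YARN ResourceManager web UI    8088         8088
MapReduce JobHistory web UI    19888        19888
```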

From "ITPUB Blog", link: http://blog.itpub.net/69978904/viewspace-2733652/. Please credit the source when republishing.
