Hadoop 2.x Study Notes

Posted by 黃思喆 on 2015-05-28

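I had already configured hadoop 1.x once before, but in order to use YARN I decided to switch to 2.x. This time I am starting over and configuring everything from scratch.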

Single-node installation

  • As before, install with brew

    $ brew install hadoop

  • Then set JAVA_HOME

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home
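
On macOS the JDK path can also be looked up rather than typed by hand. The sketch below assumes the system helper /usr/libexec/java_home is present and falls back to whatever JAVA_HOME already holds if it is not. The function name detect_java_home and its two parameters are made up for illustration.

```shell
# Print a JDK home directory for the requested version (default 1.7).
# Uses the macOS helper if it is executable, otherwise falls back to the
# current value of JAVA_HOME. The helper path is a parameter only so the
# function can be exercised on machines that do not have it.
detect_java_home() {
    version=${1:-1.7}
    helper=${2:-/usr/libexec/java_home}
    if [ -x "$helper" ]; then
        "$helper" -v "$version"
    else
        printf '%s\n' "${JAVA_HOME:-}"
    fi
}

# Example for ~/.bash_profile:
#   export JAVA_HOME="$(detect_java_home 1.7)"
```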

  • Next comes the hosts file

    127.0.0.1       localhost
    255.255.255.255 broadcasthost
    ::1             localhost
    fe80::1%lo0     localhost
    127.0.0.1       XXX

PS: XXX is your account name
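
To double-check the result, a small helper can confirm that a name really maps to a given address. This is only a sketch: has_host_entry is a hypothetical function name, and it only understands the simple "address name name ..." layout shown above.

```shell
# Succeed (exit 0) if the hosts file maps the given name to the given
# address. Anything after a "#" on a line is treated as a comment.
has_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" -v name="$name" '
        { sub(/#.*/, "") }
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$hosts_file"
}

# Example:
#   has_host_entry /etc/hosts 127.0.0.1 localhost && echo "localhost is mapped"
```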

  • Configure ssh

    1. In the Mac's System Preferences, under Sharing, tick Remote Login

    2. Set up passwordless login

    $ ssh localhost

If it asks for a password, run:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa 
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Try again and passwordless login should now work.
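
Because the cat command above appends with >>, running it twice leaves the same key in authorized_keys twice. A minimal sketch of a repeatable version follows. authorize_key is a hypothetical helper, and the chmod 600 reflects the usual sshd requirement that the file not be writable by others.

```shell
# Append a public key to an authorized_keys file only if it is not already
# listed, then restrict the file's permissions.
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    key=$(cat "$pub_file")
    # -x: whole-line match, -F: fixed string, so the key is compared literally.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Example:
#   authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
#   ssh -o BatchMode=yes localhost true && echo "passwordless login works"
```

BatchMode=yes makes ssh fail instead of prompting, so the last line only prints its message when no password is needed.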

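  • Configure hadoop

For convenience, you can set the environment variable HADOOP_HOME to your hadoop directory.

Edit hadoop/etc/hadoop/hadoop-env.sh

Change the JAVA_HOME in it to match the one in .bash_profile.

Edit hadoop/etc/hadoop/yarn-env.sh

Again, change JAVA_HOME.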
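
Both edits are the same one-line change, so they can be scripted. The sketch below assumes the env script either has a line starting with "export JAVA_HOME=" or none at all. set_java_home is a hypothetical helper, not something hadoop ships.

```shell
# Rewrite the "export JAVA_HOME=..." line of an env script, or append one
# if the script has no such active line. Assumes the path contains no "|",
# "&" or backslash, which are special in the sed replacement.
set_java_home() {
    env_file=$1
    java_home=$2
    tmp=$(mktemp)
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$tmp"
    else
        cat "$env_file" > "$tmp"
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$tmp"
    fi
    # Copy back instead of moving so the script keeps its permissions.
    cat "$tmp" > "$env_file"
    rm -f "$tmp"
}

# Example:
#   set_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" "$JAVA_HOME"
#   set_java_home "$HADOOP_HOME/etc/hadoop/yarn-env.sh" "$JAVA_HOME"
```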

Edit hadoop/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

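    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <!-- fs.default.name: configures the NameNode by specifying the URL of the HDFS file system, through which the file system's contents can be accessed. localhost can be replaced with this machine's IP address; in fully distributed mode it must be changed to the actual NameNode machine's IP address. If no port is given, the default port 8020 is used. -->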
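    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/Cellar/hadoop/tmp/hadooptmp</value>
    </property>
    <!-- hadoop.tmp.dir: Hadoop's default temporary path. It is best to set this explicitly: if a DataNode inexplicably fails to start after adding a node or in similar situations, just delete the tmp directory configured here. If this directory is deleted on the NameNode machine, however, the NameNode format command has to be run again. The directory must be created by hand beforehand. -->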
    <property>
        <name>hadoop.native.lib</name>
        <value>false</value>
        <description>Should native hadoop libraries, if present, be used.</description>
    </property>
</configuration>
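
Since the hadoop.tmp.dir directory has to exist before hadoop starts, it can be created up front. A minimal sketch, where prepare_dir is a hypothetical helper that also checks that the current user can write to the directory:

```shell
# Create a directory (and its parents) and confirm the current user can
# write to it; returns non-zero otherwise.
prepare_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    [ -w "$dir" ]
}

# Example, using the hadoop.tmp.dir value from core-site.xml above:
#   prepare_dir /usr/local/Cellar/hadoop/tmp/hadooptmp || echo "cannot use tmp dir"
```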

Edit hadoop/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>dfs.data.dir</name>
        <value>/usr/local/Cellar/hadoop/hdfs/data</value>
    </property>
    <!-- Configures the HDFS storage directory: where the DataNode keeps its data -->
    <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/Cellar/hadoop/hdfs/name</value>
    </property>
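    <!-- Stores the NameNode's file system metadata, including the edit log and the file system image. If this location is changed, the NameNode must be reformatted with the hadoop namenode -format command -->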
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Sets the number of replicas the file system keeps. Since there is only one node it is set to 1; the system default is 3 -->

</configuration>
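
The comment on dfs.name.dir says a new location has to be formatted before use. A formatted NameNode directory contains a current/VERSION file, so that can serve as a rough check. The sketch below is only an illustration: needs_format is a hypothetical helper, and in 2.x the format command is normally written hdfs namenode -format.

```shell
# Succeed (exit 0) when the given NameNode directory has not been formatted
# yet, i.e. it has no current/VERSION file.
needs_format() {
    name_dir=$1
    [ ! -f "$name_dir/current/VERSION" ]
}

# Example, using the dfs.name.dir value from hdfs-site.xml above:
#   if needs_format /usr/local/Cellar/hadoop/hdfs/name; then
#       hdfs namenode -format
#   fi
```

Checking first avoids formatting a directory that already holds metadata, which would wipe the existing file system.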

Edit hadoop/etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
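<property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
</property>
<!-- Configures the JobTracker node. localhost can be replaced with this machine's IP address; in a real distributed setup, change it to the actual JobTracker machine's IP address -->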
<!--
<property>
    <name>mapred.map.tasks</name>
    <value>20</value>
</property>
<property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
</property>
-->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<!--
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>Master:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>Master:19888</value>
</property>
-->
</configuration>
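
If hadoop/etc/hadoop/mapred-site.xml does not exist yet, that is expected with the stock 2.x distribution, which ships only mapred-site.xml.template. A sketch of creating the file from the template without overwriting an existing one follows. ensure_mapred_site is a hypothetical helper, and the template name is an assumption about your install.

```shell
# Copy mapred-site.xml.template to mapred-site.xml if the latter is missing.
# Never overwrites an existing mapred-site.xml. Succeeds only if the file
# exists afterwards.
ensure_mapred_site() {
    conf_dir=$1
    if [ ! -f "$conf_dir/mapred-site.xml" ] && [ -f "$conf_dir/mapred-site.xml.template" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
    [ -f "$conf_dir/mapred-site.xml" ]
}

# Example:
#   ensure_mapred_site "$HADOOP_HOME/etc/hadoop"
```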

Edit hadoop/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>