Hadoop 2.x Source Code: A Compilation Walkthrough

Published by 哥不是小蘿莉 on 2015-10-29

1. Overview

  Recently, several readers have needed to compile the Hadoop source code, hit all kinds of pitfalls along the way, and found that neither search engines nor technical blogs resolved their particular problems. Having been asked about this repeatedly, I am writing this article to walk through the details of the build and to explain how to approach the problems that come up during compilation. Build problems are something you will inevitably face when extending the platform, doing secondary development, or packaging your own builds.

2. Build Preparation

  Before compiling the source code, we need to prepare the basic build environment. The environment used for this build is listed below:

  • Hardware environment
    - OS: CentOS 6.6
    - CPU: i7 (4 cores)
    - Memory: 16 GB
    - Disk: flash storage (SSD)
  • Software environment
    - JDK 1.7
    - Maven 3.2.3
    - Ant 1.9.6
    - Protobuf 2.5.0

  With the list settled, we need to install these components into the operating system. The steps are as follows:

2.1 Installing the Base Environment

  Installing the JDK, Maven, and Ant is straightforward, so I will not belabor it here: extract each archive and append the corresponding path to PATH in /etc/profile. Protobuf, however, must be compiled from source, which requires the build dependencies gcc, gcc-c++, cmake, openssl-devel, and ncurses-devel. Install them with the following commands:
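As a sketch of that PATH setup (the install paths below are assumptions; substitute wherever you actually unpacked each archive), the /etc/profile additions typically look like this:

```shell
# Example /etc/profile additions -- the install paths are assumptions
export JAVA_HOME=/usr/local/jdk1.7.0_79
export MAVEN_HOME=/usr/local/apache-maven-3.2.3
export ANT_HOME=/usr/local/apache-ant-1.9.6
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$ANT_HOME/bin
```

Run source /etc/profile afterwards (or log in again) so the current shell picks up the new PATH.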

yum -y install gcc   
yum -y install gcc-c++
yum -y install cmake
yum -y install openssl-devel
yum -y install ncurses-devel

  Verify that GCC installed successfully with the following command:

  Verify that Make and CMake installed successfully with the following commands:
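The original post showed these checks as screenshots; the commands behind them are simply the version flag of each tool:

```shell
# Each command prints a version banner if the tool is installed and on the PATH
gcc --version
make --version
cmake --version
```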

  With these prerequisites in place, compile Protobuf with the following commands:

[hadoop@nna ~]$ cd protobuf-2.5.0/
[hadoop@nna protobuf-2.5.0]$ ./configure --prefix=/usr/local/protoc
[hadoop@nna protobuf-2.5.0]$ make
[hadoop@nna protobuf-2.5.0]$ make install

  PS: the install step may fail with a permissions error; if that happens, rerun it with sudo.

  Verify that Protobuf installed successfully with the following command:
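Because Protobuf was configured with --prefix=/usr/local/protoc, the binary does not land on the default PATH; a typical check (the output screenshot from the original post is omitted) looks like this:

```shell
# protoc was installed under /usr/local/protoc, so expose its bin directory first
export PATH=$PATH:/usr/local/protoc/bin
protoc --version
```

For protobuf 2.5.0 this should print libprotoc 2.5.0.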

  Now we can move on to the build itself. Many problems can surface during compilation; when one does, analyze its cause carefully. Here is one you can avoid up front: with Maven's default JVM settings, the build runs out of heap while compiling the hadoop-hdfs module. The error looks like this:

java.lang.OutOfMemoryError: Java heap space 

  To avoid this, set the Maven JVM options in the environment before compiling the Hadoop source:

export MAVEN_OPTS="-Xms256m -Xmx512m"

  Next, change into the Hadoop source tree. I am using the Hadoop 2.6 source here; building newer versions will likely differ slightly. The build commands are as follows:

[hadoop@nna tar]$ cd hadoop-2.6.0-src
[hadoop@nna hadoop-2.6.0-src]$ mvn package -DskipTests -Pdist,native

  PS: this compiles the distribution into a directory. To produce a tar package as well, append the -Dtar flag: mvn package -DskipTests -Pdist,native -Dtar

  During my own build, the KMS module failed because its Tomcat download was incomplete. Hadoop uses apache-tomcat-6.0.41.tar.gz; if an error appears in this module, check the size of the downloaded archive with the command below. A healthy download is about 6.9 MB:

[hadoop@nna downloads]$ du -sh *
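Should the check above reveal a truncated archive, a manual re-download along the following lines fixes it. The mirror URL below is my assumption based on the Apache archive layout; verify it before use:

```shell
# Fetch the exact Tomcat version the KMS build expects and drop it into the
# downloads directory so Maven reuses it instead of re-downloading
cd /home/hadoop/tar/hadoop-2.6.0-src/hadoop-common-project/hadoop-kms/downloads
rm -f apache-tomcat-6.0.41.tar.gz   # remove the truncated file first
wget https://archive.apache.org/dist/tomcat/tomcat-6/v6.0.41/bin/apache-tomcat-6.0.41.tar.gz
```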

  If the Tomcat archive is only a few KB, the download failed. In that case, download it manually and place it in the /home/hadoop/tar/hadoop-2.6.0-src/hadoop-common-project/hadoop-kms/downloads directory. When the build succeeds, Maven prints the following summary:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.162 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  0.690 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  1.589 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.164 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.064 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  2.260 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  1.492 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [  2.233 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  2.102 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [01:00 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [  3.891 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [  5.872 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.019 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [04:04 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [01:47 min]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [04:58 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  2.492 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.020 s]
[INFO] hadoop-yarn ........................................ SUCCESS [  0.018 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [01:05 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [01:00 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [  0.029 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [01:03 min]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [01:10 min]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [  1.810 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [  4.041 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 11.739 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [  3.332 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [  4.762 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [  0.017 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [  1.586 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [  1.233 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [  0.018 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [  3.270 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [  2.164 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [  0.032 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [ 13.047 s]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 10.890 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [  2.534 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [  6.429 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [  4.866 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [02:04 min]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [  1.183 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [  3.655 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [  1.775 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 11.478 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 15.399 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  1.359 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [  3.736 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  2.822 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  1.791 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  1.350 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  1.858 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  5.805 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  3.061 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [07:14 min]
[INFO] Apache Hadoop Client ............................... SUCCESS [  2.986 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  0.053 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  2.917 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  5.702 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.015 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [  8.587 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 28:25 min
[INFO] Finished at: 2015-10-22T15:12:10+08:00
[INFO] Final Memory: 89M/451M
[INFO] ------------------------------------------------------------------------

  After the build completes, the compiled files are generated under the dist directory of the Hadoop source tree, as shown in the figure below:

  The hadoop-2.6.0 directory in the figure is the compiled distribution.
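In my tree the distribution landed at the conventional location shown below; the exact path is an assumption based on the standard Hadoop build layout:

```shell
# The assembled distribution ends up under hadoop-dist/target in the source tree
ls hadoop-dist/target/hadoop-2.6.0
```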

3. Summary

  All sorts of problems can arise during compilation. Some can be solved with the help of a search engine; for others, no ready-made answer exists, and you have to calmly analyze the build error, form a bold hypothesis, and then verify it. In short, there is more than one way to solve a problem. You are also welcome to post the build problems you encounter in the comments below, for later readers to reference.

4. Closing Remarks

  That is all I have to share in this article. If you have questions while studying this material, feel free to join the discussion group or email me, and I will do my best to answer. Good luck!

 
