Building Apache Kafka 0.8.1.1 from Source

Posted by 五柳-先生 on 2015-12-09

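After nearly a month, we have more or less finished migrating the sources, sinks, and other plugins we wrote for Flume 0.9.4 to Flume-ng 1.5.0, including plugins such as TailSource and TailDirSource from Flume 0.9.4. (Along the way we added many new features, such as failure recovery, resumable log transfer, sending logs in chunks, and polling at a fixed interval to send logs instead of waiting for one log to finish before sending the next.) We now need to integrate Flume-ng 1.5.0 with the latest Kafka release, 0.8.1.1, and this post covers how to build Kafka 0.8.1.1 from source.
  Before describing how to build Kafka 0.8.1.1 from source, let's first look at what Kafka is:
  Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. Kafka is a high-throughput distributed publish-subscribe messaging system with the following characteristics:
  (1) It persists messages using O(1) disk structures, which keep performance stable over long periods even with terabytes of stored messages.
  (2) High throughput: even on very modest hardware, Kafka can handle hundreds of thousands of messages per second.
  (3) It supports partitioning messages across Kafka servers and across a cluster of consumer machines.
  (4) It supports parallel data loading into Hadoop.
  The publish-subscribe architecture from the official documentation is shown in the figure below: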

[Figure: kafka_producer_consumer]

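  That covers the basics; more about Kafka is available at http://kafka.apache.org/. Now to the main topic: how to build Kafka 0.8.1.1. We can build it with the script bundled with Kafka, or with sbt. Building with sbt is a bit more involved, so I cover it later in this post.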

1. Building with the script bundled with Kafka

  The downloaded Kafka source ships with a gradlew script, which we can use to build it:

# wget http://mirror.bit.edu.cn/apache/kafka/0.8.1.1/kafka-0.8.1.1-src.tgz
# tar -zxf kafka-0.8.1.1-src.tgz
# cd kafka-0.8.1.1-src
# ./gradlew releaseTarGz

Running the commands above fails with the following error:

:core:signArchives FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':core:signArchives'.
> Cannot perform signing task ':core:signArchives' because it has no configured signatory

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

This is a bug (https://issues.apache.org/jira/browse/KAFKA-1297). It can be worked around by excluding the signArchives task with -x, using the following command:

./gradlew releaseTarGzAll -x signArchives
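
The workaround can also be automated. The following is a minimal sketch, not part of the original build scripts: it runs the build, looks for the "has no configured signatory" message from the error above, and only then retries with -x signArchives. The function names is_signing_failure and build_release are hypothetical, and the script assumes it is run from the Kafka source directory.

```shell
#!/bin/sh
# Hypothetical helpers; only the gradlew invocations come from the article.

# Succeeds (exit status 0) if the given build log contains the signing failure.
is_signing_failure() {
    grep -q "has no configured signatory" "$1"
}

# Runs the given Gradle task (default: releaseTarGz). If the build fails
# because of the missing signatory, reruns it with signArchives excluded.
build_release() {
    task=${1:-releaseTarGz}
    log=$(mktemp)
    if ./gradlew "$task" >"$log" 2>&1; then
        rm -f "$log"
        return 0
    fi
    if is_signing_failure "$log"; then
        rm -f "$log"
        ./gradlew "$task" -x signArchives
    else
        # Some other failure: show the log and give up.
        cat "$log" >&2
        rm -f "$log"
        return 1
    fi
}
```

Calling build_release releaseTarGzAll from the source directory would mirror the manual steps above; any failure other than the signing one is reported unchanged.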

The build now succeeds (a large amount of output is printed along the way). We can also specify the Scala version to build against:

./gradlew -PscalaVersion=2.10.3 releaseTarGz -x signArchives

When the build finishes, kafka_2.10-0.8.1.1.tgz is generated under core/build/distributions/. It is the same as the package downloaded from the website and can be used directly.
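
Before using the result, it can be worth confirming that the archive really exists and is readable. The sketch below is an illustration rather than part of the Kafka build: the function name check_dist is hypothetical, and it simply lists every .tgz in a directory and asks tar to read each one.

```shell
#!/bin/sh
# Hypothetical helper: checks every .tgz in the given directory.
# Returns non-zero if no archive is found or if tar cannot read one.
check_dist() {
    dir=$1
    found=0
    for f in "$dir"/*.tgz; do
        # Skip the unexpanded pattern when the directory has no .tgz files.
        [ -e "$f" ] || continue
        found=1
        if tar -tzf "$f" >/dev/null 2>&1; then
            echo "ok: $f"
        else
            echo "corrupt: $f" >&2
            return 1
        fi
    done
    [ "$found" -eq 1 ]
}
```

From the source directory, check_dist core/build/distributions would print one "ok:" line per archive produced by the build.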

2. Building with sbt

  We can also build Kafka with sbt, as follows:

# git clone https://git-wip-us.apache.org/repos/asf/kafka.git
# cd kafka
# git checkout -b 0.8 remotes/origin/0.8
# ./sbt update
[info]  [SUCCESSFUL ] org.eclipse.jdt#core;3.1.1!core.jar (2243ms)
[info] downloading http://repo1.maven.org/maven2/ant/ant/1.6.5/ant-1.6.5.jar ...
[info]  [SUCCESSFUL ] ant#ant;1.6.5!ant.jar (1150ms)
[info] Done updating.
[info] Resolving org.apache.hadoop#hadoop-core;0.20.2 ...
[info] Done updating.
[info] Resolving com.yammer.metrics#metrics-annotation;2.2.0 ...
[info] Done updating.
[info] Resolving com.yammer.metrics#metrics-annotation;2.2.0 ...
[info] Done updating.
[success] Total time: 168 s, completed Jun 18, 2014 6:51:38 PM

# ./sbt package
[info] Set current project to Kafka (in build file:/export1/spark/kafka/)
Getting Scala 2.8.0 ...
:: retrieving :: org.scala-sbt#boot-scala
    confs: [default]
    3 artifacts copied, 0 already retrieved (14544kB/27ms)
[success] Total time: 1 s, completed Jun 18, 2014 6:52:37 PM

For Kafka 0.8 and later, the following command must also be run:

# ./sbt assembly-package-dependency
[info] Loading project definition from /export1/spark/kafka/project
[warn] Multiple resolvers having different access mechanism configured with same name 'sbt-plugin-releases'. To avoid conflict, Remove duplicate project resolvers (`resolvers`) or rename publishing resolver (`publishTo`).
[info] Set current project to Kafka (in build file:/export1/spark/kafka/)
[warn] Credentials file /home/wyp/.m2/.credentials does not exist
[info] Including slf4j-api-1.7.2.jar
[info] Including metrics-annotation-2.2.0.jar
[info] Including scala-compiler.jar
[info] Including scala-library.jar
[info] Including slf4j-simple-1.6.4.jar
[info] Including metrics-core-2.2.0.jar
[info] Including snappy-java-1.0.4.1.jar
[info] Including zookeeper-3.3.4.jar
[info] Including log4j-1.2.15.jar
[info] Including zkclient-0.3.jar
[info] Including jopt-simple-3.2.jar
[warn] Merging 'META-INF/NOTICE' with strategy 'rename'
[warn] Merging 'org/xerial/snappy/native/README' with strategy 'rename'
[warn] Merging 'META-INF/maven/org.xerial.snappy/snappy-java/LICENSE' with strategy 'rename'
[warn] Merging 'LICENSE.txt' with strategy 'rename'
[warn] Merging 'META-INF/LICENSE' with strategy 'rename'
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[warn] Strategy 'discard' was applied to a file
[warn] Strategy 'rename' was applied to 5 files
[success] Total time: 3 s, completed Jun 18, 2014 6:53:41 PM

Of course, we can also specify the Scala version in sbt:

# Author: 過往記憶
# Date: 2014-06-18 20:20
# Blog: http://www.iteblog.com
# Article: http://www.iteblog.com/archives/1044
sbt "++2.10.3 update"
sbt "++2.10.3 package"
sbt "++2.10.3 assembly-package-dependency"
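
The three sbt steps always run in the same order, so they can be wrapped in one function that takes the Scala version as an argument. This is a rough sketch under a few assumptions: the function name run_sbt_steps and the SBT and DRY_RUN variables are hypothetical, and SBT defaults to sbt but can be set to ./sbt to use the launcher in the Kafka checkout.

```shell
#!/bin/sh
# Launcher to use; the article uses both ./sbt and sbt, so it is overridable.
SBT=${SBT:-sbt}

# Runs update, package and assembly-package-dependency for one Scala version,
# stopping at the first failing step. With DRY_RUN=1 the commands are only
# printed instead of executed.
run_sbt_steps() {
    scala_version=$1
    for step in update package assembly-package-dependency; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s "++%s %s"\n' "$SBT" "$scala_version" "$step"
        else
            "$SBT" "++$scala_version $step" || return 1
        fi
    done
}
```

With DRY_RUN=1 set, run_sbt_steps 2.10.3 prints the same three commands listed above, which is a cheap way to check the version string before starting a long build.
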
Please respect original work. When reposting, credit the source: 過往記憶 (http://www.iteblog.com/)
Link to this article: "Building Apache Kafka 0.8.1.1 from Source" (http://www.iteblog.com/archives/1044)
