Hadoop 2.7 in Practice v1.0: Building Flume 1.6.0 (HTTP Source --> Memory Channel --> HDFS Sink)
1. Verify that JDK 1.7.0 is configured

[root@xxx-01 jdk1.7.0_25]# bin/java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
[root@xxx-01 jdk1.7.0_25]# pwd
/usr/java/jdk1.7.0_25
[root@xxx-01 jdk1.7.0_25]#

## If JDK 1.7 or later is not configured, see the "Install JDK" section at http://blog.itpub.net/30089851/viewspace-1994585/
2. Download and extract Flume 1.6.0

[root@xxx-01 local]# wget http://ftp.cuhk.edu.hk/pub/packages/apache.org/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
[root@xxx-01 local]# tar zxvf apache-flume-1.6.0-bin.tar.gz
[root@xxx-01 local]# cd apache-flume-1.6.0-bin
[root@xxx-01 apache-flume-1.6.0-bin]# ll
total 140
drwxr-xr-x  2 template games  4096 May 21 17:32 bin
-rw-r--r--  1 template games 69856 May  9  2015 CHANGELOG
drwxr-xr-x  2 template games  4096 May 21 17:32 conf
-rw-r--r--  1 template games  6172 May  9  2015 DEVNOTES
drwxr-xr-x 10 template games  4096 May 12  2015 docs
drwxr-xr-x  2 root     root   4096 May 21 17:32 lib
-rw-r--r--  1 template games 25903 May  9  2015 LICENSE
-rw-r--r--  1 template games   249 May  9  2015 NOTICE
-rw-r--r--  1 template games  1779 May  9  2015 README
-rw-r--r--  1 template games  1585 May  9  2015 RELEASE-NOTES
drwxr-xr-x  2 root     root   4096 May 21 17:32 tools
[root@xxx-01 apache-flume-1.6.0-bin]#
[root@xxx-01 apache-flume-1.6.0-bin]# cd conf
[root@xxx-01 conf]# ls -l
total 16
-rw-r--r-- 1 template games 1661 May  9  2015 flume-conf.properties.template
-rw-r--r-- 1 template games 1110 May  9  2015 flume-env.ps1.template
-rw-r--r-- 1 template games 1214 May  9  2015 flume-env.sh.template
-rw-r--r-- 1 template games 3107 May  9  2015 log4j.properties
[root@xxx-01 conf]#
[root@xxx-01 conf]# cp flume-env.sh.template flume-env.sh
[root@xxx-01 conf]# cp flume-conf.properties.template flume-conf.properties
[root@xxx-01 conf]#
3. Configure flume-env.sh and the environment variables

[root@xxx-01 conf]# vi flume-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_25
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop

[root@xxx-01 ~]# vi /etc/profile
export JAVA_HOME="/usr/java/jdk1.7.0_25"
export FLUME_HOME=/usr/local/apache-flume-1.6.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$FLUME_HOME/bin:$JAVA_HOME/bin:$PATH

[root@xxx-01 ~]# source /etc/profile
[root@xxx-01 ~]# echo $FLUME_HOME
/usr/local/apache-flume-1.6.0-bin
4. Configure flume-conf.properties

[root@xxx-01 conf]# vi flume-conf.properties

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
### The default HTTP handler expects a JSON array of events
a1.sources.r1.type = http
a1.sources.r1.bind = <local host IP>
a1.sources.r1.port = 5140

a1.sources.r1.fileHeader = false
#a1.sources.r1.deserializer.outputCharset=UTF-8

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp

# Describe the sink
a1.sinks.k1.type = hdfs
# The path may use the fs.defaultFS of an HDFS HA nameservice instead of a single
# master host. This requires a Hadoop environment on the Flume machine (the Hadoop
# jars must be loadable), and its hadoop-env.sh, hdfs-site.xml, and core-site.xml
# must match the HDFS cluster the logs are written to.
a1.sinks.k1.hdfs.path = hdfs://nameservice1/testwjp/%Y-%m-%d/%H
#a1.sinks.k1.hdfs.path = hdfs://xxx-01:8022/testwjp/%Y-%m-%d/%H
a1.sinks.k1.hdfs.filePrefix = logs
a1.sinks.k1.hdfs.inUsePrefix = .

a1.sinks.k1.hdfs.rollInterval = 0
### roll at 16 MB = 16777216 bytes
a1.sinks.k1.hdfs.rollSize = 16777216
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.batchSize = 1000
a1.sinks.k1.hdfs.writeFormat = text

### A. plain text format
a1.sinks.k1.hdfs.fileType = DataStream

### B. compressed output
#a1.sinks.k1.hdfs.fileType = CompressedStream
#a1.sinks.k1.hdfs.codeC = bzip2

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.keep-alive = 30
a1.channels.c1.capacity = 100000
### transactionCapacity must be >= the sink's batchSize (both are 1000 here)
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
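A misbound channel is a common reason an agent starts but moves no data. The wiring above can be sanity-checked offline with a small sketch (a hypothetical helper, not part of the Flume distribution) that parses a java-properties-style agent config and verifies every source and sink refers to a declared channel:

```python
# Offline sanity check for a Flume agent config (hypothetical helper, not a
# Flume tool). Verifies each source's "channels" and each sink's "channel"
# refer to a channel declared in "<agent>.channels".

def parse_props(text):
    """Parse 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_bindings(props, agent="a1"):
    """Return a list of problems; an empty list means the wiring is consistent."""
    channels = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for src in props.get(f"{agent}.sources", "").split():
        bound = set(props.get(f"{agent}.sources.{src}.channels", "").split())
        if not bound or not bound <= channels:
            problems.append(f"source {src} bound to undeclared channel(s): {bound}")
    for sink in props.get(f"{agent}.sinks", "").split():
        ch = props.get(f"{agent}.sinks.{sink}.channel", "")
        if ch not in channels:
            problems.append(f"sink {sink} bound to undeclared channel: {ch!r}")
    return problems

sample = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""
print(check_bindings(parse_props(sample)))  # [] -> wiring is consistent
```

This only checks the source/sink/channel wiring, not component-specific properties; Flume itself validates the rest at startup.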
5. Start the agent

[root@xxx-01 apache-flume-1.6.0-bin]# bin/flume-ng agent -c conf -f conf/flume-conf.properties -n a1 -Dflume.root.logger=INFO,console

### Start in the background
nohup bin/flume-ng agent -c conf -f conf/flume-conf.properties -n a1 -Dflume.root.logger=INFO,console &
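Once the agent is up, a quick liveness check is to probe the HTTP source's TCP port. A minimal sketch, where the host and port are the placeholders used throughout this post:

```python
# TCP probe for the HTTP source's bind address. Host/port below are the
# placeholder values from this post, not real endpoints.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print(port_open("10.168.11.13", 5140))
```

A True result only proves the source is listening; the POST tests below confirm it actually accepts events.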
6. Test client A: curl

[root@xxx-01 bin]# curl -X POST -d '[{"headers":{"h1":"v1","h2":"v2"},"body":"hello body"}]' http://10.168.11.13:5140
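The same request can be issued from Python with only the standard library. The URL below mirrors the curl example and is a placeholder; the event format (a JSON array of objects with "headers" and "body") is what the HTTP source's default JSONHandler expects:

```python
# Post events to a Flume HTTP source using only the standard library.
# The URL is a placeholder matching the curl example above.
import json
import urllib.request

def make_events(bodies, headers=None):
    """Build the JSON array of events the default JSONHandler expects."""
    return [{"headers": dict(headers or {}), "body": body} for body in bodies]

def post_events(events, url="http://10.168.11.13:5140"):
    """POST the event list; raises urllib.error.HTTPError on non-2xx responses."""
    data = json.dumps(events).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 200 on success

if __name__ == "__main__":
    events = make_events(["hello body"], {"h1": "v1", "h2": "v2"})
    print(post_events(events))
```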
7. Test client B: Firefox + the HttpRequester add-on

URL: http://<local host IP>:5140
Type: POST
Content Type: application/json
Content:

[{
  "headers" : {
    "timestamp" : "1",
    "host" : "random_host1.example.com"
  },
  "body" : "random_body1"
},
{
  "headers" : {
    "timestamp" : "2",
    "host" : "random_host2.example.com"
  },
  "body" : "random_body2"
}]

### Click "Submit"; a 200 response status means the POST succeeded.
8. Verify that the data reached HDFS

Command line: hadoop fs -ls hdfs://nameservice1/testwjp/
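Because hdfs.path uses the %Y-%m-%d/%H escapes and the timestamp interceptor stamps every event, the directory an event lands in can be predicted from its timestamp header (epoch milliseconds). A small sketch, using the same path pattern as above; note that the real sink resolves escapes in the agent's local timezone by default (configurable via hdfs.timeZone), while this sketch uses UTC for determinism:

```python
# Predict the HDFS directory for an event, given the sink's hdfs.path
# escape pattern and the event's "timestamp" header (epoch milliseconds).
# Uses UTC; the HDFS sink defaults to the agent's local timezone.
from datetime import datetime, timezone

def resolve_hdfs_path(pattern, timestamp_millis):
    """Expand strftime-style escapes (%Y, %m, %d, %H, ...) in the sink path."""
    ts = datetime.fromtimestamp(timestamp_millis / 1000, tz=timezone.utc)
    return ts.strftime(pattern)

pattern = "hdfs://nameservice1/testwjp/%Y-%m-%d/%H"
print(resolve_hdfs_path(pattern, 1464000000000))
# -> hdfs://nameservice1/testwjp/2016-05-23/10
```

With rollInterval and rollCount set to 0 above, files under these directories roll only when they reach rollSize (16 MB).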
From the ITPUB blog. Link: http://blog.itpub.net/30089851/viewspace-2105014/. Please credit the source when republishing.