04【Online Log Analysis】Setting up Flume Agents: 3 collection nodes + 1 aggregation node sinking to HDFS
【Log collection】:
Hostname            Service      User
flume-agent-01: namenode hdfs
flume-agent-02: datanode hdfs
flume-agent-03: datanode hdfs
【Log aggregation】:
Hostname                              User
sht-sgmhadoopcm-01(172.16.101.54) root
【Sink to HDFS】:
hdfs://172.16.101.56:8020/testwjp/
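Putting the pieces above together, the end-to-end data flow of this setup is:

flume-agent-01/02/03  (exec source: tail -f service log -> memory channel -> avro sink)
    -> sht-sgmhadoopcm-01:4545  (avro source -> memory channel -> hdfs sink)
    -> hdfs://172.16.101.56:8020/testwjp/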
1. Download apache-flume-1.7.0-bin.tar.gz
[hdfs@flume-agent-01 tmp]$ wget
--2017-01-04 20:40:10--
Resolving www-eu.apache.org... 88.198.26.2, 2a01:4f8:130:2192::2
Connecting to www-eu.apache.org|88.198.26.2|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 55711670 (53M) [application/x-gzip]
Saving to: “apache-flume-1.7.0-bin.tar.gz”
100%[===============================================================================================================================================================================================>] 55,711,670 473K/s in 74s
2017-01-04 20:41:25 (733 KB/s) - “apache-flume-1.7.0-bin.tar.gz” saved [55711670/55711670]
2. Extract and rename
[hdfs@flume-agent-01 tmp]$
[hdfs@flume-agent-01 tmp]$ tar -xzvf apache-flume-1.7.0-bin.tar.gz
[hdfs@flume-agent-01 tmp]$ mv apache-flume-1.7.0-bin flume-ng
[hdfs@flume-agent-01 tmp]$ cd flume-ng/conf
3. Copy the Flume environment configuration and the agent configuration templates
[hdfs@flume-agent-01 conf]$ cp flume-env.sh.template flume-env.sh
[hdfs@flume-agent-01 conf]$ cp flume-conf.properties.template exec_memory_avro.properties
4. Add the environment-variable files for the hdfs user
[hdfs@flume-agent-01 tmp]$ cd
[hdfs@flume-agent-01 ~]$ ls -la
total 24
drwxr-xr-x 3 hdfs hadoop 4096 Jul 8 14:05 .
drwxr-xr-x. 35 root root 4096 Dec 10 2015 ..
-rw------- 1 hdfs hdfs 4471 Jul 8 17:22 .bash_history
drwxrwxrwt 2 hdfs hadoop 4096 Nov 19 2014 cache
-rw------- 1 hdfs hdfs 3131 Jul 8 14:05 .viminfo
[hdfs@flume-agent-01 ~]$ cp /etc/skel/.* ./
cp: omitting directory `/etc/skel/.'
cp: omitting directory `/etc/skel/..'
[hdfs@flume-agent-01 ~]$ ls -la
total 36
drwxr-xr-x 3 hdfs hadoop 4096 Jan 4 20:49 .
drwxr-xr-x. 35 root root 4096 Dec 10 2015 ..
-rw------- 1 hdfs hdfs 4471 Jul 8 17:22 .bash_history
-rw-r--r-- 1 hdfs hdfs 18 Jan 4 20:49 .bash_logout
-rw-r--r-- 1 hdfs hdfs 176 Jan 4 20:49 .bash_profile
-rw-r--r-- 1 hdfs hdfs 124 Jan 4 20:49 .bashrc
drwxrwxrwt 2 hdfs hadoop 4096 Nov 19 2014 cache
-rw------- 1 hdfs hdfs 3131 Jul 8 14:05 .viminfo
5. Add the Flume environment variables
[hdfs@flume-agent-01 ~]$ vi .bash_profile
export FLUME_HOME=/tmp/flume-ng
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
[hdfs@flume-agent-01 ~]$ . .bash_profile
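A quick optional check (not in the original post) that the variables above took effect:
[hdfs@flume-agent-01 ~]$ echo $FLUME_HOME
/tmp/flume-ng
[hdfs@flume-agent-01 ~]$ which flume-ng
/tmp/flume-ng/bin/flume-ng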
6. Edit the Flume environment configuration file
[hdfs@flume-agent-01 conf]$ vi flume-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_25
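Optionally confirm that the JDK path configured above actually exists on this host; adjust the path if your JDK lives elsewhere:
[hdfs@flume-agent-01 conf]$ ls -d /usr/java/jdk1.7.0_25
[hdfs@flume-agent-01 conf]$ /usr/java/jdk1.7.0_25/bin/java -version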
7. Upload AdvancedExecSource.jar, the custom plugin built on the Flume-ng Exec Source, to $FLUME_HOME/lib/ (see the post below for how the plugin was developed)
http://blog.itpub.net/30089851/viewspace-2131995/
[hdfs@LogshedNameNodeLogcollector lib]$ pwd
/tmp/flume-ng/lib
[hdfs@LogshedNameNodeLogcollector lib]$ ll AdvancedExecSource.jar
-rw-r--r-- 1 hdfs hdfs 10618 Jan 5 23:50 AdvancedExecSource.jar
[hdfs@LogshedNameNodeLogcollector lib]$
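An optional sanity check, assuming the JDK's jar tool is on the PATH, that the class referenced later in the agent config is actually packaged in the jar (it should list com/onlinelog/analysis/AdvancedExecSource.class):
[hdfs@LogshedNameNodeLogcollector lib]$ jar tf AdvancedExecSource.jar | grep AdvancedExecSource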
8. Edit the Flume agent configuration file
[hdfs@flume-agent-01 conf]$ vi exec_memory_avro.properties
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the custom exec source
a1.sources.r1.type = com.onlinelog.analysis.AdvancedExecSource
a1.sources.r1.command = tail -f /var/log/hadoop-hdfs/hadoop-cmf-hdfs1-NAMENODE-flume-agent-01.log.out
a1.sources.r1.hostname = flume-agent-01
a1.sources.r1.servicename = namenode
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 172.16.101.54
a1.sinks.k1.port = 4545
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.keep-alive = 60
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 2000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
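Before the background start in step 13, the config can be smoke-tested in the foreground with the same flume-ng options (stop with Ctrl-C; delivery errors are expected at this point, since the aggregator on 172.16.101.54:4545 is not running yet):
[hdfs@flume-agent-01 conf]$ flume-ng agent -c $FLUME_CONF_DIR -f $FLUME_CONF_DIR/exec_memory_avro.properties -n a1 -Dflume.root.logger=INFO,console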
9. Package flume-ng on flume-agent-01 and scp it to flume-agent-02/03 and sht-sgmhadoopcm-01 (172.16.101.54)
[hdfs@flume-agent-01 tmp]$ zip -r flume-ng.zip flume-ng/*
[jpwu@flume-agent-01 ~]$ scp /tmp/flume-ng.zip flume-agent-02:/tmp/
[jpwu@flume-agent-01 ~]$ scp /tmp/flume-ng.zip flume-agent-03:/tmp/
[jpwu@flume-agent-01 ~]$ scp /tmp/flume-ng.zip sht-sgmhadoopcm-01:/tmp/
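Optionally verify the copies arrived intact by comparing checksums (hashes are omitted here; they only need to match across hosts):
[jpwu@flume-agent-01 ~]$ md5sum /tmp/flume-ng.zip
[jpwu@flume-agent-02 ~]$ md5sum /tmp/flume-ng.zip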
10. On flume-agent-02: configure the hdfs user's environment variables, extract the package, and edit the agent configuration file
[hdfs@flume-agent-02 ~]$ cp /etc/skel/.* ./
cp: omitting directory `/etc/skel/.'
cp: omitting directory `/etc/skel/..'
[hdfs@flume-agent-02 ~]$ vi .bash_profile
export FLUME_HOME=/tmp/flume-ng
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
[hdfs@flume-agent-02 ~]$ . .bash_profile
[hdfs@flume-agent-02 tmp]$ unzip flume-ng.zip
[hdfs@flume-agent-02 tmp]$ cd flume-ng/conf
## Only the following parameters need to change
[hdfs@flume-agent-02 conf]$ vi exec_memory_avro.properties
a1.sources.r1.command = tail -f /var/log/hadoop-hdfs/hadoop-cmf-hdfs1-DATANODE-flume-agent-02.log.out
a1.sources.r1.hostname = flume-agent-02
a1.sources.r1.servicename = datanode
### Also check that the JAVA_HOME directory configured in flume-env.sh exists on this machine
11. On flume-agent-03: configure the hdfs user's environment variables, extract the package, and edit the agent configuration file
[hdfs@flume-agent-03 ~]$ cp /etc/skel/.* ./
cp: omitting directory `/etc/skel/.'
cp: omitting directory `/etc/skel/..'
[hdfs@flume-agent-03 ~]$ vi .bash_profile
export FLUME_HOME=/tmp/flume-ng
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
[hdfs@flume-agent-03 ~]$ . .bash_profile
[hdfs@flume-agent-03 tmp]$ unzip flume-ng.zip
[hdfs@flume-agent-03 tmp]$ cd flume-ng/conf
## Only the following parameters need to change
[hdfs@flume-agent-03 conf]$ vi exec_memory_avro.properties
a1.sources.r1.command = tail -f /var/log/hadoop-hdfs/hadoop-cmf-hdfs1-DATANODE-flume-agent-03.log.out
a1.sources.r1.hostname = flume-agent-03
a1.sources.r1.servicename = datanode
### Also check that the JAVA_HOME directory configured in flume-env.sh exists on this machine
12. On the aggregation node sht-sgmhadoopcm-01: configure the root user's environment variables, extract the package, and edit the agent configuration file
[root@sht-sgmhadoopcm-01 tmp]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export FLUME_HOME=/tmp/flume-ng
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$FLUME_HOME/bin:$JAVA_HOME/bin:$PATH
[root@sht-sgmhadoopcm-01 tmp]# source /etc/profile
[root@sht-sgmhadoopcm-01 tmp]#
[root@sht-sgmhadoopcm-01 tmp]# unzip flume-ng.zip
[root@sht-sgmhadoopcm-01 tmp]# cd flume-ng/conf
[root@sht-sgmhadoopcm-01 conf]# vi flume-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
### Test: aggregate first, then sink to HDFS
[root@sht-sgmhadoopcm-01 conf]# vi avro_memory_hdfs.properties
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = 172.16.101.54
a1.sources.r1.port = 4545
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://172.16.101.56:8020/testwjp/
a1.sinks.k1.hdfs.filePrefix = logs
a1.sinks.k1.hdfs.inUsePrefix = .
a1.sinks.k1.hdfs.rollInterval = 0
### with rollInterval = 0 and rollCount = 0, files roll by size only; 1048576 bytes = 1 MB (16 MB would be 16777216)
a1.sinks.k1.hdfs.rollSize = 1048576
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.batchSize = 6000
a1.sinks.k1.hdfs.writeFormat = text
a1.sinks.k1.hdfs.fileType = DataStream
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.keep-alive = 90
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 6000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
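Two optional pre-flight checks on the aggregation node before starting, assuming nc and netstat are available (not part of the original walkthrough):
[root@sht-sgmhadoopcm-01 conf]# nc -zv 172.16.101.56 8020    # is the NameNode RPC port reachable from here?
[root@sht-sgmhadoopcm-01 conf]# netstat -an | grep 4545      # is the avro port already in use?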
13. Start the agents in the background
[root@sht-sgmhadoopcm-01 flume-ng]# source /etc/profile
[hdfs@flume-agent-01 flume-ng]$ . ~/.bash_profile
[hdfs@flume-agent-02 flume-ng]$ . ~/.bash_profile
[hdfs@flume-agent-03 flume-ng]$ . ~/.bash_profile
[root@sht-sgmhadoopcm-01 flume-ng]# nohup flume-ng agent -c conf -f /tmp/flume-ng/conf/avro_memory_hdfs.properties -n a1 -Dflume.root.logger=INFO,console &
[hdfs@flume-agent-01 flume-ng]$ nohup flume-ng agent -c /tmp/flume-ng/conf -f /tmp/flume-ng/conf/exec_memory_avro.properties -n a1 -Dflume.root.logger=INFO,console &
[hdfs@flume-agent-02 flume-ng]$ nohup flume-ng agent -c /tmp/flume-ng/conf -f /tmp/flume-ng/conf/exec_memory_avro.properties -n a1 -Dflume.root.logger=INFO,console &
[hdfs@flume-agent-03 flume-ng]$ nohup flume-ng agent -c /tmp/flume-ng/conf -f /tmp/flume-ng/conf/exec_memory_avro.properties -n a1 -Dflume.root.logger=INFO,console &
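To confirm the four agents are up, check the process list and tail nohup.out, which receives the console logger output from the commands above (standard commands, not from the original post):
[root@sht-sgmhadoopcm-01 flume-ng]# ps -ef | grep [f]lume-ng
[root@sht-sgmhadoopcm-01 flume-ng]# tail -f nohup.out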
14. Verification: download the collected logs from HDFS to a local machine and open them for inspection (details omitted)
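A concrete way to do this check from any machine with an HDFS client that can reach the NameNode at 172.16.101.56 (the prompt is omitted because the host is up to you; file names under /testwjp/ will vary):
$ hdfs dfs -ls hdfs://172.16.101.56:8020/testwjp/
$ hdfs dfs -get hdfs://172.16.101.56:8020/testwjp/ /tmp/testwjp_check/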
------------------------------------------------------------------------------------------------------------------------------------------------
【Notes】:
1. Error 1: the machine where flume-ng is installed has no Hadoop environment, so sinking to HDFS requires the HDFS jar packages
[ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:146)] Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$CompressionType
Just search for the following jar packages on any machine that has Hadoop installed and copy them into $FLUME_HOME/lib (a search-and-copy sketch follows the list).
Search method: find $HADOOP_HOME/ -name commons-configuration*.jar
commons-configuration-1.6.jar
hadoop-auth-2.7.3.jar
hadoop-common-2.7.3.jar
hadoop-hdfs-2.7.3.jar
hadoop-mapreduce-client-core-2.7.3.jar
protobuf-java-2.5.0.jar
htrace-core-3.1.0-incubating.jar
commons-io-2.4.jar
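A minimal search-and-copy sketch for the list above, to be run on a machine that has Hadoop installed (assumes $HADOOP_HOME is set there and that the versions found match the ones listed; destination host and path are the ones used in this post):
for j in commons-configuration hadoop-auth hadoop-common hadoop-hdfs hadoop-mapreduce-client-core protobuf-java htrace-core commons-io
do
  find $HADOOP_HOME/ -name "${j}-*.jar"
done
# then scp each matching jar to sht-sgmhadoopcm-01:/tmp/flume-ng/lib/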
2. Error 2: the custom plugin class cannot be loaded: Unable to load source type: com.onlinelog.analysis.AdvancedExecSource
2017-01-06 21:10:48,278 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:142)] Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: com.onlinelog.analysis.AdvancedExecSource, class: com.onlinelog.analysis.AdvancedExecSource
Sourcing the hdfs or root user's environment variables resolves it:
[root@sht-sgmhadoopcm-01 flume-ng]# source /etc/profile
[hdfs@flume-agent-01 flume-ng]$ . ~/.bash_profile
[hdfs@flume-agent-02 flume-ng]$ . ~/.bash_profile
[hdfs@flume-agent-03 flume-ng]$ . ~/.bash_profile
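An alternative that was not used in this walkthrough: the flume-ng launcher also picks up jars placed under a plugins.d directory, which keeps custom jars out of lib/; the plugin directory name below is arbitrary:
$ mkdir -p $FLUME_HOME/plugins.d/advanced-exec-source/lib
$ cp AdvancedExecSource.jar $FLUME_HOME/plugins.d/advanced-exec-source/lib/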