Collecting Access Logs with Fluentd + Elasticsearch

Posted by banq on 2018-11-14

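This article explains how to:

Collect and process web application logs across servers.
Send the collected logs to an aggregator Fluentd in near real time.
Store the collected logs in Elasticsearch.
Visualize the data with Kibana.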

Prerequisites

A basic understanding of Fluentd, Elasticsearch, and Kibana
Fluentd, Elasticsearch, and Kibana installed

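What do we want to do?
Imagine you have an application that exchanges data with external providers. Everything works fine, but sometimes something goes wrong, and you or they need to know what data you sent and what data they requested. You search the web and realize you need an access log; five minutes later you have added slf4j + logback / log4j2 and are writing to a file on the server. Your application starts getting traffic, and now you have a ten-node cluster with logs scattered across ten nodes. Every time you need to look up a request, you have to search on each node. Once you realize you need to centralize your logs, this article is here to help.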

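How do we do it?
There are plenty of tools you can use to centralize application logs: rsyslog, logstash, flume, scribe, fluentd. On the application side, I will use logback for logging and Fluency to send the data to Fluentd. Elasticsearch will store the log data so that it can be queried later from Kibana.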

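Sending logs to a local Fluentd
First, we need to be able to log requests and responses. This can be achieved in different ways; I will use the logback-access library, which works as a plugin for logback and fits well with Jetty.
Include it in your application with your favorite dependency manager: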

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-access</artifactId>
    <version>1.2.3</version>
</dependency>

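This library provides several classes. We will use ch.qos.logback.access.servlet.TeeFilter to access the request and response payloads (bodies), and ch.qos.logback.access.jetty.RequestLogImpl to publish the request and response data to logback so that we can use them in our log layout. Now we need to plug these classes into Jetty. There are two lines to highlight:
contextHandler.addFilter(new FilterHolder(new TeeFilter()), "/*", java.util.EnumSet.of(DispatcherType.INCLUDE, DispatcherType.REQUEST, DispatcherType.FORWARD))

We use TeeFilter to intercept every request matching the URL pattern "/*" and copy the request and response payloads so that we can log them.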
requestLog.setResource("/logback-access.xml")

Logback-access uses its own configuration file, whose location is configurable (the default path is {jetty.home}/etc/logback-access.xml). It should look like this:

<configuration>
    <appender name="FLUENCY" class="ch.qos.logback.more.appenders.FluencyLogbackAppender">
        <!-- Tag for Fluentd. Further information: http://docs.fluentd.org/articles/config-file -->
        <tag>accesslog</tag>
        <!-- Host name/address and port number where Fluentd is running -->
        <remoteHost>localhost</remoteHost>
        <port>20001</port>

        <!-- [Optional] Configurations to customize Fluency's behavior: https://github.com/komamitsu/fluency#usage -->
        <ackResponseMode>false</ackResponseMode>
        <fileBackupDir>/tmp</fileBackupDir>
        <!-- Initial chunk buffer size is 1MB (by default)-->
        <bufferChunkInitialSize>2097152</bufferChunkInitialSize>
        <!--Threshold chunk buffer size to flush is 4MB (by default)-->
        <bufferChunkRetentionSize>16777216</bufferChunkRetentionSize>
        <!-- Max total buffer size is 512MB (by default)-->
        <maxBufferSize>268435456</maxBufferSize>
        <!-- Max wait until all buffers are flushed is 10 seconds (by default)-->
        <waitUntilBufferFlushed>30</waitUntilBufferFlushed>
        <!-- Max wait until the flusher is terminated is 10 seconds (by default) -->
        <waitUntilFlusherTerminated>40</waitUntilFlusherTerminated>
        <!-- Flush interval is 600ms (by default)-->
        <flushIntervalMillis>200</flushIntervalMillis>
        <!-- Max retry of sending events is 8 (by default) -->
        <senderMaxRetryCount>12</senderMaxRetryCount>
        <!-- [Optional] Enable/Disable use of EventTime to get sub second resolution of log event date-time -->
        <useEventTime>true</useEventTime>

        <encoder>
            <pattern><![CDATA[REQUEST FROM %remoteIP ON %date{yyyy-MM-dd HH:mm:ss,UTC} UTC // %responseHeader{X-UOW} // %responseHeader{X-RequestId}    %n
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
%fullRequest
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
%fullResponse
                ]]>
            </pattern>
        </encoder>
    </appender>

    <appender-ref ref="FLUENCY"/>
</configuration>

We use ch.qos.logback.more.appenders.FluencyLogbackAppender from logback-more-appenders to connect logback-access to Fluency. Both libraries have to be added as dependencies:

<dependency>
    <groupId>org.komamitsu</groupId>
    <artifactId>fluency</artifactId>
    <version>1.8.1</version>
</dependency>
<dependency>
    <groupId>com.sndyuk</groupId>
    <artifactId>logback-more-appenders</artifactId>
    <version>1.5.0</version>
</dependency>
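
The buffer settings in the logback-access.xml above are raw byte counts, which are easy to misread next to comments that quote the defaults in MB. The snippet below is only a quick arithmetic check written in Python; it is not part of the logging setup, and the dictionary simply copies the three size values from the configuration.

```python
# Byte values copied from the logback-access.xml shown earlier.
SETTINGS = {
    "bufferChunkInitialSize": 2097152,
    "bufferChunkRetentionSize": 16777216,
    "maxBufferSize": 268435456,
}

BYTES_PER_MIB = 1024 * 1024


def to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return size_in_bytes / BYTES_PER_MIB


if __name__ == "__main__":
    for name, value in SETTINGS.items():
        print(f"{name}: {to_mib(value):g} MiB")
```

With these values the example configuration uses 2 MiB initial chunks, flushes at 16 MiB, and caps the total buffer at 256 MiB.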

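Fluency has many buffer-related settings that you may need to tune; they are explained in the Fluency documentation referenced in the configuration above. For this article, we will focus on tag, remoteHost, and port.

tag is used to label events. We will use it to match our events in Fluentd so that we can parse, filter, and forward them to Elasticsearch.
remoteHost is where the events will be sent. In this case we have a local Fluentd, so we use 'localhost'.
port is the port Fluentd listens on.

encoder.pattern defines the layout of the event. It works like a regular log pattern and you can use placeholders, but MDC data is not available until a pending commit is released. Here is an example of how our event will look: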

REQUEST FROM 69.28.94.231 ON 2018-10-30 00:00:00 UTC // myapp-node-00-1540857599992 // h5hSUaVHvr
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
POST /my/app/path HTTP/1.1
X-Forwarded-Proto: https
X-Forwarded-For: 69.28.94.231
Host: my.company.com
Content-Length: 30
Content-Type: application/json

{"message": "This is the body of the request" }
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
HTTP/1.1 200 OK
X-RequestId: h5hSUaVHvr
X-UOW: myapp-node-00-1540857599992
Date: Mon, 29 Oct 2018 23:59:59 GMT
Content-Type: application/json; charset=UTF-8

{"message": "This is the body of the response", "status": "Okey!"}

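Forwarding logs from the local Fluentd to the remote Fluentd
We configure the local Fluentd to process our events and forward them to the Fluentd aggregator. The configuration file (by default /etc/td-agent/td-agent.conf):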

<source>
    @type forward
    port 20001
</source>

<filter accesslog>
    @type parser
    key_name msg
    reserve_data false
    <parse>
        @type multiline
        format_firstline /^REQUEST FROM/
        format1 /REQUEST FROM (?<request.ip>[^ ]*) ON (?<time>\d{4}-\d{2}-\d{2} \d{2}\:\d{2}\:\d{2} [^ ]+) // (?<request.uow>[^ ]*) // (?<request.id>[^ ]*)\n/
        format2 />{49}\n/
        format3 /(?<request.method>[^ ]*) (?<request.path>[^ ]*) (?<request.protocol>[^ ]*)\n/
        format4 /(?<request.headers>(?:.|\n)*?)\n\n/
        format5 /(?<request.body>(?:.|\n)*?)\n/
        format6 /<{49}\n/
        format7 /(?<response.protocol>[^ ]*) (?<response.status.code>[^ ]*) (?<response.status.description>[^\n]*)\n/
        format8 /(?<response.headers>(?:.|\n)*?)\n\n/
        format9 /(?<response.body>(?:.|\n)*?)\n\Z/
    </parse>
</filter>

# Parse request.headers="Header: Value\nHeader: Value\n" into an object request.headers={"Header": "Value", "Header": "Value"}
<filter accesslog>
  @type record_transformer
  enable_ruby true
  renew_record false
  auto_typecast true
  <record>
    hostname "#{Socket.gethostname}"
    request.headers ${Hash[record["request.headers"].each_line.map { |l| l.chomp.split(': ', 2) }]}
    response.headers ${Hash[record["response.headers"].each_line.map { |l| l.chomp.split(': ', 2) }]}
  </record>
</filter>

<match accesslog>
    @type forward
    send_timeout 5s
    recover_wait 10s
    hard_timeout 30s
    flush_interval 5s
    <server>
        name elastic-node-00
        host elastic-node-00
        port 24224
        weight 100
    </server>
</match>

<match **>
    @type file
    path /tmp/fluentd/output/messages
</match>

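Highlights:

source.port is the same port we configured in logback-access.xml for sending access log events.
The filter and match tags carry the 'accesslog' keyword. This is the tag I mentioned earlier. We are using an exact match, but a pattern could be used instead.
The match tag forwards our events to the Fluentd aggregator on host 'elastic-node-00', which listens on port 24224.
Filters are applied in order.
filter.parse contains the regular expressions that parse our event. The group names (such as response.body or request.method) become JSON attributes after filtering. For example, our sample event will look like this after each filter: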

First filter
{
 ...
 "time": "2018-10-30 00:00:00 UTC",
 "request.ip": "69.28.94.231",
 "request.uow": "myapp-node-00-1540857599992",
 "request.id": "h5hSUaVHvr",
 "request.method": "POST",
 "request.path": "/my/app/path",
 "request.protocol": "HTTP/1.1",
 "request.headers": "X-Forwarded-Proto: https\nX-Forwarded-For: 69.28.94.231\nHost: my.company.com\nContent-Length: 30\nContent-Type: application/json",
 "request.body": "{\"message\": \"This is the body of the request\" }",
 "response.protocol": "HTTP/1.1",
 "response.status.code": "200",
 "response.status.description": "OK",
 "response.headers": "X-RequestId: h5hSUaVHvr\nX-UOW: myapp-node-00-1540857599992\nDate: Mon, 29 Oct 2018 23:59:59 GMT\nContent-Type: application/json; charset=UTF-8",
 "response.body": "{\"message\": \"This is the body of the response\", \"status\": \"Okey!\"}"
 ...
}
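
To experiment with the expressions of the first filter outside Fluentd, the following Python sketch chains rough equivalents of format1 through format9 into one pattern and applies it to a shortened version of the sample event. It is an approximation, not what Fluentd runs: Fluentd uses Ruby regular expressions, and Python does not allow dots in group names, so the groups use underscores and are renamed afterwards. The function name parse_event is made up for this example.

```python
import re

# Python rendering of format1..format9 from the <parse> section.
# Group names use "_" instead of "." because Python forbids dots in names.
PATTERN = re.compile(
    r"REQUEST FROM (?P<request_ip>[^ ]*) ON "
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [^ ]+) // "
    r"(?P<request_uow>[^ ]*) // (?P<request_id>[^ ]*)\n"
    r">{49}\n"
    r"(?P<request_method>[^ ]*) (?P<request_path>[^ ]*) (?P<request_protocol>[^ ]*)\n"
    r"(?P<request_headers>(?:.|\n)*?)\n\n"
    r"(?P<request_body>(?:.|\n)*?)\n"
    r"<{49}\n"
    r"(?P<response_protocol>[^ ]*) (?P<response_status_code>[^ ]*) "
    r"(?P<response_status_description>[^\n]*)\n"
    r"(?P<response_headers>(?:.|\n)*?)\n\n"
    r"(?P<response_body>(?:.|\n)*?)\n\Z"
)

# Shortened copy of the sample event; the trailing "" adds the final newline.
EVENT = "\n".join([
    "REQUEST FROM 69.28.94.231 ON 2018-10-30 00:00:00 UTC // myapp-node-00-1540857599992 // h5hSUaVHvr",
    ">" * 49,
    "POST /my/app/path HTTP/1.1",
    "X-Forwarded-Proto: https",
    "Host: my.company.com",
    "Content-Type: application/json",
    "",
    '{"message": "This is the body of the request" }',
    "<" * 49,
    "HTTP/1.1 200 OK",
    "X-RequestId: h5hSUaVHvr",
    "Content-Type: application/json; charset=UTF-8",
    "",
    '{"message": "This is the body of the response", "status": "Okey!"}',
    "",
])


def parse_event(text):
    """Return a dict keyed like the Fluentd record, or None if nothing matches."""
    match = PATTERN.match(text)
    if match is None:
        return None
    return {name.replace("_", "."): value for name, value in match.groupdict().items()}


record = parse_event(EVENT)

if __name__ == "__main__":
    for key, value in record.items():
        print(f"{key}: {value!r}")
```

At this stage the headers are still single strings with embedded newlines; the second filter turns them into objects.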
Second filter
{
 ...
 "time": "2018-10-30 00:00:00 UTC",
 "request.ip": "69.28.94.231",
 "request.uow": "myapp-node-00-1540857599992",
 "request.id": "h5hSUaVHvr",
 "request.method": "POST",
 "request.path": "/my/app/path",
 "request.protocol": "HTTP/1.1",
 "request.headers": {
       "X-Forwarded-Proto": "https",
       "X-Forwarded-For": "69.28.94.231",
       "Host": "my.company.com",
       "Content-Length": "30",
       "Content-Type": "application/json"
      },
 "request.body": "{\"message\": \"This is the body of the request\" }",
 "response.protocol": "HTTP/1.1",
 "response.status.code": "200",
 "response.status.description": "OK",
 "response.headers": {
       "X-RequestId": "h5hSUaVHvr",
       "X-UOW": "myapp-node-00-1540857599992",
       "Date": "Mon, 29 Oct 2018 23:59:59 GMT",
       "Content-Type": "application/json; charset=UTF-8"
      },
 "response.body": "{\"message\": \"This is the body of the response\", \"status\": \"Okey!\"}"
 ...
}
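
The Ruby expression in the record_transformer filter splits each header line on the first ": " and builds a hash from the resulting pairs. A rough Python equivalent, for illustration only (the helper name headers_to_dict is hypothetical), could look like this:

```python
def headers_to_dict(raw_headers: str) -> dict:
    """Turn "Name: value" lines into a dict, splitting on the first ': ' only.

    Mirrors l.chomp.split(': ', 2) from the Fluentd configuration: a value
    that itself contains ': ' stays intact. Like the Ruby version, it assumes
    every line contains the separator and raises an error otherwise.
    """
    return dict(line.split(": ", 1) for line in raw_headers.splitlines())


RESPONSE_HEADERS = (
    "X-RequestId: h5hSUaVHvr\n"
    "X-UOW: myapp-node-00-1540857599992\n"
    "Date: Mon, 29 Oct 2018 23:59:59 GMT\n"
    "Content-Type: application/json; charset=UTF-8"
)

if __name__ == "__main__":
    print(headers_to_dict(RESPONSE_HEADERS))
```

Limiting the split to one separator matters for headers whose values may contain ": " themselves; without the limit such a value would be cut into several pieces.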

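Storing the collected logs in Elasticsearch

This part is straightforward: we have to receive the events and forward them to Elasticsearch. The Fluentd configuration file should look like this: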

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match accesslog>
  @type elasticsearch
  scheme http
  host localhost
  port 9200
  logstash_format true
  validate_client_version true
</match>

Highlights:

source.port is the same port we configured in match.server.port.
match.logstash_format makes the plugin write to Elasticsearch indices named logstash-YYYY.MM.DD.
match.port is the port the Elasticsearch API listens on.
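
To make the index naming concrete, here is a small Python sketch of how a daily index name can be derived from an event timestamp. It is an assumption-laden illustration, not the plugin's code: the parameter names prefix and dateformat are chosen to mirror what I understand to be the plugin's logstash_prefix and logstash_dateformat options and their defaults, and the function name logstash_index is made up.

```python
from datetime import datetime, timezone


def logstash_index(event_time: datetime,
                   prefix: str = "logstash",
                   dateformat: str = "%Y.%m.%d") -> str:
    """Build a daily index name from a timezone-aware event timestamp (in UTC)."""
    day = event_time.astimezone(timezone.utc).strftime(dateformat)
    return f"{prefix}-{day}"


if __name__ == "__main__":
    # Timestamp of the sample request used throughout this article.
    sample = datetime(2018, 10, 30, 0, 0, 0, tzinfo=timezone.utc)
    print(logstash_index(sample))
```

One index per day keeps retention simple: old access logs can be dropped by deleting whole indices.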

Viewing the data in Kibana
Now you only need to open Kibana to browse all of your access logs.
