How to write a Logstash configuration file for collecting JSON-format log files

Posted by 後開啟撒打發了 on 2017-11-22

1. Log format

{"10190":0,"10071":0,"10191":0,"10070":0,"48":"136587","type":"136587","10018":0}

Suppose we collect this log with only a minimal configuration, as follows:

input {
    file {

        path => ["/home/elk/logstash-5.6.3/request"]
        type => "chenxun"
    }
}


output {

    stdout {
        codec => rubydebug
    }

    elasticsearch {
        hosts => "192.168.2.181:9200"

    }
}

The collected result is then:

{
    "_index": "logstash-2017.11.22",
    "_type": "chenxun",
    "_id": "AV_iTR0AM1H1mf2je0nC",
    "_version": 1,
    "_score": 1,
    "_source": {
        "@version": "1",
        "host": "Ubuntu-20170424",
        "path": "/home/elk/logstash-5.6.3/request",
        "@timestamp": "2017-11-22T05:57:05.383Z",
        "message": "{\"10190\":0,\"10071\":0,\"10191\":0,\"10070\":0,\"48\":\"136587\",\"type\":\"136587\",\"10018\":0}",
        "type": "chenxun"
    }
}

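In other words, the whole JSON record is stored as a single string under "message". That is not what we want: Logstash should parse the JSON record automatically and store each field in Elasticsearch. The following shows how to configure this.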

1. Set codec => json directly

input {
    file {

        path => ["/home/elk/logstash-5.6.3/request"]
        type => "chenxun"
        codec => json

    }   
}

Now look at the result: the JSON has been parsed into individual fields.

{
    "_index": "logstash-2017.11.22",
    "_type": "136587",
    "_id": "AV_iXHbGM1H1mf2jfF4d",
    "_version": 1,
    "_score": 1,
    "_source": {
        "48": "136587",
        "10018": 0,
        "10070": 0,
        "10071": 0,
        "10190": 0,
        "10191": 0,
        "path": "/home/elk/logstash-5.6.3/request",
        "@timestamp": "2017-11-22T06:13:51.361Z",
        "@version": "1",
        "host": "Ubuntu-20170424",
        "type": "136587"
    }
}

You can also set the character encoding, which matters when collecting logs that contain Chinese text:

codec => json {
    charset => "UTF-8"
}
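
As a sketch of where this setting sits in a complete input block: the json codec's charset option names the encoding of the incoming text, which Logstash then converts to UTF-8. UTF-8 should already be the default, so an explicit value mostly matters for files written in another encoding. The GBK value below is an assumption for illustration, not something from the original setup.

```conf
input {
    file {
        path => ["/home/elk/logstash-5.6.3/request"]
        type => "chenxun"
        # Assumption: the log file is GBK-encoded; use the file's real encoding here
        codec => json {
            charset => "GBK"
        }
    }
}
```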

2. Use the json filter

The configuration is as follows:

input {
    file {

        path => ["/home/elk/logstash-5.6.3/request"]

    }
}

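filter {
    json {
        # Parse the JSON text held in the "message" field
        source => "message"
        # Optional: put the parsed keys under "doc" instead of the event root
        #target => "doc"
        # Optional: drop the raw line after it has been parsed
        #remove_field => ["message"]
    }
}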

output {

    stdout {
        codec => rubydebug
    }

    elasticsearch {
        hosts => "192.168.2.181:9200"

    }
}

Output:

{
    "_index": "logstash-2017.11.22",
    "_type": "136587",
    "_id": "AV_igupKM1H1mf2jfxm2",
    "_version": 1,
    "_score": 1,
    "_source": {
        "48": "136587",
        "10018": 0,
        "10070": 0,
        "10071": 0,
        "10190": 0,
        "10191": 0,
        "path": "/home/elk/logstash-5.6.3/request",
        "@timestamp": "2017-11-22T06:55:51.335Z",
        "@version": "1",
        "host": "Ubuntu-20170424",
        "message": "{\"10190\":0,\"10071\":0,\"10191\":0,\"10070\":0,\"48\":\"136587\",\"type\":\"136587\",\"10018\":0}",
        "type": "136587"
    }
}

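As you can see, the original record is kept and the parsed fields are stored alongside it. If you are sure you do not need the original record, add the setting remove_field => ["message"].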
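
A minimal sketch of the filter with that option enabled, assuming the same input and output sections as in the configuration above. remove_field is one of the common filter options that are applied only when the filter succeeds, so a line that fails to parse should keep its message field (and, by default, be tagged _jsonparsefailure).

```conf
filter {
    json {
        # Parse the JSON text held in the "message" field
        source => "message"
        # Drop the raw line once parsing has succeeded
        remove_field => ["message"]
    }
}
```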

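Pay particular attention to the content of the JSON data being parsed. When Logstash inserts an event into ES, _source normally carries the type, host, and path fields that Logstash itself sets (host and path come from the file input, and type from the type option when one is configured). If the JSON record also contains type, host, or path keys, the parsed values overwrite those fields. The type field matters most, because it is also used as the type in index/type: once it is overwritten, the document is inserted under the type taken from the JSON record rather than the type value set in the Logstash config. This is why _type is "136587" instead of "chenxun" in the parsed results above.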
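
One way to sidestep the overwrite with the filter approach is the json filter's target option, which nests the parsed keys under a separate field. This is a minimal sketch; the field name doc is an arbitrary choice, and any unused name would do.

```conf
filter {
    json {
        source => "message"
        # Parsed keys land under "doc" (for example [doc][type]),
        # so the event's own type, host, and path are left untouched
        target => "doc"
    }
}
```

With this in place, the type set in the input section (for example "chenxun") should remain the event's type, while the value from the JSON record is available as [doc][type]. The trade-off is that every parsed field is then addressed through the doc prefix in queries.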
