Collecting and Analyzing Online Log Data in Practice: ELK

Published by weixin_34357887 on 2018-10-15


This article comes from the NetEase Cloud Community.

Author: Tian Duoduo


User Behavior Statistics (UBS), commonly known as event tracking, is an indispensable part of any internet product. For product managers and operations staff, more tracking points with wider coverage are naturally better. A user behavior analysis system reveals users' basic habits and sheds light on their intent. Supplemented with behavioral data, you can build fine-grained, complete user profiles, run personalized marketing for different user segments, and improve the user experience. It lets product designers accurately assess conversion along user paths, the quality of a redesign, and the impact of a new feature, and it lets operations staff run targeted campaigns and measure their results.

In the early phase of my current project, the front end and back end agreed on a set of fields and used them to track user actions, storing the data in DDB. Once user behavior logs grow large, that approach clearly does not scale, so we replaced the old pipeline with the mature ELK stack. This article covers building the ELK cluster, wrapping the basic API, and some pitfalls we hit along the way.

Elasticsearch

Elasticsearch is an open-source, distributed, RESTful search engine built on Lucene. Designed for the cloud, it offers real-time search under load, stability, speed, and easy installation and use. (Having used SolrCloud before, I would say ES's intrusiveness on the user is negligible by comparison.)

Cluster installation:

First configure elasticsearch.yml on every machine; the key settings are:

#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-commenta-event # must be identical on every machine in the cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es-node-c1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /opt/elk/elasticsearch-5.1.1/data
#
# Path to log files:
#
path.logs: /opt/elk/elasticsearch-5.1.1/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 192.168.140.133 # this machine's host/IP
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["192.168.140.133",  "192.168.140.134", "192.168.140.135"] # list of cluster hosts

# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 2
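The `discovery.zen.minimum_master_nodes` value follows the quorum formula in the comment above: with 3 master-eligible nodes, 3 / 2 + 1 = 2. A minimal sketch of the calculation (the helper name is mine):

```java
public class Quorum {
    // Quorum of master-eligible nodes needed to avoid split brain:
    // integer division of the node count by two, plus one.
    static int minimumMasterNodes(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(minimumMasterNodes(3)); // 2, matching the config above
    }
}
```

Note that with an even node count (e.g. 4) the quorum is 3, so sizing clusters with an odd number of master-eligible nodes wastes less capacity.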

Cluster startup:

Q1:can not run elasticsearch as root

Because ES was installed as root on a local VM, this error is thrown at startup. The fix is to create a dedicated user and group:

groupadd esgroup
useradd esuser -g esgroup -p espassword
chown -R esuser:esgroup /opt/elk/ # give the new user ownership of the ES install and data directories

Then switch to esuser and run the startup command.

Q2:Unsupported major.minor version 52.0

The installed ES version is 5.1.1, which requires JDK 1.8. Install JDK 1.8, set the environment variables, and the startup command runs.
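For context, class-file major version 52 is the format produced by JDK 1.8, so this error simply means the JVM running ES is older than Java 8. A tiny illustrative mapping (hypothetical helper, not part of ES):

```java
public class ClassFileVersions {
    // Map a class-file major version to the JDK release that produces it.
    static String jdkForMajor(int major) {
        switch (major) {
            case 50: return "Java 6";
            case 51: return "Java 7";
            case 52: return "Java 8";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // "Unsupported major.minor version 52.0" => the classes need Java 8.
        System.out.println(jdkForMajor(52));
    }
}
```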

Q3:max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]

ES needs more virtual-memory map areas than the default limit allows. Raise the limit (and add the same line to /etc/sysctl.conf to persist it across reboots):

sysctl -w vm.max_map_count=262144

With every ES node configured, the cluster can actually be started. As you bring up the machines one by one, check the logs to confirm nodes joining the cluster. For example:

curl '192.168.140.133:9200'
{
    "name": "es-node-c1",
    "cluster_name": "es-commenta-event",
    "cluster_uuid": "wi_1VOWoRqecjIht3Ra3mg",
    "version": {
        "number": "5.1.1",
        "build_hash": "5395e21",
        "build_date": "2016-12-06T12:36:15.409Z",
        "build_snapshot": false,
        "lucene_version": "6.3.0"
    },
    "tagline": "You Know, for Search"
}

We have 3 VMs; by default ES gives an index 5 primary shards. Instead, create the index with 3 primary shards, each with one replica:

curl -XPUT 'http://192.168.140.133:9200/commenta' -d '{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'
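The resulting shard count follows from primaries × (1 + replicas): 3 primaries with one replica each yields 6 shards spread over the 3 nodes. A sketch, with the method name my own:

```java
public class ShardMath {
    // Total shards an index occupies: the primaries plus one copy per replica.
    static int totalShards(int primaries, int replicas) {
        return primaries * (1 + replicas);
    }

    public static void main(String[] args) {
        System.out.println(totalShards(3, 1)); // 6
    }
}
```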

Cluster status:

curl 'http://192.168.140.133:9200/_cluster/health?pretty'
{
  "cluster_name" : "es-commenta-event",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 3,
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}

Installing plugins:

To translate SQL-like queries into Elasticsearch DSL:
bin/elasticsearch-plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/5.1.1.0/elasticsearch-sql-5.1.1.0.zip
X-Pack integrates security, monitoring, and more, and is a very useful plugin, but it is a commercial, paid product.
bin/elasticsearch-plugin install x-pack

Logstash

Logstash is a lightweight log collection and processing framework. It conveniently gathers scattered, heterogeneous logs, applies custom processing, and ships them to a designated destination.

Installation:

Download Logstash 5.1.1 from the official site.

Startup:

1. Starting without a config file

bin/logstash -e 'input{ stdin{} } output{ stdout{} }'
Sending Logstash's logs to /home/webedit/logstash/logstash-5.1.1/logs which is now configured via log4j2.properties
The stdin plugin is now waiting for input:
[2017-04-27T15:47:38,023][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-04-27T15:47:38,039][INFO ][logstash.pipeline        ] Pipeline main started
[2017-04-27T15:47:38,115][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
hello elastic
2017-04-27T07:49:00.966Z localhost.localdomain hello elastic

Logstash picks up whatever is typed on the command line and emits it as an event.

2. Starting with a config file

Suppose the log records we need to collect look like this:

INFO  [17.04.27 16:12:12][com.netease.mail.vip.commenta.filter.EventLogFilter]: |44171|1|1|1|1493280732227|0.0|123.58.160.131|133001|COMMENTA-B54C43F5-4FCB-4D10-B9EC-67862FBF0055|1493280732440|huiping_mp|0.7.0|null|1|
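Before writing grok patterns, it helps to view the record as a pipe-delimited field list: everything after the log prefix splits on `|`. A plain-Java sketch of that split (for illustration only; the actual pipeline uses grok, and the class name is mine):

```java
public class EventLogSplit {
    // Split a pipe-delimited event record into its fields;
    // -1 keeps trailing empty fields so positions stay stable.
    static String[] fields(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String line = "INFO  [17.04.27 16:12:12][com.netease.mail.vip.commenta.filter.EventLogFilter]: "
                + "|44171|1|1|1|1493280732227|0.0|123.58.160.131|133001"
                + "|COMMENTA-B54C43F5-4FCB-4D10-B9EC-67862FBF0055|1493280732440"
                + "|huiping_mp|0.7.0|null|1|";
        String[] f = fields(line);
        System.out.println(f[1]); // "44171"  -> id
        System.out.println(f[8]); // "133001" -> userId
    }
}
```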

How do we collect logs in this format? We match them with regular expressions; the config file is as follows:

input {
  file {
    type => "commenta"
    path => ["/home/logs/commenta/stdout.log"]
    start_position => "beginning"
    codec => plain { charset => "Windows-1252" }
  }
}

filter {
  if [type] == "commenta" {
    grok {
      match => { "message" => "%{DATA:className}\|%{BASE16FLOAT:id}\|%{DATA:eventType:int}\|%{DATA:page:int}\|%{DATA:eventFrom:int}\|%{DATA:eventTime}\|%{BASE16FLOAT:eventWeight}\|%{DATA:ip}\|%{BASE16FLOAT:userId}\|%{DATA:uniqueCode}\|%{DATA:createTime}\|%{DATA:clientFrom}\|%{DATA:appVersion}\|%{DATA:data}\|%{DATA:eventStep:int}\|"}
      remove_field => ["message"]
    }
  }

  if "_grokparsefailure" in [tags] { # drop events that do not match the pattern
    drop {}
  }

  mutate { # data type conversions
    convert => [ "eventWeight", "float" ]
    convert => [ "id", "float" ]
    convert => [ "userId", "float" ]
  }
}

output {
  stdout { codec => rubydebug } # print each behavior-log event to the console

  elasticsearch {
    hosts => ["192.168.140.133:9200","192.168.140.134:9200","192.168.140.135:9200"]
    index => "commenta"
  }
}

Now start Logstash and see the result:

./bin/logstash -f ./config/logstash.conf
{
     "appVersion" => "0.7.0",
           "data" => "null",
             "ip" => "XXXXXXXXX",
      "className" => "INFO  [17.04.27 16:12:12][com.netease.mail.vip.commenta.filter.EventLogFilter]: ",
      "eventType" => 1,
           "type" => "commenta",
    "eventWeight" => 0.0,
         "userId" => 133001.0,
           "tags" => [],
           "path" => "/home/logs/commenta/stdout.log",
     "@timestamp" => 2017-04-27T08:18:58.245Z,
     "uniqueCode" => "COMMENTA-B54C43F5-4FCB-4D10-B9EC-67862FBF0055",
     "createTime" => "1493280732440",
       "@version" => "1",
           "host" => "testfb-m126-161",
      "eventTime" => "1493280732227",
      "eventStep" => 1,
     "clientFrom" => "huiping_mp",
             "id" => 44171.0,
           "page" => 1,
      "eventFrom" => 1
}

The console output shows that Logstash has collected the behavior-log record (some data is masked). The same data can also be viewed in Kibana, covered in the next section.

3. Startup issues

Q1:Unsupported major.minor version 52.0

Logstash 5.1.1 requires a JDK 1.8 environment. Install JDK 1.8, set the environment variables, and the startup command runs.

Q2:unknown setting host for elasticsearch

When writing the Logstash config file, watch for option renames between versions; for example, the elasticsearch output's `host` setting became `hosts`.

Kibana

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You can use Kibana to search, view, and interact with data stored in Elasticsearch indices, and easily perform advanced analysis and visualization using a variety of charts, tables, and maps.

Installation:

Download Kibana 5.1.1 from the official site.

Startup:

The main configuration:

# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601

# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "192.168.140.133"

# Enables you to specify a path to mount Kibana at if you are running behind a proxy. This only affects
# the URLs generated by Kibana, your proxy is expected to remove the basePath value before forwarding requests
# to Kibana. This setting cannot end in a slash.
#server.basePath: ""

# The maximum payload size in bytes for incoming server requests.
#server.maxPayloadBytes: 1048576

# The Kibana server's name.  This is used for display purposes.
#server.name: "your-hostname"

# The URL of the Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://192.168.140.133:9200"
.......

Once Kibana starts successfully, we can monitor the commenta* index (created earlier during the ES setup):

bin/kibana

At this point, the log data collected by Logstash is visible.

Of course, we can also configure some statistics:

For a more intuitive view, drag these visualizations onto a Dashboard.


At this point, ELK is fully set up and provides some basic functionality. But there are statistics Kibana cannot produce; for those, our application code has to take over.

Java API

HandleEsClientServer

/* ES server list */
private String serverList;
/* Setting client.transport.sniff to true makes the client sniff the cluster
   state: the addresses of the other machines in the cluster are added to the
   client automatically, including nodes that join the cluster later. */
private Boolean sniff = false;
/* Cluster name */
private String clusterName;
/* Client connection */
private Client client;
/* Basic search utility */
private SearchDao searchDao;

public HandleEsClientServer() {
}

public HandleEsClientServer(String serverList, Boolean sniff, String clusterName) {
    this.serverList = serverList;
    this.sniff = sniff;
    this.clusterName = clusterName;
}

@Override
public void afterPropertiesSet() throws Exception {

    logger.info("es server start at time={}, serverList={}, clusterName={}, sniff={}",
            DateUtil.toStr(new Date(), DateUtil.YYYY_MM_DD_HH_MM_SS),
            serverList, clusterName, sniff);
    if (this.getServerList() == null || this.getServerList().length() == 0) {
        logger.error("es serverList is null...");
        return;
    }

    List<String> clusterList = Splitter.on(",").trimResults().omitEmptyStrings().splitToList(this.getServerList());

    List<TransportAddress> transportAddresses = new ArrayList<>();
    for (String cluster : clusterList) {
        List<String> host = Splitter.on(":").trimResults().omitEmptyStrings().splitToList(cluster);
        String ip = host.get(0);
        Integer port = Integer.valueOf(host.get(1));
        try {
            transportAddresses.add(new InetSocketTransportAddress(InetAddress.getByAddress(getIpByte(ip)), port == null ? 9300 : port));
        } catch (UnknownHostException e) {
            logger.error("init es client error={} at time={} ", e, DateUtil.toStr(new Date(), DateUtil.YYYY_MM_DD_HH_MM_SS));
            return;
        }
    }

    // startup settings
    Settings settings = Settings.builder()
            .put("cluster.name", clusterName)
            .put("client.transport.sniff", sniff)
            .build();

    // initialize the client
    this.client = new PreBuiltTransportClient(settings)
            .addTransportAddresses(transportAddresses.toArray(new TransportAddress[transportAddresses.size()]));
    this.searchDao = new SearchDao(this.client);

    logger.info("es server start success at time={}", DateUtil.toStr(new Date(), DateUtil.YYYY_MM_DD_HH_MM_SS));
}
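afterPropertiesSet parses a comma-separated `host:port` list with Guava's Splitter before building transport addresses. A dependency-free sketch of that parsing step, using JDK-only string handling and names of my own:

```java
import java.util.ArrayList;
import java.util.List;

public class ServerListParser {
    // Parse "ip:port,ip:port,..." into {ip, port} pairs, defaulting the port
    // to 9300 (the ES transport port) when it is omitted.
    static List<String[]> parse(String serverList) {
        List<String[]> out = new ArrayList<>();
        for (String entry : serverList.split(",")) {
            String host = entry.trim();
            if (host.isEmpty()) continue; // skip empty entries, like omitEmptyStrings()
            String[] parts = host.split(":");
            String ip = parts[0];
            String port = parts.length > 1 ? parts[1] : "9300";
            out.add(new String[]{ip, port});
        }
        return out;
    }

    public static void main(String[] args) {
        for (String[] hp : parse("192.168.140.133:9300, 192.168.140.134,")) {
            System.out.println(hp[0] + " -> " + hp[1]);
        }
    }
}
```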

HandleEsData

/**
 * Query results via the elasticsearch-sql plugin's SQL syntax.
 * @param query
 * @return
 * @throws SqlParseException
 * @throws SQLFeatureNotSupportedException
 */
public SqlResponse selectBySQL(String query) throws SqlParseException, SQLFeatureNotSupportedException {

    logger.info("selectBySQL, query={}", query);
    try {
        SqlElasticSearchRequestBuilder select = (SqlElasticSearchRequestBuilder) searchDao.explain(query).explain();
        return new SqlResponse((SearchResponse) select.get());
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
    }
    return null;
}

/**
 * Bulk-insert data, using each object's id field as the document id.
 * @param _index
 * @param _type
 * @param data
 * @param generate_id
 * @param <T>
 * @return
 */
public <T> BulkResponse batchObjIndex(String _index, String _type, List<T> data, boolean generate_id) {

    logger.info("batchObjIndex, index={}, type={}, data={}, generate_id={}", _index, _type, data, generate_id);

    Assert.notEmpty(data, "data is not allowed empty");

    BulkRequestBuilder bulkRequest = client.prepareBulk();
    for (T tObj : data) {
        Class<?> clazz = tObj.getClass();
        String json = JSONObject.toJSONString(tObj, SerializerFeature.WriteMapNullValue);
        if (generate_id) {
            // let ES generate the document id
            bulkRequest.add(client.prepareIndex(_index.toLowerCase(), _type.toLowerCase()).setSource(json));
        } else {
            try {
                // use the object's own id field as the document id
                Object value = clazz.getDeclaredMethod("getId").invoke(tObj);
                String _id = String.valueOf(value);
                bulkRequest.add(client.prepareIndex(_index.toLowerCase(), _type.toLowerCase(), _id).setSource(json));
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            }
        }
    }
    return bulkRequest.execute().actionGet();
}
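When `generate_id` is false, `batchObjIndex` derives each document id by reflectively invoking the object's `getId()`. A self-contained sketch of just that step (the `Event` DTO is a hypothetical example of mine):

```java
public class DocIdExtractor {
    // Derive the ES document id by invoking the object's getId() reflectively,
    // mirroring what batchObjIndex does when generate_id is false.
    static String docId(Object obj) {
        try {
            Object value = obj.getClass().getDeclaredMethod("getId").invoke(obj);
            return String.valueOf(value);
        } catch (Exception e) {
            return null; // the object has no accessible getId()
        }
    }

    // Hypothetical event DTO used only for the demo.
    public static class Event {
        private final long id;
        public Event(long id) { this.id = id; }
        public long getId() { return id; }
    }

    public static void main(String[] args) {
        System.out.println(docId(new Event(44171L))); // "44171"
    }
}
```

Because the lookup uses `getDeclaredMethod`, any DTO passed to the batch indexer must declare `getId()` itself, not merely inherit it.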


