ELK Log Analysis Platform: The Elasticsearch Search Engine Cluster

Posted by 1874 on 2020-10-01

Original article: https://www.cnblogs.com/qiuhom-1874/p/13758006.html

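　　I. Introduction

　　What is ELK? ELK is an acronym formed from the initials of three pieces of software: Elasticsearch, Logstash, and Kibana. Elasticsearch is the search engine that stores and searches the data. Logstash is a data collection and processing platform that can collect, parse, tokenize, and filter specific data, and it is most often used for logs. Kibana visualizes the processed data and provides a web interface that makes it easy to look up what we need in Elasticsearch. Elasticsearch itself is a highly scalable open-source full-text search and analytics engine: it offers near-real-time full-text search, runs as a distributed system for high availability, exposes a RESTful API, and can handle log data at large scale.

　　Elasticsearch is written in Java on top of Lucene. Lucene is a mature, free, open-source search library for Java; in essence it only offers a programming API, so anyone who wants a search engine built on Lucene has to develop the surrounding shell that calls the Lucene API to perform full-text indexing and search. Elasticsearch is exactly such a search engine, using Lucene as its information retrieval library.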

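　　Basic Elasticsearch components

　　Index: a container for documents, that is, a collection of documents with similar properties, comparable to a table in a relational database. Index names in Elasticsearch must be lowercase.

　　Type: a logical partition inside an index whose meaning depends entirely on the user's needs. Broadly speaking, a type is a predefined set of documents that share the same fields. Older releases allowed several types per index; an index created in Elasticsearch 6.x, the version installed below, can hold only one type.

　　Document: the atomic unit that Lucene indexes and searches. A document is a container of one or more fields and is represented as JSON. A field consists of a name and one or more values; a field with several values is usually called a multi-valued field.

　　Mapping: before raw content is stored as a document it has to be analyzed, for example tokenized and stripped of certain words; the mapping defines how that analysis is carried out. Beyond that, ES (Elasticsearch) mappings also provide features such as sorting on the contents of a field.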

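　　Elasticsearch cluster components

　　Cluster: an ES cluster is identified by its cluster name, which defaults to "elasticsearch". A node uses this name to decide which cluster to join, and a node can belong to only one cluster.

　　Node: a host running a single ES instance. A node stores data and takes part in the cluster's indexing and search operations. It is identified by its node name.

　　Shard: the physical storage unit that an index is split into; each shard is nonetheless a complete, self-contained index. When an index is created, ES (in the 6.x release used here) splits it into 5 primary shards by default; users can choose a different number, but it cannot be changed once the index exists. There are two kinds of shard: primary shards and replicas. Replicas provide data redundancy and load balancing for queries. The number of replicas per primary shard is configurable and can be changed at any time.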
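
　　As a quick check of these numbers, the total count of shard copies for an index is primaries × (1 + replicas per primary). The sketch below is plain shell arithmetic; the function name total_shard_copies is made up for illustration, and the values 5 and 1 assume the 6.x defaults of 5 primaries and 1 replica per primary.

```shell
# total_shard_copies PRIMARIES REPLICAS_PER_PRIMARY
# Prints how many shard copies (primaries plus replicas) an index has.
total_shard_copies() {
  echo $(( $1 * (1 + $2) ))
}

total_shard_copies 5 1   # prints 10
```

　　With the defaults, then, a single index accounts for 10 shard copies spread over the cluster.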

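　　How an ES cluster works

　　On startup, a node looks for the other nodes of the same cluster over 9300/tcp and establishes communication with them. Early releases used multicast for this by default; in the 6.x release used here, discovery is unicast against the hosts listed in discovery.zen.ping.unicast.hosts. The nodes elect one master node, which manages the state of the whole cluster and decides how shards are distributed across it. From the user's point of view, every node can accept and answer any kind of request.

　　A cluster has one of three health states: green, yellow, or red. Green means healthy: every primary and replica shard is allocated as defined. Yellow means degraded: all primary shards are available but at least one replica is not allocated, for example because a node went down and took its shard copies with it; yellow usually returns to green easily once the node comes back or the replicas are rebuilt elsewhere. Red means unhealthy: at least one primary shard is unavailable, for example when 2 of 3 nodes go down and every copy of some shard lived on those two nodes. The data in such a shard cannot be served and may be lost, so red signals a risk of data loss.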
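
　　The rule behind the three colors can be written down in a few lines: Elasticsearch reports red when at least one primary shard is unassigned, yellow when every primary is assigned but some replica is not, and green otherwise. The sketch below only illustrates that decision; the function name health_color and its two arguments are hypothetical and are not part of any ES tool.

```shell
# health_color UNASSIGNED_PRIMARIES UNASSIGNED_REPLICAS
# Any missing primary is red, otherwise any missing replica is yellow,
# otherwise the cluster is green.
health_color() {
  if [ "$1" -gt 0 ]; then
    echo red
  elif [ "$2" -gt 0 ]; then
    echo yellow
  else
    echo green
  fi
}

health_color 0 0   # prints green
health_color 0 5   # prints yellow
health_color 1 0   # prints red
```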

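　　II. Deploying the Elasticsearch cluster

　　Environment

　　When a service runs in distributed or cluster mode, the first rule is to synchronize the clocks of all nodes. Second, name resolution within a cluster cannot, and should not, depend on an external DNS service, because a DNS outage would disrupt communication across the whole cluster; if names need resolving, the hosts file should be the first choice. Third, if nodes need to copy data to one another, SSH trust (key-based login) should be set up between them. These three conditions are the baseline for most clusters.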

Name	IP address	Ports
es1	192.168.0.41	9200/9300
es2	192.168.0.42	9200/9300
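
　　For the hosts-file approach, something like the following sketch could generate the entries. It assumes that node01 and node02, the host names used in the configuration later on, belong to 192.168.0.41 and 192.168.0.42 respectively; the function name cluster_hosts is made up for illustration.

```shell
# Print /etc/hosts entries for the two nodes.
# Assumption: node01 and node02 are the host names of 192.168.0.41 and 192.168.0.42.
cluster_hosts() {
  printf '%s\t%s\n' \
    192.168.0.41 node01 \
    192.168.0.42 node02
}

cluster_hosts
# To apply on each node (as root): cluster_hosts >> /etc/hosts
```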

　　Install the JDK on every node

yum install -y java-1.8.0-openjdk-devel

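　　Note: each ES release has its own JDK requirements; check the official documentation for the JDK version that matches your ES version.

　　Export JAVA_HOME

　　Verify the Java version and the JAVA_HOME environment variable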
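
　　The exact commands for these two steps are not shown here, so the following is only one possible way to do it. It derives JAVA_HOME from wherever javac really lives instead of hard-coding a path; the helper name java_home_from is hypothetical, and persisting the export in a file such as /etc/profile.d/java.sh is an assumption about the setup, not something taken from this article.

```shell
# java_home_from JAVAC_PATH
# Strips the trailing /bin/javac from a resolved javac path.
java_home_from() {
  echo "${1%/bin/javac}"
}

# Resolve the real javac location behind any symlinks, if a JDK is installed.
if command -v javac >/dev/null 2>&1; then
  JAVA_HOME=$(java_home_from "$(readlink -f "$(command -v javac)")")
  export JAVA_HOME
  echo "JAVA_HOME=$JAVA_HOME"
  # java -version writes to stderr; show only its first line.
  java -version 2>&1 | head -n 1
fi
```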

　　Download the Elasticsearch RPM package

[root@node01 ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm
--2020-10-01 20:44:29--  https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm
Resolving artifacts.elastic.co (artifacts.elastic.co)... 151.101.110.222, 2a04:4e42:36::734
Connecting to artifacts.elastic.co (artifacts.elastic.co)|151.101.110.222|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 148681336 (142M) [application/octet-stream]
Saving to: ‘elasticsearch-6.8.12.rpm’
100%[==========================================================================>] 148,681,336  133MB/s   in 1.1s   

2020-10-01 20:45:07 (133 MB/s) - ‘elasticsearch-6.8.12.rpm’ saved [148681336/148681336]

　　Install the Elasticsearch RPM package

[root@node01 ~]# ll
total 145200
-rw-r--r-- 1 root root 148681336 Aug 18 19:38 elasticsearch-6.8.12.rpm
[root@node01 ~]# yum install ./elasticsearch-6.8.12.rpm 
Loaded plugins: fastestmirror
Examining ./elasticsearch-6.8.12.rpm: elasticsearch-6.8.12-1.noarch
Marking ./elasticsearch-6.8.12.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package elasticsearch.noarch 0:6.8.12-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===================================================================================================================================
 Package                         Arch                     Version                    Repository                               Size
===================================================================================================================================
Installing:
 elasticsearch                   noarch                   6.8.12-1                   /elasticsearch-6.8.12                   229 M

Transaction Summary
===================================================================================================================================
Install  1 Package

Total size: 229 M
Installed size: 229 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Creating elasticsearch group... OK
Creating elasticsearch user... OK
  Installing : elasticsearch-6.8.12-1.noarch                                                                                   1/1 
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
 sudo systemctl daemon-reload
 sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
 sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch
  Verifying  : elasticsearch-6.8.12-1.noarch                                                                                   1/1 

Installed:
  elasticsearch.noarch 0:6.8.12-1                                                                                                  

Complete!
[root@node01 ~]#

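　　Edit the configuration file

　　Note: the main ES configuration file is /etc/elasticsearch/elasticsearch.yml. Four settings matter most: cluster.name, node.name, path.data, and path.logs. cluster.name sets the cluster name; hosts rely on it to decide whether they belong to the same cluster, so it must be identical on every node of a cluster. node.name identifies the node and must be unique within the cluster. path.data is the directory where ES stores its data; using the same path on every node makes management easier, and mounting dedicated storage there is recommended. path.logs is the directory for ES log files.

　　Note: bootstrap.memory_lock: true asks ES to lock its process memory, including the JVM heap sized in jvm.options, into RAM at startup so that it cannot be swapped out. It is disabled by default; before enabling it, check that the host has enough memory and that the elasticsearch user is allowed to lock that much memory. network.host sets the IP address ES listens on; 0.0.0.0 means all available local addresses. http.port sets the port that serves client requests. discovery.zen.ping.unicast.hosts lists the hosts to contact by unicast in order to discover other nodes. discovery.zen.minimum_master_nodes sets the minimum number of master-eligible nodes that must be visible before a master can be elected; it defaults to 1. As the comment in the configuration file says, it should be a majority of the master-eligible nodes (total / 2 + 1) to prevent split brain, so the value 1 used in this two-node test cluster does not protect against it.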
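
　　The configuration file's own comment gives the formula for discovery.zen.minimum_master_nodes: total number of master-eligible nodes / 2 + 1. A small sketch of that arithmetic, using integer division, follows; the shell function simply borrows the setting's name for illustration.

```shell
# minimum_master_nodes N
# Majority of N master-eligible nodes: N / 2 + 1 (integer division).
minimum_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

minimum_master_nodes 2   # prints 2
minimum_master_nodes 3   # prints 2
minimum_master_nodes 5   # prints 3
```

　　Note that for two nodes the majority is 2, which means neither node may be down if a master is to be elected; this is why three master-eligible nodes are a common minimum for production clusters.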

　　The complete configuration

[root@node01 ~]# cat /etc/elasticsearch/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: test-els-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node01
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /els/data
#
# Path to log files:
#
path.logs: /els/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["node01", "node02"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
[root@node01 ~]#

　　Create the data and log directories and change their owner and group to elasticsearch
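
　　A minimal sketch of this step, assuming the /els/data and /els/logs paths from the configuration above and the elasticsearch user and group that the RPM creates during installation. The helper name prepare_es_dirs is hypothetical.

```shell
# prepare_es_dirs BASE OWNER
# Creates BASE/data and BASE/logs and hands the whole tree to OWNER (user:group).
prepare_es_dirs() {
  mkdir -p "$1/data" "$1/logs" &&
  chown -R "$2" "$1"
}

# On each node, as root, matching path.data and path.logs in the configuration:
#   prepare_es_dirs /els elasticsearch:elasticsearch
```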

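　　Copy the configuration file to the same location on the other node, change node.name to that node's name, then create the data and log directories there and change their owner and group to elasticsearch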
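
　　One way to do the copy and the rename in a single pass is sketched below. It assumes that Elasticsearch is already installed on node02 from the same RPM and that root can SSH from node01 to node02; the function name render_node_config is made up for illustration.

```shell
# render_node_config FILE NAME
# Prints FILE with its node.name line replaced by NAME; FILE itself is not modified.
render_node_config() {
  sed "s/^node\.name:.*/node.name: $2/" "$1"
}

# Possible use on node01 (assumes SSH access to node02 as root):
#   render_node_config /etc/elasticsearch/elasticsearch.yml node02 |
#     ssh node02 'cat > /etc/elasticsearch/elasticsearch.yml'
```

　　Writing through cat on the remote side overwrites the contents of the existing file and leaves its ownership and permissions as the package installed them.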

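　　Note: the only difference between the ES configuration on node02 and the one on node01 is the node name; everything else is identical.

　　Start ES on node01 and node02 and enable it at boot

　　Note: ports 9200 and 9300 are now listening on both node01 and node02. Port 9200 serves client requests, and port 9300 is used for communication between the cluster nodes. At this point the two-node ES cluster is up.

　　Verify: request port 9200 on node01 and node02 and check whether the cluster_name and cluster_uuid in the two responses are the same.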
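
　　The comparison can also be scripted. The sketch below pulls one string field out of the JSON that the root endpoint returns; json_string_field is a hypothetical helper built on a plain sed match, not a real JSON parser, which is enough for the flat response of the root URL.

```shell
# json_string_field NAME
# Reads a JSON document on stdin and prints the first string value of "NAME".
json_string_field() {
  sed -n "s/.*\"$1\" *: *\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Compare the cluster_uuid reported by both nodes:
#   a=$(curl -s http://node01:9200/ | json_string_field cluster_uuid)
#   b=$(curl -s http://node02:9200/ | json_string_field cluster_uuid)
#   [ "$a" = "$b" ] && echo "same cluster: $a"
```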

　　Note: the responses from port 9200 on node01 and node02 carry the same cluster_name and cluster_uuid, which shows that node01 and node02 belong to the same cluster.

　　List the cat APIs that ES provides

[root@node01 ~]# curl http://node02:9200/_cat
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
[root@node01 ~]#

　　View the cluster's node information

[root@node01 ~]# curl http://node02:9200/_cat/nodes
192.168.0.42 19 96 1 0.00 0.05 0.05 mdi - node02
192.168.0.41 15 96 1 0.03 0.04 0.05 mdi * node01
[root@node01 ~]#

　　Note: the node marked with * is the master node.
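
　　To pick the master out of that listing automatically, a short awk filter is enough. This sketch assumes the default column layout shown above, where the master marker is the second-to-last column and the node name (without spaces) is the last; master_node is a made-up function name.

```shell
# master_node
# Reads default `_cat/nodes` output on stdin and prints the name of the node
# whose master column (second to last) is "*".
master_node() {
  awk '$(NF-1) == "*" { print $NF }'
}

printf '%s\n' \
  '192.168.0.42 19 96 1 0.00 0.05 0.05 mdi - node02' \
  '192.168.0.41 15 96 1 0.03 0.04 0.05 mdi * node01' | master_node   # prints node01

# Against the live cluster: curl -s http://node02:9200/_cat/nodes | master_node
```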

　　Check the cluster health

[root@node01 ~]# curl http://node02:9200/_cat/health
1601559464 13:37:44 test-els-cluster green 2 2 0 0 0 0 0 0 - 100.0%
[root@node01 ~]#

　　View the cluster's indices

[root@node01 ~]# curl http://node02:9200/_cat/indices
[root@node01 ~]#

　　Note: the output is empty because the cluster does not hold any data yet.

　　View the cluster's shards

[root@node01 ~]# curl http://node02:9200/_cat/shards
[root@node01 ~]#

　　Fetch document 1 of type test in the myindex index

[root@node01 ~]# curl http://node02:9200/myindex/test/1
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"},"status":404}[root@node01 ~]# 
[root@node01 ~]# curl http://node02:9200/myindex/test/1?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index",
        "resource.type" : "index_expression",
        "resource.id" : "myindex",
        "index_uuid" : "_na_",
        "index" : "myindex"
      }
    ],
    "type" : "index_not_found_exception",
    "reason" : "no such index",
    "resource.type" : "index_expression",
    "resource.id" : "myindex",
    "index_uuid" : "_na_",
    "index" : "myindex"
  },
  "status" : 404
}
[root@node01 ~]#

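　　Note: ?pretty makes ES return easy-to-read, indented JSON. The response above tells us that the specified index was not found.

　　Add a document to a specified index in the ES cluster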

[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1 -d ' 
{"name":"zhangsan","age":18,"gender":"nan"}'
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}[root@node01 ~]#

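　　Note: writing the document to the index fails because the Content-Type header that curl sends by default with -d (application/x-www-form-urlencoded) is not supported. The fix is to set the header explicitly: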

[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1  -H 'content-Type:application/json'  -d '
{"name":"zhangsan","age":18,"gender":"nan"}'
{"_index":"myindex","_type":"test","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}[root@node01 ~]#

　　Verify: fetch document 1 of type test in the myindex index and see whether it returns the data we just wrote.

[root@node01 ~]# curl  http://node01:9200/myindex/test/1?pretty
{
  "_index" : "myindex",
  "_type" : "test",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "zhangsan",
    "age" : 18,
    "gender" : "nan"
  }
}
[root@node01 ~]#

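　　Note: the response contains the document we just wrote.

　　Now view the cluster's index and shard information again

　　Note: the ES cluster now has an index named myindex whose status is green. The shard listing shows 5 primary shards and 5 replica shards, and the primary and the replica of each shard sit on different nodes.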
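
　　The same check can be done without reading the listing by eye. The sketch below counts primaries and replicas in `_cat/shards` output; it assumes the default column order, in which the third column (prirep) is "p" for a primary and "r" for a replica, and shard_summary is a hypothetical helper name.

```shell
# shard_summary
# Reads default `_cat/shards` output on stdin and counts primaries and replicas.
# Assumes the third column (prirep) holds "p" for a primary and "r" for a replica.
shard_summary() {
  awk '$3 == "p" { p++ } $3 == "r" { r++ } END { printf "primaries=%d replicas=%d\n", p, r }'
}

# Against the live cluster:
#   curl -s http://node01:9200/_cat/indices
#   curl -s http://node01:9200/_cat/shards | shard_summary
```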

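　　Search all indices and types

　　Note: jq pretty-prints JSON, much as ?pretty does. The command above searches all indices and all types for documents whose name field is zhangsan. On a hit, the matching documents are printed; on a miss, the response reports zero hits, as shown below.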

[root@node01 ~]# curl http://node01:9200/_search?q=age:19|jq       
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   135  100   135    0     0   2906      0 --:--:-- --:--:-- --:--:--  2934
{
  "took": 37,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
[root@node01 ~]# curl http://node01:9200/_search?q=age:18|jq 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0  10795      0 --:--:-- --:--:-- --:--:-- 11227
{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]#

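　　Note: to search within a specific index, put the index name in the URL path before _search.

　　Note: when there are several indices, * can be used to match them by a common pattern in their names, as shown below.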

[root@node01 ~]# curl http://node01:9200/*/_search?q=age:18|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0   8253      0 --:--:-- --:--:-- --:--:--  8517
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]# curl http://node01:9200/my*/_search?q=age:18|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0   7843      0 --:--:-- --:--:-- --:--:--  7967
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]#

　　Search a specific type in a single specified index
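
　　The search URLs used in this section differ only in their path: nothing before _search for all indices, the index name for one index, and index/type for one type within an index. The sketch below builds all three forms; search_url is a hypothetical helper, and the example values (myindex, test, age:18) are the ones used earlier in this article.

```shell
# search_url BASE QUERY [INDEX [TYPE]]
# Builds a URI-search URL; INDEX and TYPE are optional path segments.
search_url() {
  prefix=""
  if [ -n "${3:-}" ]; then prefix="/$3"; fi
  if [ -n "${4:-}" ]; then prefix="$prefix/$4"; fi
  echo "$1$prefix/_search?q=$2"
}

search_url http://node01:9200 age:18                # http://node01:9200/_search?q=age:18
search_url http://node01:9200 age:18 myindex        # http://node01:9200/myindex/_search?q=age:18
search_url http://node01:9200 age:18 myindex test   # http://node01:9200/myindex/test/_search?q=age:18

# Quoting the URL keeps the shell from treating ? and * as glob characters,
# and -s hides curl's progress meter:
#   curl -s "$(search_url http://node01:9200 age:18 myindex test)" | jq .
```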

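　　Note: these are the common operations against an ES cluster from the command line. In practice we rarely search from the command line and use a web interface instead; the command line is only for testing. The ES cluster is now complete. Next, Logstash can collect data from specified sources and send it to ES, and Kibana's web interface can then present the data stored in ES.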
