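Starting Elasticsearch from Source

Posted by QiaoZhi on 2024-06-13

This article shows how to start ES from source and how to locate the implementation class that handles a given request.

1. Environment setup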

Jdk: 17

Es: 7.17

IDEA: 2024.1

Gradle: 8.7

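  1. Install the JDK and IDEA

  2. Download the ES source (I downloaded the 7.17.8 code from GitHub):
    https://github.com/elastic/elasticsearch or: https://gitee.com/mirrors/elasticsearch

  3. Download Gradle (this step can be skipped)

This simply makes the Gradle wrapper use a local file by default; otherwise the download is slow.

1. Open elasticsearch source\gradle\wrapper\gradle-wrapper.properties: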
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-7.5.1-all.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionSha256Sum=db9c8211ed63f61f60292c69e80d89196f9eb36665e369e7f00ac4cc841c2219
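2. Download https://services.gradle.org/distributions/gradle-7.5.1-all.zip
3. Place gradle-7.5.1-all.zip in elasticsearch\gradle\wrapper
4. Edit gradle-wrapper.properties so that distributionUrl points at the local file:
distributionUrl=gradle-7.5.1-all.zip
  4. Change the global Gradle repository address
    Under USER_HOME/.gradle/, create a file named init.gradle (create it by hand if it does not exist), enter the content below, and save it.
    This switches Gradle's remote repositories to the Aliyun mirrors.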

    allprojects{
        repositories {
            def ALIYUN_REPOSITORY_URL = 'https://maven.aliyun.com/repository/public/'
            def ALIYUN_GRADLE_PLUGIN_URL = 'https://maven.aliyun.com/repository/gradle-plugin/'
            all { ArtifactRepository repo ->
                if(repo instanceof MavenArtifactRepository){
                    def url = repo.url.toString()
                    if (url.startsWith('https://repo1.maven.org/maven2/')) {
                        project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                        remove repo
                    }
                    if (url.startsWith('https://jcenter.bintray.com/')) {
                        project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                        remove repo
                    }
                    if (url.startsWith('https://plugins.gradle.org/m2/')) {
                        project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_GRADLE_PLUGIN_URL."
                        remove repo
                    }
                }
            }
            maven { url ALIYUN_REPOSITORY_URL }
            maven { url ALIYUN_GRADLE_PLUGIN_URL }
        }
    }
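
The wrapper edit described earlier (pointing distributionUrl at the local zip) can also be scripted. The following is a minimal sketch, not part of the original setup: the class name WrapperUrlRewriter and its method are hypothetical, and it only rewrites text in memory, so reading and writing the actual gradle-wrapper.properties file is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: replaces the distributionUrl line in gradle-wrapper.properties text. */
final class WrapperUrlRewriter {

    /** Returns the properties text with distributionUrl set to newUrl; all other lines are kept. */
    static String rewriteDistributionUrl(String propertiesText, String newUrl) {
        List<String> out = new ArrayList<>();
        // limit -1 keeps trailing empty strings, so a final newline survives the round trip
        for (String line : propertiesText.split("\n", -1)) {
            if (line.startsWith("distributionUrl=")) {
                out.add("distributionUrl=" + newUrl);
            } else {
                out.add(line);
            }
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String original = String.join("\n",
            "distributionBase=GRADLE_USER_HOME",
            "distributionPath=wrapper/dists",
            "distributionUrl=https\\://services.gradle.org/distributions/gradle-7.5.1-all.zip",
            "zipStoreBase=GRADLE_USER_HOME",
            "zipStorePath=wrapper/dists");
        System.out.println(rewriteDistributionUrl(original, "gradle-7.5.1-all.zip"));
    }
}
```

The distributionSha256Sum line is left untouched on purpose: the local zip is the same file, so the checksum still applies.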
    

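2. Running in IDEA

1. Environment preparation

  1. Import the source project into IDEA

File -> Open -> select the ES root directory to import it.

  2. In Project Structure, set the project SDK; here I chose the default JDK 17 bundled with IDEA.

  3. Configure the Gradle build environment

In Preferences, search for gradle: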

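2. Building

  1. Compile the source

After the import, IDEA shows a "Load Gradle Project" popup in the lower-right corner. If you missed it, open the Gradle tool window and click Reload manually.

After clicking, expect to wait a while; the build takes time.

There is no need to mark sub-projects as Gradle projects yourself. I did that at first, but Reload All Projects loads the sub-projects automatically.

  2. Build the distribution package

Action: pick the no-jdk-*-tar project that matches your operating system and click its build task to build the Elasticsearch distribution.

Result: the corresponding xxx-tar directory will contain a build directory with the generated files.

Why build: distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT contains many modules. Elasticsearch is modular, so every time code under modules changes you need to rebuild, even if you only added a comment. Otherwise the line numbers will not match when debugging in IDEA.

During the build, a resource failed to download:

The error message was: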
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
> Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
   > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
     Required by:
         project :x-pack:plugin:ml
      > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
         > Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
            > Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
               > Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused

Fix: see https://github.com/elastic/elasticsearch/issues/48350

Edit elasticsearch-7.17.8/x-pack/plugin/ml/build.gradle:

The final download URL:

https://prelert-artifacts.s3.amazonaws.com/maven/org/elasticsearch/ml/ml-cpp/7.17.8-SNAPSHOT/ml-cpp-7.17.8-SNAPSHOT.zip

PS: if the download fails, you may need a proxy, or you can download the file yourself and change build.gradle to go through its localRepo logic.

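3. Starting from source

0. Source overview

The ES Java source is roughly 2.33 million lines, so you can imagine how hard it is to understand all of it.

ES is modular. The server module is the main server-side program. The transport-netty4 module implements Elasticsearch's network communication on top of Netty; the familiar ports 9200 and 9300 are served by it.

The program entry point is: server/src/main/java/org/elasticsearch/bootstrap/Elasticsearch.java

Incoming client requests are handled in the package: server/src/main/java/org/elasticsearch/action

1. File changes

  1. Modify the main class:

In the server project, add the following at the start of org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):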

        String esHome = "/Users/xxx/app/xm/es_source/elasticsearch-7.17.8/distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT"; // base path of the distribution you built
        System.setProperty("es.path.home", esHome); // Elasticsearch home directory
        System.setProperty("es.path.conf", esHome + "/config");  // Elasticsearch config directory
        System.setProperty("log4j2.disable.jmx", "true"); // disable log4j2 JMX monitoring to avoid errors
        System.setProperty("java.security.policy", esHome + "/config/java.policy"); // Java security policy file
  2. Add the following to distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config/elasticsearch.yml:
node.name: node-1 # ES node name
xpack.security.enabled: false # disable the X-Pack security features to simplify testing
ingest.geoip.downloader.enabled: false # turn off GeoIP database updates for now

If a disk watermark problem is reported after startup:

1. Problem:
[node-1] high disk watermark [90%] exceeded on [eo6zdEm8RWWOodoaSMXNXw][node-1][/Users/xxx/Desktop/es_file/es-7.17.8/0/data/nodes/0] free: 18.9gb[8.3%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete

2. Fix: add the following to the same elasticsearch.yml
cluster.routing.allocation.disk.threshold_enabled: false
  3. Create a java.policy file under distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config:
grant {
    permission java.security.AllPermission;
};

  4. In server/src/main/resources/org/elasticsearch/bootstrap/security.policy, delete the codeBase-related entries.
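
The system properties added at the start of main all derive from a single base path, so they can be collected in one small helper. This is only a sketch under that assumption: the class DevSystemProperties and its methods are hypothetical names, not part of Elasticsearch, and the path in main is a placeholder for your own build output directory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that derives the debug-time system properties from the build output path. */
final class DevSystemProperties {

    /** Builds the same four properties the article sets by hand, keyed by property name. */
    static Map<String, String> forHome(String esHome) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("es.path.home", esHome);                                  // home directory
        props.put("es.path.conf", esHome + "/config");                      // config directory
        props.put("log4j2.disable.jmx", "true");                            // avoid log4j2 JMX errors
        props.put("java.security.policy", esHome + "/config/java.policy");  // security policy file
        return props;
    }

    /** Applies every entry with System.setProperty. */
    static void apply(Map<String, String> props) {
        props.forEach(System::setProperty);
    }

    public static void main(String[] args) {
        // Placeholder path: replace with the directory produced by your own build.
        Map<String, String> props = forHome("/path/to/elasticsearch-7.17.8-SNAPSHOT");
        apply(props);
        props.keySet().forEach(key -> System.out.println(key + "=" + System.getProperty(key)));
    }
}
```

Keeping the path in one place makes it harder for es.path.conf and java.security.policy to drift apart when the build directory changes.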

2. Startup

  1. Start without specifying the data and logs directories

Run the main class in the server module: org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[])

You will see the startup logs:

Access port 9200:

xxxx % curl localhost:9200/
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "V3cJUOHbQA2ZqeHZO67JdA",
  "version" : {
    "number" : "7.17.8",
    "build_flavor" : "unknown",
    "build_type" : "unknown",
    "build_hash" : "unknown",
    "build_date" : "unknown",
    "build_snapshot" : true,
    "lucene_version" : "8.11.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
  2. Start with the data and logs directories specified

org.elasticsearch.cli.EnvironmentAwareCommand#execute(org.elasticsearch.cli.Terminal, joptsimple.OptionSet) shows that there are two ways to pass settings to ES:

The first is to set system properties in code, such as es.path.data. Add the following to org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):

        // Set the data directory and the log directory
        System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
        System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs");  // Elasticsearch logs directory

The second is to add program arguments of the form -Epath.logs=xxx:

-Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/logs
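
To make the -Ekey=value convention concrete, here is a small sketch of how such arguments can be turned into a settings map. It is an illustration only and not Elasticsearch's actual parser (the real handling lives in EnvironmentAwareCommand and uses joptsimple); the class name SettingArgsSketch and the choice to reject duplicates are assumptions of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for -Ekey=value program arguments (not the real Elasticsearch code). */
final class SettingArgsSketch {

    /** Collects every -Ekey=value argument into an ordered map; other arguments are ignored. */
    static Map<String, String> parse(String... args) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-E")) {
                continue;
            }
            int eq = arg.indexOf('=');
            if (eq <= 2) { // no '=' at all, or an empty key as in "-E=value"
                throw new IllegalArgumentException("expected -Ekey=value but got: " + arg);
            }
            String key = arg.substring(2, eq);
            String value = arg.substring(eq + 1);
            if (settings.putIfAbsent(key, value) != null) {
                throw new IllegalArgumentException("setting given twice: " + key);
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = parse(
            "-Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data",
            "-Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/logs");
        settings.forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```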

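3. Create and view indices, insert data, and debug

0. How it works

  1. Elasticsearch exposes a RESTful API; in the source this corresponds to the action package of the server project.
  2. Each API is forwarded to its TransportXXXAction implementation class, which runs the actual logic. Each TransportXXXAction must be registered in ActionModule.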
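
The register-then-dispatch idea can be pictured with a toy registry. This is a deliberately simplified model and not the real ActionModule API: the class ActionRegistrySketch, the string route keys, and the lambda handlers are all hypothetical, chosen only to show why an unregistered action can never be reached.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model of "register a handler, then dispatch to it"; not Elasticsearch's ActionModule. */
final class ActionRegistrySketch {

    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    /** Registers a handler under a route key; registering the same key twice is an error. */
    void register(String route, Function<String, String> handler) {
        if (handlers.putIfAbsent(route, handler) != null) {
            throw new IllegalStateException("handler already registered for " + route);
        }
    }

    /** Looks up the handler for the route and runs it; unknown routes fail fast. */
    String dispatch(String route, String body) {
        Function<String, String> handler = handlers.get(route);
        if (handler == null) {
            throw new IllegalArgumentException("no handler registered for " + route);
        }
        return handler.apply(body);
    }

    public static void main(String[] args) {
        ActionRegistrySketch registry = new ActionRegistrySketch();
        // Stand-ins for transport action classes; the returned strings are made up for the demo.
        registry.register("PUT /{index}", body -> "create-index handled: " + body);
        registry.register("DELETE /{index}", body -> "delete-index handled: " + body);
        System.out.println(registry.dispatch("PUT /{index}", "{\"mappings\":{}}"));
    }
}
```

When looking for the class behind a request, the practical consequence is the same as in this model: find where the action is registered, then set the breakpoint in the class it maps to.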

1. Create an index

The corresponding class is: TransportCreateIndexAction

Set a breakpoint:

Call:

curl -X PUT -H 'Content-Type:application/json' -d '{"mappings":{"properties":{"name":{"type":"keyword"},"age":{"type":"long"},"address":{"type":"text","analyzer":"standard"},"location":{"type":"geo_point"},"birth_date":{"type":"date"},"birth_date_value":{"type":"long"},"likes":{"type":"keyword"},"well_person":{"type":"boolean"},"salary":{"type":"integer_range"},"school":{"type":"wildcard"},"feature":{"type":"nested","properties":{"height":{"type":"double"},"weight":{"type":"double"}}}}}}' localhost:9200/qlq_user

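Execution stops at your breakpoint, which confirms that the setup works.

2. View indices

For a fixed URL, you can search the source by its URI path.

Corresponding class: org.elasticsearch.rest.action.cat.RestIndicesAction#doCatRequest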

xxx % curl localhost:9200/_cat/indices
yellow open qlq_user OwnK3cMUT2-L7Rog062oHA 1 1 0 0 226b 226b

3. Add a document

Corresponding method: org.elasticsearch.action.bulk.TransportShardBulkAction#dispatchedShardOperationOnPrimary

Request:

curl -X POST -H 'Content-Type:application/json' -d '{"name":"張三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝陽區","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["讀書","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}' localhost:9200/qlq_user/_doc/

4. Query documents

1. Count documents

Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest

Test:

xxx % curl localhost:9200/qlq_user/_count
{"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}

2. Search documents

Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest

Test:

xxx % curl -X GET -H 'Content-Type:application/json' -d '{"query":{"term":{"likes":{"value":"讀書"}}}}' localhost:9200/qlq_user/_search

{"took":4,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":0.3616575,"hits":[{"_index":"qlq_user","_type":"_doc","_id":"Q38ee48BpAUI2PvOZWk9","_score":0.3616575,"_source":{"name":"張三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝陽區","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["讀書","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}}]}}
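
When stepping through TransportSearchAction repeatedly, it can be handy to fire the same term query from Java instead of curl. The sketch below uses the JDK's built-in java.net.http client (Java 11+). The class name SearchRequestSketch is hypothetical, the JSON is built by plain string concatenation with no escaping, and it sends POST rather than GET, which the _search endpoint also accepts.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sketch: issue the article's term query with the JDK HTTP client. */
final class SearchRequestSketch {

    /** Builds the term query body; field and value are inserted as-is, without JSON escaping. */
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + field + "\":{\"value\":\"" + value + "\"}}}}";
    }

    /** Builds a POST request against <baseUrl>/<index>/_search. */
    static HttpRequest termSearch(String baseUrl, String index, String field, String value) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(termQueryBody(field, value)))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires the node started from source to be listening on localhost:9200.
        HttpRequest request = termSearch("http://localhost:9200", "qlq_user", "likes", "讀書");
        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```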

5. Delete an index

Entry point:

org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction#doExecute

Test:

xxx % curl -X DELETE localhost:9200/qlq_user
{"acknowledged":true}

4. Errors:

  1. Gradle JVM argument error

Error message:

Unrecognized option: --add-exports
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

-----------------------
Check the JVM arguments defined for the gradle process in:
 - gradle.properties in project root directory

Cause: I initially used JDK 8, which is too old to recognize these JVM arguments.

Fix: switch to a newer JDK; I use 17 here.

  2. Error when building the tar
Error:
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
> Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
   > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
     Required by:
         project :x-pack:plugin:ml
      > Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
         > Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
            > Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
               > Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused
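Fix: the same workaround as for this error in section 2 (see https://github.com/elastic/elasticsearch/issues/48350 and the change to x-pack/plugin/ml/build.gradle).

5. Starting from source as a cluster

Start three nodes. With the shell script, the original startup commands looked like this: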

sh elasticsearch -Ehttp.port=9200 -Epath.data=/Users/qiao-zhi/app/software/elk/data/0 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/0 -Enode.roles=data 
sh elasticsearch -Ehttp.port=9201 -Epath.data=/Users/qiao-zhi/app/software/elk/data/1 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/1 -Enode.roles=master 
sh elasticsearch -Ehttp.port=9202 -Epath.data=/Users/qiao-zhi/app/software/elk/data/2 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/2 
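  1. Remove the data and log directory settings from the code
        // Set the data directory and the log directory
//        System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
//        System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs");  // Elasticsearch logs directory
  2. Set the startup arguments for each node (these -E options are program arguments), and allow the run configuration to run in parallel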
-Ehttp.port=9201 -Enode.name=node1 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/1/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/1/log -Enode.roles=master

-Ehttp.port=9200 -Enode.name=node2 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/log -Enode.roles=data

-Ehttp.port=9202 -Enode.name=node3 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/2/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/2/log
  3. Check the cluster after startup
GET /_cat/nodes?v
---
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
127.0.0.1            3          99  35    3.53                  d           -      node2
127.0.0.1            3          99  35    3.53                  cdfhilmrstw -      node3
127.0.0.1            4          99  35    3.53                  m           *      node1
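
The node.role column packs each node's roles into single-letter abbreviations. The sketch below expands them. The letter-to-role table reflects the abbreviations listed in the cat nodes API documentation for 7.x as I understand them, so treat it as an assumption and check it against your version; the class name NodeRoleDecoder is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: expand the node.role abbreviations printed by GET /_cat/nodes. */
final class NodeRoleDecoder {

    // Assumed mapping for 7.x; verify against the cat nodes documentation for your version.
    private static final Map<Character, String> ROLE_NAMES = Map.ofEntries(
        Map.entry('c', "data_cold"),
        Map.entry('d', "data"),
        Map.entry('f', "data_frozen"),
        Map.entry('h', "data_hot"),
        Map.entry('i', "ingest"),
        Map.entry('l', "ml"),
        Map.entry('m', "master"),
        Map.entry('r', "remote_cluster_client"),
        Map.entry('s', "data_content"),
        Map.entry('t', "transform"),
        Map.entry('v', "voting_only"),
        Map.entry('w', "data_warm"));

    /** Returns one role name per letter; "-" (coordinating-only node) yields an empty list. */
    static List<String> decode(String abbreviations) {
        List<String> roles = new ArrayList<>();
        if (abbreviations.equals("-")) {
            return roles;
        }
        for (char letter : abbreviations.toCharArray()) {
            roles.add(ROLE_NAMES.getOrDefault(letter, "unknown(" + letter + ")"));
        }
        return roles;
    }

    public static void main(String[] args) {
        // The three node.role values from the output above.
        for (String roleColumn : List.of("d", "cdfhilmrstw", "m")) {
            System.out.println(roleColumn + " -> " + decode(roleColumn));
        }
    }
}
```

Read this way, the output above matches the arguments: node2 was started with -Enode.roles=data, node1 with -Enode.roles=master, and node3, started without node.roles, carries the full default set.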

References

https://www.iocoder.cn/Elasticsearch/build-debugging-environment/
