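This article walks through how to start Elasticsearch (ES) from source and how to locate the class that implements a given request.
1. Prepare the environment
JDK: 17
ES: 7.17
IDEA: 2024.1
Gradle: 8.7
- Install the JDK and IDEA
- Download the ES source code (I downloaded the 7.17.8 code from GitHub):
https://github.com/elastic/elasticsearch or: https://gitee.com/mirrors/elasticsearch
- Download Gradle (this step can be skipped)
This simply makes the Gradle wrapper use a local file by default; otherwise the download is slow.
1. Open <elasticsearch source>\gradle\wrapper\gradle-wrapper.properties, which contains: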
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-7.5.1-all.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionSha256Sum=db9c8211ed63f61f60292c69e80d89196f9eb36665e369e7f00ac4cc841c2219
2. Download https://services.gradle.org/distributions/gradle-7.5.1-all.zip
3. Place gradle-7.5.1-all.zip in elasticsearch\gradle\wrapper
4. Edit gradle-wrapper.properties so that the URL points at the local file:
distributionUrl=gradle-7.5.1-all.zip
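Because the zip is now downloaded by hand, it can be worth checking it against the distributionSha256Sum value from gradle-wrapper.properties before pointing the wrapper at it. The following is a minimal sketch, assuming the zip path and the expected hash are passed as the two program arguments; the class name WrapperChecksum is made up for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class WrapperChecksum {

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b)); // two hex digits per byte
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to gradle-7.5.1-all.zip (assumed location, adjust as needed)
        // args[1]: the distributionSha256Sum value copied from gradle-wrapper.properties
        Path zip = Path.of(args[0]);
        String expected = args[1];
        // Reads the whole file into memory; fine for a one-off check of a ~150 MB zip
        String actual = sha256Hex(Files.readAllBytes(zip));
        System.out.println(actual.equals(expected) ? "checksum OK" : "checksum MISMATCH: " + actual);
    }
}
```

If the two values differ, the download is probably incomplete and should be repeated before reloading the Gradle project.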
- Change the global Gradle repository address
In USER_HOME/.gradle/, create a file named init.gradle (create it manually if it does not exist), enter the content below, and save it.
The script replaces Gradle's remote repositories with the Aliyun mirrors:
allprojects {
    repositories {
        def ALIYUN_REPOSITORY_URL = 'https://maven.aliyun.com/repository/public/'
        def ALIYUN_GRADLE_PLUGIN_URL = 'https://maven.aliyun.com/repository/gradle-plugin/'
        all { ArtifactRepository repo ->
            if (repo instanceof MavenArtifactRepository) {
                def url = repo.url.toString()
                if (url.startsWith('https://repo1.maven.org/maven2/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                    remove repo
                }
                if (url.startsWith('https://jcenter.bintray.com/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                    remove repo
                }
                if (url.startsWith('https://plugins.gradle.org/m2/')) {
                    project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_GRADLE_PLUGIN_URL."
                    remove repo
                }
            }
        }
        maven { url ALIYUN_REPOSITORY_URL }
        maven { url ALIYUN_GRADLE_PLUGIN_URL }
    }
}
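2. Running in IDEA
1. Environment setup
- Import the source project into IDEA
File -> Open -> select the ES root directory to import it
- In Project Structure, set the project SDK; here I chose the default JDK 17 bundled with IDEA
- Configure the Gradle build environment
Search for "gradle" in Preferences:
2. Start compiling
- Compile the source
After the import, IDEA shows a "Load Gradle Project" popup in the bottom-right corner. If you missed it, open the Gradle tool window and click Reload manually.
After that, expect to wait a while; the build takes time.
There is no need to mark the subprojects as Gradle projects yourself. I did that at first, but reloading all Gradle projects loads the subprojects automatically.
- Build the release package
Action: pick the no-jdk-*-tar project that matches your operating system and run its build task to build the Elasticsearch release package.
When the build finishes, the corresponding xxx-tar directory contains a build directory with the generated files.
Why build: distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT contains many modules. Elasticsearch is modular, so every change to code under modules requires a rebuild, even if you only added a code comment. Otherwise the line numbers will not match when debugging in IDEA.
During the build, a resource failed to download.
The error message was: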
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
Required by:
project :x-pack:plugin:ml
> Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
> Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
> Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
> Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused
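Solution: see https://github.com/elastic/elasticsearch/issues/48350
Modify the file elasticsearch-7.17.8/x-pack/plugin/ml/build.gradle:
The final download URL is:
https://prelert-artifacts.s3.amazonaws.com/maven/org/elasticsearch/ml/ml-cpp/7.17.8-SNAPSHOT/ml-cpp-7.17.8-SNAPSHOT.zip
PS: if the download fails, you may need a proxy. Alternatively, download the zip yourself and modify this build file so that it goes through its localRepo logic.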
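3. Starting from source
0. Source overview
The ES Java source is roughly 2.33 million lines, so you can imagine how hard it is to understand all of it.
ES is modular. The server module is the main server-side program. The transport-netty4 module implements Elasticsearch's network communication on top of Netty; the familiar ports 9200 and 9300 are served by it.
The program entry point is: server/src/main/java/org/elasticsearch/bootstrap/Elasticsearch.java
The code that handles incoming client requests is in the package: server/src/main/java/org/elasticsearch/action
1. File changes
- Modify the main startup class:
In the server project, add the following at the start of org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):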
String esHome = "/Users/xxx/app/xm/es_source/elasticsearch-7.17.8/distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT"; // base path of the package you built
System.setProperty("es.path.home", esHome); // Elasticsearch home (root) directory
System.setProperty("es.path.conf", esHome + "/config"); // Elasticsearch config directory
System.setProperty("log4j2.disable.jmx", "true"); // disable log4j2 JMX monitoring to avoid errors
System.setProperty("java.security.policy", esHome + "/config/java.policy"); // Java security policy file
- Add the following to distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config/elasticsearch.yml:
node.name: node-1 # ES node name
xpack.security.enabled: false # disable X-Pack security authentication to simplify testing
ingest.geoip.downloader.enabled: false # turn off GeoIP database updates for now
If a disk watermark problem is reported after startup:
1. Problem:
[node-1] high disk watermark [90%] exceeded on [eo6zdEm8RWWOodoaSMXNXw][node-1][/Users/xxx/Desktop/es_file/es-7.17.8/0/data/nodes/0] free: 18.9gb[8.3%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete
2. Fix: also add the following line to the elasticsearch.yml file above
cluster.routing.allocation.disk.threshold_enabled: false
- Create a new file named java.policy under distribution/archives/no-jdk-darwin-aarch64-tar/build/install/elasticsearch-7.17.8-SNAPSHOT/config:
grant {
permission java.security.AllPermission;
};
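- In server/src/main/resources/org/elasticsearch/bootstrap/security.policy, delete the codeBase-related entries:
2. Start
- Start without specifying the data and logs directories
Run the main class in the server module: org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[])
You will see the startup log:
Access port 9200: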
xxxx % curl localhost:9200/
{
"name" : "node-1",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "V3cJUOHbQA2ZqeHZO67JdA",
"version" : {
"number" : "7.17.8",
"build_flavor" : "unknown",
"build_type" : "unknown",
"build_hash" : "unknown",
"build_date" : "unknown",
"build_snapshot" : true,
"lucene_version" : "8.11.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
- Start with explicit data and logs directories
org.elasticsearch.cli.EnvironmentAwareCommand#execute(org.elasticsearch.cli.Terminal, joptsimple.OptionSet) shows that there are two ways to pass these settings to ES.
The first is to set system properties such as es.path.data in code. Add to org.elasticsearch.bootstrap.Elasticsearch#main(java.lang.String[]):
// Set the data directory and the logs directory
System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs"); // Elasticsearch logs directory
The second is to add program arguments of the form -Epath.logs=xxx:
-Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/logs
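To make the two mechanisms concrete, here is a simplified sketch of how -Ekey=value arguments and es.-prefixed system properties could be folded into one settings map. It is an illustration only and not the Elasticsearch source: the class SettingsMergeSketch and its methods are invented for this example, system properties are passed in as a plain map so the logic is easy to test, and treating a value supplied both ways as an error is an assumption of the sketch that should be checked against EnvironmentAwareCommand itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SettingsMergeSketch {

    /** Collects every "-Ekey=value" program argument into a settings map. */
    static Map<String, String> parseEArgs(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-E")) {
                continue; // not a setting argument
            }
            int eq = arg.indexOf('=');
            if (eq < 3) { // no '=' at all, or an empty key such as "-E=value"
                throw new IllegalArgumentException("expected -Ekey=value but got [" + arg + "]");
            }
            String key = arg.substring(2, eq);
            String value = arg.substring(eq + 1);
            if (settings.putIfAbsent(key, value) != null) {
                throw new IllegalArgumentException("setting [" + key + "] given twice");
            }
        }
        return settings;
    }

    /** Copies a system property into the settings unless the setting was already given via -E. */
    static void putPropertyIfMissing(Map<String, String> settings, String setting,
                                     Map<String, String> systemProps, String propertyKey) {
        String value = systemProps.get(propertyKey);
        if (value == null) {
            return; // property not set, nothing to merge
        }
        if (settings.containsKey(setting)) {
            // Assumption of this sketch: supplying the same value both ways is rejected
            throw new IllegalArgumentException(
                "duplicate setting [" + setting + "] via -E and system property [" + propertyKey + "]");
        }
        settings.put(setting, value);
    }

    public static void main(String[] args) {
        Map<String, String> settings = parseEArgs(new String[] {"-Epath.data=/tmp/es/data"});
        Map<String, String> props = Map.of("es.path.logs", "/tmp/es/logs");
        putPropertyIfMissing(settings, "path.data", props, "es.path.data");
        putPropertyIfMissing(settings, "path.logs", props, "es.path.logs");
        System.out.println(settings);
    }
}
```

Under that assumption, mixing both mechanisms for the same setting would fail at startup, so it is simplest to pick one of them per setting.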
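3. Debugging index creation, index listing, and document insertion
0. How it works
- Elasticsearch exposes a RESTful API; in the source this corresponds to the action package in the server project
- Each API call is forwarded to the corresponding TransportXXXAction implementation class, which runs the actual logic. Every TransportXXXAction has to be registered in ActionModule.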
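The register-then-dispatch idea behind these two points can be sketched in a few lines of plain Java. This is a toy model, not Elasticsearch code: ActionRegistrySketch, its methods, the literal paths, and the action-name strings are all chosen for illustration, and real handlers work with typed request and response objects rather than strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ActionRegistrySketch {

    // "METHOD path" -> action name (stands in for the REST layer)
    private final Map<String, String> routes = new HashMap<>();
    // action name -> handler (stands in for the TransportXXXAction classes)
    private final Map<String, Function<String, String>> transportActions = new HashMap<>();

    void registerRoute(String method, String path, String actionName) {
        routes.put(method + " " + path, actionName);
    }

    void registerAction(String actionName, Function<String, String> handler) {
        transportActions.put(actionName, handler);
    }

    /** Resolves the route to an action name, then runs the handler registered for it. */
    String dispatch(String method, String path, String body) {
        String actionName = routes.get(method + " " + path);
        if (actionName == null) {
            throw new IllegalArgumentException("no route for " + method + " " + path);
        }
        Function<String, String> handler = transportActions.get(actionName);
        if (handler == null) {
            throw new IllegalStateException("action [" + actionName + "] was never registered");
        }
        return handler.apply(body);
    }

    /** Builds a registry with two illustrative actions. */
    static ActionRegistrySketch demo() {
        ActionRegistrySketch registry = new ActionRegistrySketch();
        registry.registerAction("indices:admin/create", body -> "create-index handled: " + body);
        registry.registerAction("indices:admin/delete", body -> "delete-index handled");
        registry.registerRoute("PUT", "/qlq_user", "indices:admin/create");
        registry.registerRoute("DELETE", "/qlq_user", "indices:admin/delete");
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(demo().dispatch("PUT", "/qlq_user", "{}"));
    }
}
```

The practical consequence for debugging is the same as in the toy model: once you know which action a request maps to, the breakpoint belongs in the handler registered for that action.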
1. Create an index
Corresponding class: TransportCreateIndexAction
Set a breakpoint:
Call:
curl -X PUT -H 'Content-Type:application/json' -d '{"mappings":{"properties":{"name":{"type":"keyword"},"age":{"type":"long"},"address":{"type":"text","analyzer":"standard"},"location":{"type":"geo_point"},"birth_date":{"type":"date"},"birth_date_value":{"type":"long"},"likes":{"type":"keyword"},"well_person":{"type":"boolean"},"salary":{"type":"integer_range"},"school":{"type":"wildcard"},"feature":{"type":"nested","properties":{"height":{"type":"double"},"weight":{"type":"double"}}}}}}' localhost:9200/qlq_user
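Execution stops at your breakpoint, which shows the setup works.
2. List indices
For a fixed URL, you can find the handler by searching the source for the URI path.
Corresponding class: org.elasticsearch.rest.action.cat.RestIndicesAction#doCatRequest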
xxx % curl localhost:9200/_cat/indices
yellow open qlq_user OwnK3cMUT2-L7Rog062oHA 1 1 0 0 226b 226b
3. Add a document
Corresponding method: org.elasticsearch.action.bulk.TransportShardBulkAction#dispatchedShardOperationOnPrimary
Request:
curl -X POST -H 'Content-Type:application/json' -d '{"name":"張三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝陽區","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["讀書","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}' localhost:9200/qlq_user/_doc/
4. Query documents
1. Count documents
Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest
Test:
xxx % curl localhost:9200/qlq_user/_count
{"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}
2. Query data
Entry point: org.elasticsearch.action.search.TransportSearchAction#executeRequest
Test:
xxx % curl -X GET -H 'Content-Type:application/json' -d '{"query":{"term":{"likes":{"value":"讀書"}}}}' localhost:9200/qlq_user/_search
{"took":4,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":0.3616575,"hits":[{"_index":"qlq_user","_type":"_doc","_id":"Q38ee48BpAUI2PvOZWk9","_score":0.3616575,"_source":{"name":"張三","school":"Beijing Xicheng Middle School","age":30,"address":"北京市朝陽區","location":{"lat":39.9075,"lon":116.39723},"birth_date":"1990-01-01","birth_date_value":631120800000,"likes":["讀書","旅行"],"feature":[{"height":175.5,"weight":70.0}],"salary":{"gte":5000,"lte":10000},"well_person":true}}]}}
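When stepping through TransportSearchAction repeatedly, it can be handy to fire the same term query from a small Java program instead of retyping the curl command. The sketch below uses the JDK's built-in java.net.http client (Java 11+). The class and method names are made up, the base URL assumes the local node on port 9200 from this article, and the request is sent as POST, which the _search endpoint also accepts.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SearchRequestSketch {

    /** Builds a term-query search request; field and value are inserted without JSON escaping. */
    static HttpRequest termSearch(String baseUrl, String index, String field, String value) {
        String body = "{\"query\":{\"term\":{\"" + field + "\":{\"value\":\"" + value + "\"}}}}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body)) // body is encoded as UTF-8
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the node started from source is listening on localhost:9200
        HttpRequest request = termSearch("http://localhost:9200", "qlq_user", "likes", "讀書");
        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running main while a breakpoint is set in TransportSearchAction#executeRequest should pause the node in the same place as the curl call above.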
5. Delete an index
Entry point:
org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction#doExecute
Test:
xxx % curl -X DELETE localhost:9200/qlq_user
{"acknowledged":true}
4. Errors:
- Gradle JVM argument error
Error message:
Unrecognized option: --add-exports
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
-----------------------
Check the JVM arguments defined for the gradle process in:
- gradle.properties in project root directory
Cause: I initially used JDK 8, which is too old to recognize these JVM options.
Fix: switch to a newer JDK; I use 17 here.
- Error when building the tar package
Error:
Could not determine the dependencies of task ':x-pack:plugin:ml:bundlePlugin'.
> Could not resolve all task dependencies for configuration ':x-pack:plugin:ml:nativeBundle'.
> Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
Required by:
project :x-pack:plugin:ml
> Could not resolve org.elasticsearch.ml:ml-cpp:7.17.8-SNAPSHOT.
> Could not get resource 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
> Could not HEAD 'https://artifacts-snapshot.elastic.co/ml-cpp/7.17.8-SNAPSHOT/downloads/ml-cpp/ml-cpp-7.17.8-SNAPSHOT.zip'.
> Connect to 127.0.0.1:33210 [/127.0.0.1] failed: Connection refused
Solution: same as the fix described for this error in the release package build step in section 2.
5. Starting a cluster from source
Start three nodes. With the shell script, the original startup commands looked like this:
sh elasticsearch -Ehttp.port=9200 -Epath.data=/Users/qiao-zhi/app/software/elk/data/0 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/0 -Enode.roles=data
sh elasticsearch -Ehttp.port=9201 -Epath.data=/Users/qiao-zhi/app/software/elk/data/1 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/1 -Enode.roles=master
sh elasticsearch -Ehttp.port=9202 -Epath.data=/Users/qiao-zhi/app/software/elk/data/2 -Epath.logs=/Users/qiao-zhi/app/software/elk/log/2
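- Remove the hard-coded data and logs directories from the code
// Set the data directory and the logs directory
// System.setProperty("es.path.data", "/Users/xxx/Desktop/es_file/es-7.17.8/0/data"); // Elasticsearch data directory
// System.setProperty("es.path.logs", "/Users/xxx/Desktop/es_file/es-7.17.8/0/logs"); // Elasticsearch logs directory
- Set the startup arguments, one line per node (these are program arguments, not JVM options; allow the run configuration to run in parallel)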
-Ehttp.port=9201 -Enode.name=node1 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/1/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/1/log -Enode.roles=master
-Ehttp.port=9200 -Enode.name=node2 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/0/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/0/log -Enode.roles=data
-Ehttp.port=9202 -Enode.name=node3 -Epath.data=/Users/xxx/Desktop/es_file/es-7.17.8/2/data -Epath.logs=/Users/xxx/Desktop/es_file/es-7.17.8/2/log
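Since the three argument lines differ only in port, node name, directory index, and roles, they can be generated rather than typed by hand. This is a small sketch with an invented helper (NodeArgsSketch.nodeArgs); the directory layout mirrors the paths used above, and passing null for the roles leaves -Enode.roles out so the node keeps its default roles.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeArgsSketch {

    /** Builds the -E program arguments for one node; roles may be null to keep the defaults. */
    static String nodeArgs(int httpPort, String nodeName, String baseDir, int dirIndex, String roles) {
        List<String> args = new ArrayList<>();
        args.add("-Ehttp.port=" + httpPort);
        args.add("-Enode.name=" + nodeName);
        args.add("-Epath.data=" + baseDir + "/" + dirIndex + "/data");
        args.add("-Epath.logs=" + baseDir + "/" + dirIndex + "/log");
        if (roles != null) {
            args.add("-Enode.roles=" + roles);
        }
        return String.join(" ", args);
    }

    public static void main(String[] args) {
        String base = "/Users/xxx/Desktop/es_file/es-7.17.8"; // same placeholder path as above
        System.out.println(nodeArgs(9201, "node1", base, 1, "master"));
        System.out.println(nodeArgs(9200, "node2", base, 0, "data"));
        System.out.println(nodeArgs(9202, "node3", base, 2, null));
    }
}
```

Each printed line can be pasted into the program arguments field of a separate run configuration.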
- Check the cluster information after startup
GET /_cat/nodes?v
---
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1 3 99 35 3.53 d - node2
127.0.0.1 3 99 35 3.53 cdfhilmrstw - node3
127.0.0.1 4 99 35 3.53 m * node1
References
https://www.iocoder.cn/Elasticsearch/build-debugging-environment/