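ES is a distributed system that scales almost linearly, so capacity testing can be boiled down to one pattern:
1. Build a single-node cluster on a server with the same hardware configuration as the production cluster.
2. Create a test index with 0 replicas and 1 shard, using the same mapping as the production cluster.
3. Load-test it with the same data that is written to the production cluster.
4. Observe indexing performance, or run query requests and observe search and aggregation performance.
5. Keep the load test running for several hours and use a monitoring system to record key figures such as eps, request time, fielddata cache and GC count.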
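When the test is done, use the monitoring data to find the performance inflection point of a single shard, or the threshold that matches your own expectations. That figure is your baseline, and every later expansion plan can be sized in units of it.
Note that the test is per shard. In real use, primary shards and replica shards each do their own indexing and merging on their own nodes, so a replica consumes the same write capacity as a primary. Capacity estimates for a real cluster therefore have to take the replica count into account. For example, if the baseline test gives you 10000 eps per machine, turning on one replica leaves you with only 5000 eps. To keep writing 10000 eps you need twice as many machines.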
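The replica arithmetic above can be written down as a small Python sketch. It is only a back-of-the-envelope model: it assumes every shard copy costs the same write capacity and that the cluster scales perfectly linearly, and the function names are made up for illustration.

```python
import math


def effective_eps(baseline_eps, replicas):
    """Client-visible write rate per node once every document is also
    indexed on `replicas` replica copies (assumes each copy costs the same)."""
    return baseline_eps / (1 + replicas)


def nodes_needed(target_eps, baseline_eps, replicas):
    """Nodes required to sustain target_eps, assuming linear scaling."""
    return math.ceil(target_eps * (1 + replicas) / baseline_eps)


if __name__ == "__main__":
    print(effective_eps(10000, 1))        # 5000.0
    print(nodes_needed(10000, 10000, 1))  # 2
```

Real clusters rarely scale perfectly, so treat the result as a lower bound on the machine count.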
esrally is installed with pip3, so Python 3 has to be installed first.
I. Installing pip3
1. Download and install. Log in to the server:
[server]$ cd ~
[server]$ mkdir tmp
[server]$ cd tmp
[server]$ wget https://www.python.org/ftp/python/3.6.2/Python-3.6.2.tgz
[server]$ tar zxvf Python-3.6.2.tgz
[server]$ cd Python-3.6.2
[server]$ ./configure --prefix=$HOME/opt/python-3.6.2
[server]$ make
[server]$ make install
Edit the profile:
[server]$ vim /etc/profile
Append this line to the end of the file:
export PATH=$HOME/opt/python-3.6.2/bin:$PATH
[server]$ source /etc/profile
Verify:
[server]$ which python3
/root/opt/python-3.6.2/bin/python3
[server]$ python3 --version
Python 3.6.2
Installation is complete.
If git is not installed, install it as well:
yum install -y curl-devel expat-devel gettext-devel openssl-devel zlib-devel gcc perl-ExtUtils-MakeMaker
yum install -y asciidoc xmlto autoconf
yum remove git
The git that yum install git provides is git version 1.7.1, which is too old; 1.9+ is required here, so build a newer one from source.
Pick a release you like from https://github.com/git/git/releases:
wget https://github.com/git/git/archive/v2.22.0.tar.gz
tar -zxvf v2.22.0.tar.gz
cd git-2.22.0
make configure
./configure --prefix=/usr/local/git --with-iconv=/usr/local/libiconv
make all doc
make install install-doc install-html
## Create a symlink
ln -s /usr/local/git/bin/git /usr/bin/git
## Verify
git --version
git version 2.22.0
II. Installing esrally
The official documentation says: Install Python 3.5+ including pip3, git 1.9+ and an appropriate JDK to run Elasticsearch. Be sure that JAVA_HOME points to that JDK. Then run the following command, optionally prefixed by sudo if necessary.
In short: Python 3.5+, git 1.9+, and JAVA_HOME pointing to a JDK.
[server]$ pip3 install esrally   # or, to install into a specific directory: pip3 install esrally --target=/data/secoo_program/esrally
If this first step gives you any trouble, see the documentation: https://esrally.readthedocs.io/en/stable/install.html
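The three prerequisites (Python 3.5+, git 1.9+, JAVA_HOME) can be checked before running pip3. The following is a minimal sketch of such a check, not something Rally ships; the helper names parse_version and check_prerequisites are my own, and it only verifies that JAVA_HOME is set, not that it points to a usable JDK.

```python
import os
import re
import shutil
import subprocess
import sys


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def check_prerequisites():
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info < (3, 5):
        problems.append("Python 3.5+ is required")
    if shutil.which("pip3") is None:
        problems.append("pip3 not found on PATH")
    git = shutil.which("git")
    if git is None:
        problems.append("git not found on PATH")
    else:
        out = subprocess.run([git, "--version"], stdout=subprocess.PIPE,
                             universal_newlines=True).stdout
        if parse_version(out) < (1, 9):
            problems.append("git 1.9+ is required, found: " + out.strip())
    if not os.environ.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

With the stock yum package from the previous section, parse_version("git version 1.7.1") gives (1, 7, 1), which compares lower than (1, 9) and is reported as a problem.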
[server]$ esrally configure   # first-time configuration, checks the environment; detailed configuration docs: https://esrally.readthedocs.io/en/stable/configuration.html
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
Running simple configuration. Run the advanced configuration with:
esrally configure --advanced-config
* Setting up benchmark root directory in /root/.rally/benchmarks
* Setting up benchmark source directory in /root/.rally/benchmarks/src/elasticsearch
Configuration successfully written to /root/.rally/rally.ini. Happy benchmarking!
More info about Rally:
* Type esrally --help
* Read the documentation at https://esrally.readthedocs.io/en/1.2.1/
* Ask a question on the forum at https://discuss.elastic.co/c/elasticsearch/rally
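Configuration is complete.
III. Usage
The official quick demo:
esrally --distribution-version=6.5.3
This downloads Elasticsearch 6.5.3 and then runs Rally's default track, the geonames track. When the run finishes, a summary report is printed on the command line: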
------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|-------------------------------:|---------------------:|----------:|-------:|
| Indexing time | | 28.0997 | min |
| Merge time | | 6.84378 | min |
| Refresh time | | 3.06045 | min |
| Flush time | | 0.106517 | min |
| Merge throttle time | | 1.28193 | min |
| Median CPU usage | | 471.6 | % |
| Total Young Gen GC | | 16.237 | s |
| Total Old Gen GC | | 1.796 | s |
| Index size | | 2.60124 | GB |
| Totally written | | 11.8144 | GB |
| Heap used for segments | | 14.7326 | MB |
| Heap used for doc values | | 0.115917 | MB |
| Heap used for terms | | 13.3203 | MB |
| Heap used for norms | | 0.0734253 | MB |
| Heap used for points | | 0.5793 | MB |
| Heap used for stored fields | | 0.643608 | MB |
| Segment count | | 97 | |
| Min Throughput | index-append | 31925.2 | docs/s |
| Median Throughput | index-append | 39137.5 | docs/s |
| Max Throughput | index-append | 39633.6 | docs/s |
| 50.0th percentile latency | index-append | 872.513 | ms |
| 90.0th percentile latency | index-append | 1457.13 | ms |
| 99.0th percentile latency | index-append | 1874.89 | ms |
| 100th percentile latency | index-append | 2711.71 | ms |
| 50.0th percentile service time | index-append | 872.513 | ms |
| 90.0th percentile service time | index-append | 1457.13 | ms |
| 99.0th percentile service time | index-append | 1874.89 | ms |
| 100th percentile service time | index-append | 2711.71 | ms |
| ... | ... | ... | ... |
| ... | ... | ... | ... |
| Min Throughput | painless_dynamic | 2.53292 | ops/s |
| Median Throughput | painless_dynamic | 2.53813 | ops/s |
| Max Throughput | painless_dynamic | 2.54401 | ops/s |
| 50.0th percentile latency | painless_dynamic | 172208 | ms |
| 90.0th percentile latency | painless_dynamic | 310401 | ms |
| 99.0th percentile latency | painless_dynamic | 341341 | ms |
| 99.9th percentile latency | painless_dynamic | 344404 | ms |
| 100th percentile latency | painless_dynamic | 344754 | ms |
| 50.0th percentile service time | painless_dynamic | 393.02 | ms |
| 90.0th percentile service time | painless_dynamic | 407.579 | ms |
| 99.0th percentile service time | painless_dynamic | 430.806 | ms |
| 99.9th percentile service time | painless_dynamic | 457.352 | ms |
| 100th percentile service time | painless_dynamic | 459.474 | ms |
----------------------------------
[INFO] SUCCESS (took 2634 seconds)
----------------------------------
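To compare runs it helps to pull individual numbers out of this table. Here is a minimal Python sketch that parses the pipe-delimited rows into a dictionary; parse_report is a hypothetical helper of mine, not a Rally API, and it assumes the four-column layout shown above.

```python
def parse_report(text):
    """Parse Rally's pipe-delimited summary table into a dict keyed by
    (metric, task) with (value, unit) tuples. Non-numeric rows are skipped."""
    results = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, unit = cells
        try:
            results[(metric, task)] = (float(value), unit)
        except ValueError:
            continue  # header, separator and "..." rows
    return results


SAMPLE = """
|                         Metric |         Task |   Value |   Unit |
|-------------------------------:|-------------:|--------:|-------:|
|                  Indexing time |              | 28.0997 |    min |
|              Median Throughput | index-append | 39137.5 | docs/s |
|      99.0th percentile latency | index-append | 1874.89 |     ms |
"""

if __name__ == "__main__":
    report = parse_report(SAMPLE)
    print(report[("Median Throughput", "index-append")])  # (39137.5, 'docs/s')
```

The median throughput of the bulk task is the number to hold against the eps baseline described at the top of this post.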
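My requirement here is simple: I need to test an existing cluster, so I use the benchmark-only pipeline. The official sample data can only be fetched with git installed, and the download is painfully slow, so consider generating your own data instead.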
esrally --track=pmc --target-hosts=10.5.5.10:9243,10.5.5.11:9243,10.5.5.12:9243 --pipeline=benchmark-only --client-options="use_ssl:true,verify_certs:true,basic_auth_user:'elastic',basic_auth_password:'changeme'"
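Long command lines like this one are easy to get wrong, so it can be worth assembling them in a script. The sketch below builds the argument list from Python values using only the flags that appear in the command above; benchmark_only_command is a hypothetical helper, and the quoting rules it applies (booleans as true/false, strings in single quotes) simply mirror that command.

```python
import shlex


def benchmark_only_command(track, hosts, client_options):
    """Build the esrally argv for the benchmark-only pipeline.

    client_options maps option names to values; booleans become true/false
    and strings are single-quoted, matching the command shown above."""
    def render(value):
        if isinstance(value, bool):
            return "true" if value else "false"
        return "'%s'" % value

    options = ",".join("%s:%s" % (k, render(v)) for k, v in client_options.items())
    return [
        "esrally",
        "--track=" + track,
        "--target-hosts=" + ",".join(hosts),
        "--pipeline=benchmark-only",
        "--client-options=" + options,
    ]


if __name__ == "__main__":
    argv = benchmark_only_command(
        "pmc",
        ["10.5.5.10:9243", "10.5.5.11:9243", "10.5.5.12:9243"],
        {"use_ssl": True, "verify_certs": True,
         "basic_auth_user": "elastic", "basic_auth_password": "changeme"},
    )
    # Print a shell-safe version of the command for review before running it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The list can be handed to subprocess.run(argv) as is, which avoids a second round of shell quoting.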
IV. Building a test with your own data
Official documentation: esrally.readthedocs.io/en/stable/a…
1. Download the sample data
mkdir -p ~/rally-tracks/tutorial
cd ~/rally-tracks/tutorial
wget http://download.geonames.org/export/dump/allCountries.zip
unzip allCountries.zip
2. Convert the data. ES needs JSON, so the sample data has to be converted first. Name the script toJSON.py:
import json

# (column name, type, include in the output document?)
cols = (("geonameid", "int", True),
        ("name", "string", True),
        ("asciiname", "string", False),
        ("alternatenames", "string", False),
        ("latitude", "double", True),
        ("longitude", "double", True),
        ("feature_class", "string", False),
        ("feature_code", "string", False),
        ("country_code", "string", True),
        ("cc2", "string", False),
        ("admin1_code", "string", False),
        ("admin2_code", "string", False),
        ("admin3_code", "string", False),
        ("admin4_code", "string", False),
        ("population", "long", True),
        ("elevation", "int", False),
        ("dem", "string", False),
        ("timezone", "string", False))


def main():
    # allCountries.txt is tab-separated, one place per line
    with open("allCountries.txt", "rt", encoding="UTF-8") as f:
        for line in f:
            tup = line.strip().split("\t")
            record = {}
            for i in range(len(cols)):
                name, type, include = cols[i]
                if tup[i] != "" and include:
                    if type in ("int", "long"):
                        record[name] = int(tup[i])
                    elif type == "double":
                        record[name] = float(tup[i])
                    elif type == "string":
                        record[name] = tup[i]
            # one JSON document per output line
            print(json.dumps(record, ensure_ascii=False))


if __name__ == "__main__":
    main()
Keep everything in the folder created earlier and convert the data with:
python3 toJSON.py > documents.json
3. Create the mapping file index.json
{
  "settings": {
    "index.number_of_replicas": 0
  },
  "mappings": {
    "docs": {
      "dynamic": "strict",
      "properties": {
        "geonameid": {
          "type": "long"
        },
        "name": {
          "type": "text"
        },
        "latitude": {
          "type": "double"
        },
        "longitude": {
          "type": "double"
        },
        "country_code": {
          "type": "text"
        },
        "population": {
          "type": "long"
        }
      }
    }
  }
}
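The mapping above nests its properties under a mapping type (docs), which only Elasticsearch versions before 7.0.0 accept. For 7.0.0 or later that level has to go. A minimal Python sketch of the transformation follows; remove_mapping_type is a hypothetical helper of mine, and it assumes the body has at most one mapping type, as index.json does here.

```python
import copy
import json


def remove_mapping_type(index_body):
    """Return a copy of an index body whose "mappings" no longer has the
    single type level (e.g. "docs"), as Elasticsearch 7.0.0+ expects."""
    body = copy.deepcopy(index_body)
    mappings = body.get("mappings", {})
    # A typed mapping has exactly one key (the type name) and no top-level
    # "properties"; anything else is left untouched.
    if len(mappings) == 1 and "properties" not in mappings:
        typed = next(iter(mappings.values()))
        if isinstance(typed, dict) and "properties" in typed:
            body["mappings"] = typed
    return body


if __name__ == "__main__":
    typed_body = {
        "settings": {"index.number_of_replicas": 0},
        "mappings": {
            "docs": {
                "dynamic": "strict",
                "properties": {"name": {"type": "text"}},
            }
        },
    }
    print(json.dumps(remove_mapping_type(typed_body), indent=2))
```

Running the function on an already typeless body returns it unchanged, so it is safe to apply twice.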
This tutorial assumes that you want to benchmark a version of Elasticsearch prior to 7.0.0. If you want to benchmark Elasticsearch 7.0.0 or later you need to remove the mapping type above.
4. Create track.json
{
  "version": 2,
  "description": "Tutorial benchmark for Rally",
  "indices": [
    {
      "name": "geonames",
      "body": "index.json",
      "types": [ "docs" ]
    }
  ],
  "corpora": [
    {
      "name": "rally-tutorial",
      "documents": [
        {
          "source-file": "documents.json",
          "document-count": 11658903,
          "uncompressed-bytes": 1544799789
        }
      ]
    }
  ],
  "schedule": [
    {
      "operation": {
        "operation-type": "delete-index"
      }
    },
    {
      "operation": {
        "operation-type": "create-index"
      }
    },
    {
      "operation": {
        "operation-type": "cluster-health",
        "request-params": {
          "wait_for_status": "green"
        }
      }
    },
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": 5000
      },
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "operation": {
        "operation-type": "force-merge"
      }
    },
    {
      "operation": {
        "name": "query-match-all",
        "operation-type": "search",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      },
      "clients": 8,
      "warmup-iterations": 1000,
      "iterations": 1000,
      "target-throughput": 100
    }
  ]
}
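5. Verify the file
Document count: wc -l documents.json
Size in bytes: stat -f "%z" documents.json (BSD/macOS syntax; with GNU coreutils on Linux use stat -c %s documents.json)
Note: when you run your own track, track.json declares the document count and size of the data ("document-count": 11658903, "uncompressed-bytes": 1544799789). If these do not match the actual file, the run fails; just change them to the real values.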
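Both figures can also be measured and written into the track from Python, which avoids the difference between the BSD and GNU flavours of stat. This is a minimal sketch with hypothetical helper names (corpus_stats, sync_track); it assumes documents.json holds one JSON document per line, as produced by toJSON.py.

```python
import json
import os


def corpus_stats(path):
    """Return (document_count, uncompressed_bytes) for a JSON-lines file.
    Lines are counted, so a last line without a trailing newline still counts."""
    with open(path, "rb") as f:
        document_count = sum(1 for _ in f)
    return document_count, os.path.getsize(path)


def sync_track(track, source_file, document_count, uncompressed_bytes):
    """Write the measured figures into every matching corpus entry of a
    track dict (the structure of track.json) and return the track."""
    for corpus in track.get("corpora", []):
        for doc in corpus.get("documents", []):
            if doc.get("source-file") == source_file:
                doc["document-count"] = document_count
                doc["uncompressed-bytes"] = uncompressed_bytes
    return track


if __name__ == "__main__":
    # Only touch the files when run inside the track folder.
    if os.path.exists("documents.json") and os.path.exists("track.json"):
        count, size = corpus_stats("documents.json")
        with open("track.json", encoding="utf-8") as f:
            track = json.load(f)
        sync_track(track, "documents.json", count, size)
        with open("track.json", "w", encoding="utf-8") as f:
            json.dump(track, f, indent=2)
        print(count, size)
```

Run it from the track folder after every regeneration of documents.json so the two numbers never drift apart.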
6. Run your own track
First list it with esrally list tracks --track-path=~/rally-tracks/tutorial. The path is the folder you created earlier; all of the steps above were carried out inside it.
dm@io:~ $ esrally list tracks --track-path=~/rally-tracks/tutorial
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
Available tracks:
Name        Description                    Documents    Compressed Size  Uncompressed Size
----------  -----------------------------  -----------  ---------------  -----------------
tutorial    Tutorial benchmark for Rally   11658903     N/A              1.4 GB
Run it:
Congratulations, you have created your first track! You can test it with
esrally --distribution-version=6.4.0 --track-path=~/rally-tracks/tutorial
Run the test against an existing cluster:
esrally --track-path=/data/secoo_program/esrally/tutorial/ --pipeline=benchmark-only --target-hosts=192.168.41.4:9200,192.168.41.5:9200,192.168.41.6:9200,192.168.41.7:9200,192.168.41.8:9200,192.168.41.9:9200 --client-options="use_ssl:false,verify_certs:true,basic_auth_user:'elastic',basic_auth_password:'fcj5cU1Oh3YUcU3NL6vw'" --offline --report-file=/tmp/logs/report.md
Results