TiDB v6.0 vs TiDB v5.1.2: Comparison Test of Accelerated Leader Balancing and Faster Business Recovery After a TiKV Node Restart

Published by TiDB Community Tech Portal on 2022-05-26

1. Objective

Compare TiDB v5.1.2 and TiDB v6.0.0 on how quickly leaders rebalance after a TiKV node restart, and on the resulting improvement in business recovery speed.

2. Hardware Configuration

Role      CPU / Memory / Disk              Count
TiDB&PD   16 cores / 16 GB / 200 GB SSD    3
TiKV      16 cores / 32 GB / 500 GB SSD    3
Monitor   16 cores / 16 GB / 50 GB SSD     1

3. Topology File Configuration

TiDB v5.1.2 topology file parameters

server_configs:
  pd:
    replication.enable-placement-rules: true
  tikv:
    server.grpc-concurrency: 8
    server.enable-request-batch: false
    storage.scheduler-worker-pool-size: 8
    raftstore.store-pool-size: 5
    raftstore.apply-pool-size: 5
    rocksdb.max-background-jobs: 12
    raftdb.max-background-jobs: 12
    rocksdb.defaultcf.compression-per-level: ["no","no","zstd","zstd","zstd","zstd","zstd"]
    raftdb.defaultcf.compression-per-level: ["no","no","zstd","zstd","zstd","zstd","zstd"]
    rocksdb.defaultcf.block-cache-size: 12GB
    raftdb.defaultcf.block-cache-size: 2GB
    rocksdb.writecf.block-cache-size: 6GB
    readpool.unified.min-thread-count: 8
    readpool.unified.max-thread-count: 16
    readpool.storage.normal-concurrency: 12
    raftdb.allow-concurrent-memtable-write: true
    pessimistic-txn.pipelined: true
  tidb:
    prepared-plan-cache.enabled: true
    tikv-client.max-batch-wait-time: 2000000

TiDB v6.0.0 topology file parameters

The only difference from the TiDB v5.1.2 topology file is the additional storage.reserve-space: 0MB setting, which can be ignored for this test (a quick way to verify it on the running cluster is sketched after the config block below).

server_configs:
  pd:
    replication.enable-placement-rules: true
  tikv:
    server.grpc-concurrency: 8
    server.enable-request-batch: false
    storage.scheduler-worker-pool-size: 8
    raftstore.store-pool-size: 5
    raftstore.apply-pool-size: 5
    rocksdb.max-background-jobs: 12
    raftdb.max-background-jobs: 12
    rocksdb.defaultcf.compression-per-level: ["no","no","zstd","zstd","zstd","zstd","zstd"]
    raftdb.defaultcf.compression-per-level: ["no","no","zstd","zstd","zstd","zstd","zstd"]
    rocksdb.defaultcf.block-cache-size: 12GB
    raftdb.defaultcf.block-cache-size: 2GB
    rocksdb.writecf.block-cache-size: 6GB
    readpool.unified.min-thread-count: 8
    readpool.unified.max-thread-count: 16
    readpool.storage.normal-concurrency: 12
    raftdb.allow-concurrent-memtable-write: true
    pessimistic-txn.pipelined: true
    storage.reserve-space: 0MB
  tidb:
    prepared-plan-cache.enabled: true
    tikv-client.max-batch-wait-time: 2000000
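
Once the v6.0.0 cluster is up (section 4), the differing parameter can be double-checked from TiDB. A minimal sketch, assuming the mysql client is installed and $host/$port point at the TiDB server:

# Verify the one parameter that differs between the two topologies.
# $host and $port are placeholders for the TiDB server address.
mysql -h "$host" -P "$port" -u root -p \
  -e "SHOW CONFIG WHERE type='tikv' AND name='storage.reserve-space';"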

4. Deploying TiDB v5.1.2 and TiDB v6.0.0 with TiUP

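A rough sketch of the TiUP commands used to bring up the two clusters; the cluster names, topology file names, and deploy user below are illustrative assumptions, not taken from the original post:

# Deploy and start the v5.1.2 cluster.
tiup cluster deploy tidb-v512 v5.1.2 ./topology-v5.yaml --user tidb -p
tiup cluster start tidb-v512

# Deploy and start the v6.0.0 cluster with the second topology file.
tiup cluster deploy tidb-v600 v6.0.0 ./topology-v6.yaml --user tidb -p
tiup cluster start tidb-v600

# Confirm all components are up.
tiup cluster display tidb-v512
tiup cluster display tidb-v600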

5. Method for Testing Leader Balancing Time After a TiKV Node Restart

Insert different data volumes (1 million, 4 million, 7 million, and 10 million rows respectively) into the TiDB v5.1.2 and TiDB v6.0.0 clusters, then check how long leader balancing takes after a TiKV node restart.

sysbench oltp_common \
    --threads=16 \
    --rand-type=uniform \
    --db-driver=mysql \
    --mysql-db=sbtest \
    --mysql-host=$host \
    --mysql-port=$port \
    --mysql-user=root \
    --mysql-password=password \
    prepare --tables=16 --table-size=10000000
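
The command above shows the 10-million-row case. A hypothetical wrapper for preparing the four data volumes in turn is sketched below; it assumes the volumes map to --table-size and that the previous run's tables are cleaned up first, neither of which is stated in the original post:

# Hypothetical wrapper (not in the original post): prepare each data volume,
# run the restart test for that volume, then move on to the next one.
for size in 1000000 4000000 7000000 10000000; do
  sysbench oltp_common \
      --threads=16 --rand-type=uniform --db-driver=mysql \
      --mysql-db=sbtest --mysql-host=$host --mysql-port=$port \
      --mysql-user=root --mysql-password=password \
      cleanup --tables=16
  sysbench oltp_common \
      --threads=16 --rand-type=uniform --db-driver=mysql \
      --mysql-db=sbtest --mysql-host=$host --mysql-port=$port \
      --mysql-user=root --mysql-password=password \
      prepare --tables=16 --table-size=$size
  # ... run the restart test described in the rest of this section ...
done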

Query the table row counts as follows:

select
(select count(1) from sbtest1)  "sbtest1",
(select count(1) from sbtest2)  "sbtest2",
(select count(1) from sbtest3)  "sbtest3",
(select count(1) from sbtest4)  "sbtest4",
(select count(1) from sbtest5)  "sbtest5",
(select count(1) from sbtest6)  "sbtest6",
(select count(1) from sbtest7)  "sbtest7",
(select count(1) from sbtest8)  "sbtest8",
(select count(1) from sbtest9)  "sbtest9",
(select count(1) from sbtest10)  "sbtest10",
(select count(1) from sbtest11)  "sbtest11",
(select count(1) from sbtest12)  "sbtest12",
(select count(1) from sbtest13)  "sbtest13",
(select count(1) from sbtest14)  "sbtest14",
(select count(1) from sbtest15)  "sbtest15",
(select count(1) from sbtest16)  "sbtest16"
FROM  dual

After the inserts complete, check the PD -> Statistics - balance -> Store leader count panel in Grafana. Once the leader counts are even across the TiKV nodes, restart one of the TiKV nodes and use the Store leader count chart to measure how long the leaders take to rebalance, as shown below:

(Grafana screenshot: Store leader count panel under PD -> Statistics - balance)

Note: the restart can be simulated with systemctl stop tikv-20160.service followed by systemctl start tikv-20160.service.
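
Besides the Grafana panel, per-store leader counts can also be polled from the command line while the restarted node catches up. A minimal sketch, assuming pd-ctl is invoked through tiup ctl and the PD address is 127.0.0.1:2379 (both are assumptions):

# Print address and leader_count for every store every 5 seconds;
# leader balancing is done when the counts level out again.
watch -n 5 'tiup ctl:v6.0.0 pd -u http://127.0.0.1:2379 store | grep -E "\"address\"|\"leader_count\""'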

6. Test Results

Comparison data chart:

(Screenshot: table of leader balancing times for TiDB v5.1.2 and TiDB v6.0.0 at each data volume)

The comparison data in the table shows:

1. Overall, leader balancing after a TiKV node restart is indeed faster in TiDB v6.0, close to 30 s faster than in TiDB v5.1.2.

2. Regardless of the data volume, TiDB v6.0 finished leader balancing in roughly 30 s after the TiKV node restart.

3. In TiDB v6.0, a small number of leader adjustments can still occur after balancing has completed, but this is rare.

4. When a TiKV node is only shut down (not restarted), the leader balancing time in TiDB v6.0 is essentially the same as in TiDB v5.1.2.

The above comparison of leader balancing acceleration (and thus faster business recovery) after a TiKV node restart between TiDB v5.1.2 and TiDB v6.0.0 was done without modifying the balance-leader-scheduler policy, so there is already an improvement with the defaults. To get a larger acceleration effect, proceed as follows:

1. Adjust the cluster parameters through PD Control.

2. About scheduler config balance-leader-scheduler

This command is used to view and control the balance-leader-scheduler policy.

Starting from TiDB v6.0.0, PD introduces a Batch parameter for balance-leader-scheduler, which controls how fast balance-leader executes its tasks. You can enable this by modifying the balance-leader batch configuration item via pd-ctl.

Before v6.0.0, PD did not have this configuration (equivalent to balance-leader batch=1). In v6.0.0 and later, the default value of balance-leader batch is 4. To set this item to a value greater than 4, you also need to increase scheduler-max-waiting-operator (default 5). Only after increasing both items will you get the expected acceleration.

>> scheduler config balance-leader-scheduler set batch 3  // Set the number of operators the balance-leader scheduler can execute in one batch to 3
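
For example, to push past the default batch of 4, both items can be raised together; the values 8 and 10 below are illustrative, not recommendations from the post:

# Raise the balance-leader batch size above its v6.0.0 default of 4.
tiup ctl:v6.0.0 pd -u http://127.0.0.1:2379 scheduler config balance-leader-scheduler set batch 8

# Also raise scheduler-max-waiting-operator (default 5) so the larger
# batch can actually be scheduled, as required by the docs.
tiup ctl:v6.0.0 pd -u http://127.0.0.1:2379 config set scheduler-max-waiting-operator 10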

Reference: docs.pingcap.com/zh/tidb/v6.0/pd-c...

Original article: tidb.net/blog/0cbf5031  Original author: @ngvf

This work is licensed under the CC license; reposts must credit the author and link back to this article.
