MySQL優化--IO排程演算法優化

懂點IT的耿小廚發表於2020-08-24

之前已經在微信公眾號分享了資料庫優化的方法,連結為https://mp.weixin.qq.com/s/6Atzk9UKPJRxxAs0nsKBXg 。 其中作業系統部分介紹了IO排程演算法的優化,本文將通過壓力測試的方式來對比不同的排程演算法下磁碟IO的表現。

1 準備工作

1.1  安裝sysbench

本次採用sysbench進行壓測,先安裝sysbench,步驟如下:

curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.rpm.sh | sudo bash
yum -y install sysbench
sysbench --version

1.2  準備測試檔案

生成後續需要使用的測試檔案,block大小為16k(MySQL DBA都懂的,哈哈),建立4個檔案,合計20G

[root@mha1 ~]# sysbench fileio --file-num=4 --file-block-size=16384 --file-total-size=20G prepare
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

4 files, 5242880Kb each, 20480Mb total
Creating files for the test...
Extra file open flags: (none)
Creating file test_file.0
Creating file test_file.1
Creating file test_file.2
Creating file test_file.3
21474836480 bytes written in 47.94 seconds (427.24 MiB/sec).

1.3 準備測試表

因為也要進行資料庫讀寫方面的測試,因此需要先建立相關表及資料

[root@mha1 ~]# sysbench --db-driver=mysql --time=300 --threads=10 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=20 --table_size=1000000 oltp_insert --db-ps-mode=disable prepare
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest3'...
Creating table 'sbtest2'...
Creating table 'sbtest5'...
Creating table 'sbtest6'...
Creating table 'sbtest8'...
Creating table 'sbtest7'...
Creating table 'sbtest4'...
Creating table 'sbtest1'...
Creating table 'sbtest10'...
Creating table 'sbtest9'...
Inserting 1000000 records into 'sbtest4'
Inserting 1000000 records into 'sbtest7'
Inserting 1000000 records into 'sbtest5'
Inserting 1000000 records into 'sbtest3'
Inserting 1000000 records into 'sbtest1'
Inserting 1000000 records into 'sbtest2'
Inserting 1000000 records into 'sbtest8'
Inserting 1000000 records into 'sbtest6'
Inserting 1000000 records into 'sbtest10'
Inserting 1000000 records into 'sbtest9'
Creating a secondary index on 'sbtest7'...
Creating a secondary index on 'sbtest10'...
Creating a secondary index on 'sbtest8'...
Creating a secondary index on 'sbtest5'...
Creating a secondary index on 'sbtest2'...
Creating a secondary index on 'sbtest9'...
Creating a secondary index on 'sbtest1'...
Creating table 'sbtest17'...
Creating a secondary index on 'sbtest3'...
Inserting 1000000 records into 'sbtest17'
Creating a secondary index on 'sbtest4'...
Creating a secondary index on 'sbtest6'...
Creating table 'sbtest20'...
Inserting 1000000 records into 'sbtest20'
Creating table 'sbtest18'...
Inserting 1000000 records into 'sbtest18'
Creating table 'sbtest15'...
Inserting 1000000 records into 'sbtest15'
Creating table 'sbtest19'...
Inserting 1000000 records into 'sbtest19'
Creating table 'sbtest14'...
Inserting 1000000 records into 'sbtest14'
Creating table 'sbtest11'...
Inserting 1000000 records into 'sbtest11'
Creating table 'sbtest13'...
Creating table 'sbtest12'...
Inserting 1000000 records into 'sbtest13'
Inserting 1000000 records into 'sbtest12'
Creating table 'sbtest16'...
Inserting 1000000 records into 'sbtest16'
Creating a secondary index on 'sbtest17'...
Creating a secondary index on 'sbtest20'...
Creating a secondary index on 'sbtest18'...
Creating a secondary index on 'sbtest19'...
Creating a secondary index on 'sbtest15'...
Creating a secondary index on 'sbtest11'...
Creating a secondary index on 'sbtest12'...
Creating a secondary index on 'sbtest13'...
Creating a secondary index on 'sbtest14'...
Creating a secondary index on 'sbtest16'...

2    檢視支援的排程演算法

本次磁碟為SSD硬碟,作業系統版本文Centos7.8 。下面將排程演算法修改為三種不同的值來進行隨機讀與隨機寫的壓力測試

本系統為Centos7.8,需要檢視支援的IO排程演算法,然後再進行修改測試。

[root@mha1 ~]# dmesg | grep -i scheduler
[    4.885816] io scheduler noop registered
[    4.885820] io scheduler deadline registered (default)
[    4.885867] io scheduler cfq registered
[    4.885870] io scheduler mq-deadline registered
[    4.885872] io scheduler kyber registered

可見,再本系統中,預設的排程演算法為 deadline。

也可以通過如下命令檢視當前的排程演算法,其中中括號裡代表當前使用的排程演算法。

[root@mha1 ~]# cat /sys/block/sda/queue/scheduler 
noop [deadline] cfq 

3   deadline演算法

Deadline在機械盤的情況下對資料庫環境(ORACLE RAC,MySQL等)是最好的選擇。下面將進行隨機寫與隨機讀的壓力測試

3.1  隨機寫

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndwr \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      0.00
    writes/s:                     6935.37
    fsyncs/s:                     0.53

Throughput:
    read, MiB/s:                  0.00
    written, MiB/s:               108.37

General statistics:
    total time:                          180.0138s
    total number of events:              1248484

Latency (ms):
         min:                                    0.10
         avg:                                    3.46
         max:                                  107.39
         95th percentile:                       14.73
         sum:                              4317610.93

Threads fairness:
    events (avg/stddev):           52020.1667/426.95
    execution time (avg/stddev):   179.9005/0.01

隨機寫的iops為6935.37,磁碟寫入速度是108.37MiB/s

3.2 隨機讀

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndrd \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random read test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      7956.88
    writes/s:                     0.00
    fsyncs/s:                     0.00

Throughput:
    read, MiB/s:                  124.33
    written, MiB/s:               0.00

General statistics:
    total time:                          180.0075s
    total number of events:              1432313

Latency (ms):
         min:                                    0.10
         avg:                                    3.01
         max:                                  322.24
         95th percentile:                        5.47
         sum:                              4309094.67

Threads fairness:
    events (avg/stddev):           59679.7083/2688.56
    execution time (avg/stddev):   179.5456/0.18

隨機讀的iops為7956.88,磁碟讀取速度是124.33MiB/s

3.3   測試資料庫寫

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run

SQL statistics:
    queries performed:
        read:                            0
        write:                           589934
        other:                           294968
        total:                           884902
    transactions:                        147483 (491.43 per sec.)
    queries:                             884902 (2948.62 per sec.)
    ignored errors:                      2      (0.01 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.1050s
    total number of events:              147483

Latency (ms):
         min:                                    2.58
         avg:                                   16.27
         max:                                 2608.34
         95th percentile:                       35.59
         sum:                              2399415.58

Threads fairness:
    events (avg/stddev):           18435.3750/90.33
    execution time (avg/stddev):   299.9269/0.04

可見,隨機寫入的TPS為491.43 ,查詢次數為2948.62

3.4  測試資料庫讀

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run

SQL statistics:
    queries performed:
        read:                            1651692
        write:                           0
        other:                           235956
        total:                           1887648
    transactions:                        117978 (393.13 per sec.)
    queries:                             1887648 (6290.13 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.0949s
    total number of events:              117978

Latency (ms):
         min:                                    3.08
         avg:                                   20.34
         max:                                  170.48
         95th percentile:                       29.19
         sum:                              2399636.31

Threads fairness:
    events (avg/stddev):           14747.2500/1513.84
    execution time (avg/stddev):   299.9545/0.04

可見,隨機讀時的TPS為393.13 ,查詢次數為6290.13

4  noop演算法

4.1  修改為noop演算法

noop稱為電梯排程演算法,是基於FIFO佇列實現的。所有的請求都是先進先出的,因為SSD的隨機讀、隨機寫速度快,因此該演算法適合SSD硬碟。

[root@mha1 ~]# echo 'noop' >/sys/block/sda/queue/scheduler
[root@mha1 ~]# cat /sys/block/sda/queue/scheduler 
[noop] deadline cfq 

4.2  隨機寫

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndwr \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      0.00
    writes/s:                     7057.60
    fsyncs/s:                     0.53

Throughput:
    read, MiB/s:                  0.00
    written, MiB/s:               110.27

General statistics:
    total time:                          180.0136s
    total number of events:              1270481

Latency (ms):
         min:                                    0.10
         avg:                                    3.40
         max:                                  240.39
         95th percentile:                       14.46
         sum:                              4317435.99

Threads fairness:
    events (avg/stddev):           52936.7083/487.57
    execution time (avg/stddev):   179.8932/0.02

隨機寫的iops為7057.60,磁碟寫入速度是110.27MiB/s

4.3  隨機讀

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndrd \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random read test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      8399.89
    writes/s:                     0.00
    fsyncs/s:                     0.00

Throughput:
    read, MiB/s:                  131.25
    written, MiB/s:               0.00

General statistics:
    total time:                          180.0100s
    total number of events:              1512081

Latency (ms):
         min:                                    0.10
         avg:                                    2.85
         max:                                  315.77
         95th percentile:                        5.00
         sum:                              4312384.33

Threads fairness:
    events (avg/stddev):           63003.3750/10086.77
    execution time (avg/stddev):   179.6827/0.12

隨機讀的iops為8399.89,磁碟讀取速度是131.25MiB/s

4.4  資料庫寫入

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run

SQL statistics:
    queries performed:
        read:                            0
        write:                           653457
        other:                           326730
        total:                           980187
    transactions:                        163364 (544.38 per sec.)
    queries:                             980187 (3266.28 per sec.)
    ignored errors:                      2      (0.01 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.0903s
    total number of events:              163364

Latency (ms):
         min:                                    2.62
         avg:                                   14.69
         max:                                  220.12
         95th percentile:                       32.53
         sum:                              2399040.57

Threads fairness:
    events (avg/stddev):           20420.5000/112.69
    execution time (avg/stddev):   299.8801/0.04

可見,隨機寫入的TPS為 544.38 ,查詢次數3266.28 

4.5  資料庫只讀

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run
SQL statistics:
    queries performed:
        read:                            1596364
        write:                           0
        other:                           228052
        total:                           1824416
    transactions:                        114026 (379.97 per sec.)
    queries:                             1824416 (6079.59 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.0869s
    total number of events:              114026

Latency (ms):
         min:                                    3.08
         avg:                                   21.04
         max:                                  321.03
         95th percentile:                       31.37
         sum:                              2399600.56

Threads fairness:
    events (avg/stddev):           14253.2500/1475.71
    execution time (avg/stddev):   299.9501/0.02

 可見,只讀時的TPS為 379.97,查詢次數為6079.59

5  cfq演算法

 5.1 修改為cfq演算法

cfq稱為絕對公平排程演算法,為每個程式及執行緒單獨建立一個佇列來管理IO請求,起到每個程式和執行緒均勻分佈IO的效果。此演算法適用於通用伺服器,centos6中為預設的IO排程演算法。

[root@mha1 ~]# echo 'cfq' >/sys/block/sda/queue/scheduler 
[root@mha1 ~]# cat /sys/block/sda/queue/scheduler 
noop deadline [cfq] 

5.2 隨機寫

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndwr \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random write test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      0.00
    writes/s:                     6614.37
    fsyncs/s:                     0.53

Throughput:
    read, MiB/s:                  0.00
    written, MiB/s:               103.35

General statistics:
    total time:                          180.0118s
    total number of events:              1190677

Latency (ms):
         min:                                    0.10
         avg:                                    3.63
         max:                                  348.78
         95th percentile:                       15.27
         sum:                              4317092.54

Threads fairness:
    events (avg/stddev):           49611.5417/517.80
    execution time (avg/stddev):   179.8789/0.03

隨機寫的iops為6614.37,磁碟寫入速度是103.35MiB/s

5.3 隨機讀

[root@mha1 ~]# sysbench fileio \
>  --time=180 \
>  --threads=24 \
>  --file-total-size=20G \
>  --file-test-mode=rndrd \
>  --file-num=4 \
>  --file-extra-flags=direct \
>  --file-fsync-freq=0 \
>  --file-block-size=16384 \
>  run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 24
Initializing random number generator from current time


Extra file open flags: directio
4 files, 5GiB each
20GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random read test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      7481.39
    writes/s:                     0.00
    fsyncs/s:                     0.00

Throughput:
    read, MiB/s:                  116.90
    written, MiB/s:               0.00

General statistics:
    total time:                          180.0086s
    total number of events:              1346731

Latency (ms):
         min:                                    0.10
         avg:                                    3.20
         max:                                  374.49
         95th percentile:                        5.77
         sum:                              4312382.07

Threads fairness:
    events (avg/stddev):           56113.7917/3058.00
    execution time (avg/stddev):   179.6826/0.17

隨機讀的iops為7481.39,磁碟讀取速度是116.90MiB/s

 5.4 資料庫寫

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run
SQL statistics:
    queries performed:
        read:                            0
        write:                           598765
        other:                           299384
        total:                           898149
    transactions:                        149691 (498.54 per sec.)
    queries:                             898149 (2991.25 per sec.)
    ignored errors:                      2      (0.01 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.2552s
    total number of events:              149691

Latency (ms):
         min:                                    2.55
         avg:                                   16.02
         max:                                  779.62
         95th percentile:                       35.59
         sum:                              2397311.08

Threads fairness:
    events (avg/stddev):           18711.3750/132.24
    execution time (avg/stddev):   299.6639/0.38

可見,隨機寫入的TPS為498.54 ,查詢次數為2991.25 

5.5  資料庫讀

sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run
SQL statistics:
    queries performed:
        read:                            1448342
        write:                           0
        other:                           206906
        total:                           1655248
    transactions:                        103453 (344.66 per sec.)
    queries:                             1655248 (5514.58 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.1562s
    total number of events:              103453

Latency (ms):
         min:                                    3.11
         avg:                                   23.19
         max:                                  222.31
         95th percentile:                       38.25
         sum:                              2399486.55

Threads fairness:
    events (avg/stddev):           12931.6250/1278.72
    execution time (avg/stddev):   299.9358/0.01

 可見,只讀時的TPS為 344.66,查詢次數為5514.58

6 小結

根據測試結果對比一下三種排程演算法的讀寫速度

演算法 IOPS 磁碟寫速度 IOPS 磁碟讀速度 oltp_write_only  oltp_read_only 
deadline 6935.37 118.37MiB/s 7956.88 124.33MiB/s TPS為491.43 ,查詢次數為2948.62  TPS為393.13 ,查詢次數為6290.13
noop 7057.60 110.27MiB/s 8399.89 131.25MiB/s TPS為 544.38 ,查詢次數為3266.28 TPS為 379.97,查詢次數為6079.59
cfq 6614.37 103.35MiB/s 7481.39 116.90MiB/s TPS為498.54 ,查詢次數為2991.25  TPS為 344.66,查詢次數為5514.58

因為本次測試環境為SSD硬碟,因此,在此情況下建議選擇noop磁碟IO排程演算法,此結論也符合我們的預期。

特別注意:磁碟IO的排程演算法還需要根據磁碟情況、資料庫型別、資料庫架構、業務場景(OLTP、OLAP等)等各種場景進行區分,不同的場景排程演算法也要調整,不可一概而論。如果不確定的話,建議進行壓測來判斷,選擇符合對應場景下最合適的演算法。

想了解更多內容或參與技術交流可以關注微信公眾號【資料庫乾貨鋪】或進技術交流群溝通。

相關文章