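Alibaba Cloud RDS Financial Database (Three-Node Edition): Performance

Posted by 德哥 on 2017-07-14

Background

We have finally reached the performance installment. The three-node edition meets enterprise requirements for both database availability and reliability, so how does it perform?

Before discussing performance tests, a few points need to be spelled out, because performance is often misunderstood. How should we judge whether performance is good or bad?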

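1. First, pin down the test environment: the database host (mainly CPU, memory, and network card; if your database can use FPGA or GPU compute, count those too, since PostgreSQL, for example, can use GPUs and FPGAs to accelerate computation), the data storage (mainly sequential and random IOPS and sequential and random read/write bandwidth at each block size and queue depth), the network between the test machine and the database host (bandwidth and RT), and the hardware specifications of the test machine.

2. Fix a reference point. Without one we cannot judge whether performance is good or bad. For example, compare three-node, two-node, and single-node deployments.

3. Fix the benchmark. For OLTP workloads, for example, you can test against the industry-standard TPC-C, or build your own model. For MySQL, people tend to use sysbench and the test cases bundled with it.

"The Huashan Sword Contest of the Database World: tpc.org"

4. During testing, watch for interference from the database driver, cache warm-up, the test client, the number of connections, the test data volume, the data distribution, and so on. I have seen people get wildly different results on identical hardware for all kinds of reasons.

For example, a test client with DEBUG log output enabled caused a severe drop in measured TPS. The same test case written in Java and in C also yields somewhat different results.

5. Tuning of the database, storage, OS, and firewall also has a large effect on results. To compare results, the environments must be identical, including these settings.

Only with the factors above accounted for does a comparison against a reference (for example PG 9.6 versus PG 10, PG versus MySQL, or three-node versus single-node and two-node) say anything meaningful about performance.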

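Performance test case design

Most readers care about how three nodes compare with a single node and with two nodes. To stay close to real deployments, let's look at how the architectures differ.

Little needs to be said about a single node: with no replication overhead, its performance is naturally the best.

Let's start with two nodes.

Two nodes can be deployed in the same data center or in two different data centers in the same city.

When the two nodes sit in different data centers in the same city, RT inevitably suffers, so the performance of small transactions drops noticeably; the benefit is protection against a data-center-level failure.

A same-data-center deployment gets good RT but cannot withstand a data-center-level failure, and it cannot deliver availability and reliability at the same time: to guarantee availability, reliability has to be compromised (because fully synchronous replication cannot be used).

Three-node deployment is very flexible. We can choose same data center + same-city data center, which withstands a data-center-level failure while keeping excellent RT, giving both performance and reliability.

Choosing same data center + remote data center in another city withstands not only a data-center failure but also a city-level disaster.

Three nodes can also be spread across three data centers, which sacrifices some RT and can withstand a city-level disaster.

From the reasoning above, performance depends on the deployment, and the deployment depends on the protection level.

Three-node deployment is extremely flexible: depending on where the primary and standby nodes are placed, the level of risk covered differs, and so does RT.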
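
To make the RT argument concrete, here is a minimal sketch of a closed-loop model in which each client runs one transaction at a time, so throughput is roughly the number of clients divided by the per-transaction latency. The function names and every number in it (execution times, base and extra commit RT) are hypothetical illustrations, not measurements from this article.

```python
def estimated_tps(clients: int, exec_ms: float, commit_rt_ms: float) -> float:
    """Closed-loop estimate: each client runs one transaction at a time."""
    latency_ms = exec_ms + commit_rt_ms
    return clients * 1000.0 / latency_ms


def slowdown(exec_ms: float, base_rt_ms: float, extra_rt_ms: float) -> float:
    """Fraction of throughput lost when the commit RT grows by extra_rt_ms."""
    before = estimated_tps(64, exec_ms, base_rt_ms)
    after = estimated_tps(64, exec_ms, base_rt_ms + extra_rt_ms)
    return 1.0 - after / before


# Hypothetical numbers: 0.5 ms of extra RT between data centers.
small = slowdown(exec_ms=0.1, base_rt_ms=0.4, extra_rt_ms=0.5)
large = slowdown(exec_ms=20.0, base_rt_ms=0.4, extra_rt_ms=0.5)
print(f"small transaction loses {small:.1%}, large transaction loses {large:.1%}")
```

Under these assumed numbers the short transaction loses far more throughput than the long one, which is the sense in which small transactions are the most sensitive to where the standby is placed.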

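We can therefore form the following expectation.

So how does it actually perform?

Three nodes vs. two nodes vs. single node

Alibaba Cloud RDS currently offers single-node, two-node, and three-node product forms.

1. Single node: economical and practical. Although there is only one node, the data is still stored in multiple copies, and backup and point-in-time recovery are both included.

2. Two nodes: the classic one-primary, one-standby architecture, offering higher availability and reliability than a single node.

3. Three nodes: availability together with reliability (multi-replica strong synchronization, zero data loss), the best choice for enterprise core databases and financial-grade workloads.

In PostgreSQL, the synchronization mode can be controlled per transaction. It currently includes:

asynchronous (the record reaches the WAL buffer), local fsync (persisted locally), local persistence + standby write (reaches the standby's OS buffer), local fsync + standby fsync (persisted on both), and local persistence + standby apply (WAL applied on the standby).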
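
As a rough illustration of per-transaction control, the five levels above correspond to the values of PostgreSQL's `synchronous_commit` setting. The sketch below only builds the SQL text; the dictionary keys and the helper name `set_sync_level_sql` are labels invented for this example, not part of PostgreSQL.

```python
# Labels (keys) are made up for this sketch; the values are the
# synchronous_commit settings that PostgreSQL accepts.
SYNC_LEVELS = {
    "async": "off",                   # returns once the record is in the WAL buffer
    "local_fsync": "local",           # waits for the local WAL flush only
    "standby_write": "remote_write",  # standby has written WAL to its OS buffer
    "standby_fsync": "on",            # standby has flushed WAL to disk
    "standby_apply": "remote_apply",  # standby has applied the WAL
}


def set_sync_level_sql(level: str, scope: str = "transaction") -> str:
    """Return the SET statement for one of the levels above (hypothetical helper)."""
    value = SYNC_LEVELS[level]
    if scope == "transaction":
        # SET LOCAL lasts until the end of the current transaction.
        return f"SET LOCAL synchronous_commit = {value};"
    if scope == "session":
        return f"SET synchronous_commit = {value};"
    raise ValueError(f"unknown scope: {scope}")


print(set_sync_level_sql("standby_apply"))
print(set_sync_level_sql("async", scope="session"))
```

A business transaction that can tolerate loss could run the transaction-scoped statement for `async` right after `BEGIN`, while critical transactions on the same connection keep the stricter default.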

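In PostgreSQL, the number of replicas is controlled globally. The syntax currently is:

```
{FIRST | ANY} num {(standby_name1, ...) | (*)}
```

FIRST is the classic mode: the first num standbys in the list are synchronous standbys, and the ones after them are candidate synchronous standbys.

ANY is the quorum-based mode: every listed node is a synchronous standby, and a commit only needs acknowledgement from num of them.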
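
The difference between FIRST and ANY can be sketched as follows. This is a simplified model that assumes every listed standby is connected; the standby names `s1` and `s2` and both helper functions are hypothetical, and the real server also handles disconnected standbys and priority changes.

```python
from typing import List, Set


def sync_standby_names(method: str, num: int, standbys: List[str]) -> str:
    """Build a value in the form shown above, e.g. 'ANY 1 (s1, s2)'."""
    if method not in ("FIRST", "ANY"):
        raise ValueError("method must be FIRST or ANY")
    if num < 1:
        raise ValueError("num must be at least 1")
    members = ", ".join(standbys) if standbys else "*"
    return f"{method} {num} ({members})"


def commit_can_return(method: str, num: int, standbys: List[str], acked: Set[str]) -> bool:
    """Simplified check, assuming all listed standbys are connected."""
    if method == "FIRST":
        # Only the first `num` standbys in the list count as synchronous.
        return all(name in acked for name in standbys[:num])
    # ANY: acknowledgements from any `num` listed standbys are enough.
    return sum(1 for name in standbys if name in acked) >= num


names = ["s1", "s2"]  # hypothetical standby names
print(sync_standby_names("ANY", 1, names))
print(commit_can_return("ANY", 1, names, {"s2"}))
print(commit_can_return("FIRST", 1, names, {"s2"}))
```

With one acknowledgement from `s2` only, the quorum form lets the commit return while the priority form keeps waiting for `s1`, which is why ANY tends to hide a single slow standby.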

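Different combinations of synchronization mode and replica count let you build a very flexible database HA architecture to match business needs.

The sections below compare the performance of the product forms to give users a reference and an intuitive feel.

Test environment

1. Two-node configuration:

32 cores, 800,000 IOPS, 512 GB memory, 10GB network within the same data center. Synchronous replication (automatically degrades to asynchronous mode after waiting more than 1 second).

2. Three-node configuration (same data center + same-city data center):

32 cores, 800,000 IOPS, 512 GB memory, 10GB network within the same data center; the bandwidth between the same-city data centers is unknown. Synchronous replication (at least one standby must acknowledge the COMMIT RECORD).

Group commit is not used in either setup, and the WAL level is replica in both.

Taking PostgreSQL single-node, two-node, and three-node deployments as examples, we compare the performance of each mode.

1. Read-only transactions

Expectation:

Read-only transactions perform the same regardless of the number of nodes.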

-- Build 10 million rows; query by primary key.
create table test1(id int primary key, info text, crt_time timestamp);
insert into test1 select generate_series(1,10000000), md5(random()::text), now();

-- Test script: random lookups by primary key.
vi test1.sql
\set id random(1,10000000)
select * from test1 where id=:id;

-- Concurrent test
pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 64 -j 64 -T 120

Test results

Two nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 127744490  
latency average = 0.060 ms  
latency stddev = 0.017 ms  
tps = 1064517.898216 (including connections establishing)  
tps = 1064671.106607 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id random(1,10000000)
         0.059  select * from test1 where id=:id;

Three nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 128084101  
latency average = 0.060 ms  
latency stddev = 0.029 ms  
tps = 1067351.606315 (including connections establishing)  
tps = 1067570.502782 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id random(1,10000000)
         0.059  select * from test1 where id=:id;

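2. Small write-only transactions

Expectation:

For small write-only transactions, the higher the protection level, the higher the RT; the larger the share of the transaction taken by RT, the greater the performance impact.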

-- Build the table for UPSERTs by primary key: update if the row exists, insert otherwise.
create table test2(id int primary key, info text, crt_time timestamp);

-- Test script: IDs range from 1 to 1 billion; update if the row exists, insert otherwise.
vi test2.sql
\set id random(1,1000000000)
insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;

-- Concurrent test
pgbench -M prepared -n -r -P 1 -f ./test2.sql -c 64 -j 64 -T 120

Test results

Two nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 16471972  
latency average = 0.466 ms  
latency stddev = 7.586 ms  
tps = 137244.632272 (including connections establishing)  
tps = 137264.346853 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id random(1,1000000000)
         0.465  insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;

Three nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 15043211  
latency average = 0.510 ms  
latency stddev = 9.061 ms  
tps = 125351.050926 (including connections establishing)  
tps = 125433.189301 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id random(1,1000000000)
         0.509  insert into test2 values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
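
As a sanity check on these reports, throughput in a closed-loop test should be close to the number of clients divided by the average latency. The sketch below applies that relation to the write-only numbers above; the helper name `tps_from_latency` is made up for this example, and the relation ignores connection setup and client-side overhead.

```python
def tps_from_latency(clients: int, latency_ms: float) -> float:
    """Closed-loop estimate: each of `clients` connections runs one transaction at a time."""
    return clients * 1000.0 / latency_ms


# (average latency in ms, reported tps excluding connection establishment),
# copied from the write-only reports above.
REPORTED = {
    "two-node": (0.466, 137264.346853),
    "three-node": (0.510, 125433.189301),
}

for name, (latency_ms, tps) in REPORTED.items():
    estimate = tps_from_latency(64, latency_ms)
    error = abs(estimate - tps) / tps
    print(f"{name}: estimated {estimate:.0f} tps, reported {tps:.0f} tps, error {error:.2%}")
```

The estimate and the reported figure agree to within roughly 0.1% for both configurations, so the drop in TPS is fully explained by the higher per-transaction latency.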

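3. Mixed read/write transactions (read-heavy)

Expectation:

For read-heavy mixed transactions, the higher the protection level, the higher the RT. How much of the transaction the RT accounts for depends on how long the transaction itself takes: the shorter the transaction, the more visible the RT's effect on performance.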

-- Build the table for reads with 10 million rows; query by primary key.
create table test3(id int primary key, info text, crt_time timestamp);
insert into test3 select generate_series(1,10000000), md5(random()::text), now();

-- Build the table for UPSERTs by primary key: update if the row exists, insert otherwise.
create table test4(id int primary key, info text, crt_time timestamp);

-- Test script: 10 reads and one write; IDs range from 1 to 1 billion; update if the row exists, insert otherwise.
vi test3.sql
\set id1 random(1,10000000)
\set id2 random(1,1000000000)
begin;  
select * from test3 where id=:id1;    
select * from test3 where id=:id1+1000;    
select * from test3 where id=:id1+5000;    
select * from test3 where id=:id1+10000;    
select * from test3 where id=:id1+100;    
select * from test3 where id=:id1-1000;    
select * from test3 where id=:id1-5000;    
select * from test3 where id=:id1-10000;    
select * from test3 where id=:id1+800;    
select * from test3 where id=:id1-800;    
insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
end;  
  
-- Concurrent test
pgbench -M prepared -n -r -P 1 -f ./test3.sql -c 64 -j 64 -T 120

Test results

Two nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 6262104  
latency average = 1.226 ms  
latency stddev = 7.466 ms  
tps = 52175.718194 (including connections establishing)  
tps = 52182.927235 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id1 random(1,10000000)
         0.001  \set id2 random(1,1000000000)
         0.034  begin;  
         0.069  select * from test3 where id=:id1;  
         0.065  select * from test3 where id=:id1+1000;  
         0.063  select * from test3 where id=:id1+5000;  
         0.062  select * from test3 where id=:id1+10000;  
         0.060  select * from test3 where id=:id1+100;  
         0.060  select * from test3 where id=:id1-1000;  
         0.060  select * from test3 where id=:id1-5000;  
         0.059  select * from test3 where id=:id1-10000;  
         0.058  select * from test3 where id=:id1+800;  
         0.058  select * from test3 where id=:id1-800;  
         0.104  insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.471  end;  
    
Three nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 5926527  
latency average = 1.296 ms  
latency stddev = 9.677 ms  
tps = 49377.940916 (including connections establishing)  
tps = 49386.111317 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set id1 random(1,10000000)
         0.001  \set id2 random(1,1000000000)
         0.035  begin;  
         0.070  select * from test3 where id=:id1;  
         0.066  select * from test3 where id=:id1+1000;  
         0.064  select * from test3 where id=:id1+5000;  
         0.063  select * from test3 where id=:id1+10000;  
         0.061  select * from test3 where id=:id1+100;  
         0.061  select * from test3 where id=:id1-1000;  
         0.060  select * from test3 where id=:id1-5000;  
         0.060  select * from test3 where id=:id1-10000;  
         0.059  select * from test3 where id=:id1+800;  
         0.058  select * from test3 where id=:id1-800;  
         0.113  insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.524  end;
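
One way to read these reports against the expectation is to look at how much of the average transaction latency is spent in the final `end;`, which is where the wait for the standby's acknowledgement shows up. The sketch below does that for the read-heavy numbers above; `commit_share` is a helper name made up for this example.

```python
def commit_share(commit_ms: float, total_ms: float) -> float:
    """Fraction of the average transaction latency spent in the final `end;`."""
    return commit_ms / total_ms


# (average latency, latency of `end;`) in ms, copied from the read-heavy reports above.
READ_HEAVY = {
    "two-node": (1.226, 0.471),
    "three-node": (1.296, 0.524),
}

for name, (total_ms, commit_ms) in READ_HEAVY.items():
    share = commit_share(commit_ms, total_ms)
    print(f"{name}: commit takes {share:.1%} of the transaction")
```

In both configurations the commit accounts for a bit under half of the transaction time, and that share is slightly larger with three nodes.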

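4. Mixed read/write transactions (write-heavy)

Expectation:

For write-heavy mixed transactions, the higher the protection level, the higher the RT. How much of the transaction the RT accounts for depends on how long the transaction itself takes: the shorter the transaction, the more visible the RT's effect on performance.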

-- Test script: 10 writes and one read.
vi test4.sql
\set id1 random(1,10000000)
\set id2 random(1,1000000000)
begin;  
select * from test3 where id=:id1;    
insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2+101, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2+1020, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2+2030, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2+5040, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2+10500, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2-106, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2-1070, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2-5080, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
insert into test4 values (:id2-9090, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;    
end;  
    
-- Concurrent test
pgbench -M prepared -n -r -P 1 -f ./test4.sql -c 64 -j 64 -T 120

Test results

Two nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 3290435  
latency average = 2.334 ms  
latency stddev = 22.840 ms  
tps = 27416.206491 (including connections establishing)  
tps = 27419.825894 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id1 random(1,10000000)
         0.001  \set id2 random(1,1000000000)
         0.035  begin;  
         0.079  select * from test3 where id=:id1;  
         0.162  insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.110  insert into test4 values (:id2+101, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.107  insert into test4 values (:id2+1020, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.106  insert into test4 values (:id2+2030, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.107  insert into test4 values (:id2+5040, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.106  insert into test4 values (:id2+10500, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.099  insert into test4 values (:id2-106, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.099  insert into test4 values (:id2-1070, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.100  insert into test4 values (:id2-5080, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.099  insert into test4 values (:id2-9090, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         1.118  end;  
    
Three nodes:
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 120 s  
number of transactions actually processed: 3179246  
latency average = 2.416 ms  
latency stddev = 25.711 ms  
tps = 26485.773742 (including connections establishing)  
tps = 26489.361989 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set id1 random(1,10000000)
         0.001  \set id2 random(1,1000000000)
         0.035  begin;  
         0.079  select * from test3 where id=:id1;  
         0.161  insert into test4 values (:id2, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.107  insert into test4 values (:id2+101, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.104  insert into test4 values (:id2+1020, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.102  insert into test4 values (:id2+2030, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.101  insert into test4 values (:id2+5040, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.100  insert into test4 values (:id2+10500, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.096  insert into test4 values (:id2-106, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.094  insert into test4 values (:id2-1070, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.093  insert into test4 values (:id2-5080, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         0.092  insert into test4 values (:id2-9090, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
         1.241  end;
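
To see where the extra latency of the three-node run comes from, the average latency can be split into the time spent in `end;` and everything before it. This is only a rough decomposition using the write-heavy numbers above, since the reported average is not exactly the sum of the per-statement figures; the names in the sketch are invented for this example.

```python
from typing import Dict, Tuple

# Average latency and `end;` latency in ms, copied from the write-heavy reports above.
WRITE_HEAVY: Dict[str, Dict[str, float]] = {
    "two-node": {"total_ms": 2.334, "commit_ms": 1.118},
    "three-node": {"total_ms": 2.416, "commit_ms": 1.241},
}


def before_commit_ms(run: Dict[str, float]) -> float:
    """Time spent in the transaction before the final `end;`."""
    return run["total_ms"] - run["commit_ms"]


def deltas(base: Dict[str, float], other: Dict[str, float]) -> Tuple[float, float]:
    """Return (change in commit time, change in time before commit)."""
    commit_delta = other["commit_ms"] - base["commit_ms"]
    rest_delta = before_commit_ms(other) - before_commit_ms(base)
    return commit_delta, rest_delta


commit_delta, rest_delta = deltas(WRITE_HEAVY["two-node"], WRITE_HEAVY["three-node"])
print(f"commit: {commit_delta:+.3f} ms, before commit: {rest_delta:+.3f} ms")
```

On these numbers the whole slowdown sits in the commit step, while the statements before it are no slower with three nodes, which matches the expectation that the cost of the extra protection is paid at commit time.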

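Performance comparison:

1. TPS/QPS comparison

| Metric | Read-only | Write-only | Read-heavy | Write-heavy |
| --- | --- | --- | --- | --- |
| Two nodes (tps) | 1064671 | 137264 | 52182 | 27419 |
| Three nodes (tps) | 1067570 | 125433 | 49386 | 26489 |
| Two nodes (qps) | 1064671 | 137264 | 574002 | 301609 |
| Three nodes (qps) | 1067570 | 125433 | 543246 | 291379 |
| Performance loss (three nodes vs. two) | -0.27% | 8.6% | 5.3% | 3.4% |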
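
The derived rows of this table can be reproduced from the TPS figures. In the sketch below, the factor of 11 statements per mixed transaction is inferred from the table itself (the QPS values equal TPS times 11, that is, the statements between `begin;` and `end;`), and the function names are made up for this example.

```python
# Statements counted per transaction; 11 for the mixed scripts is inferred
# from the table (qps / tps), not stated explicitly in the article.
STATEMENTS_PER_TX = {"read-only": 1, "write-only": 1, "read-heavy": 11, "write-heavy": 11}

TPS = {
    "two-node": {"read-only": 1064671, "write-only": 137264, "read-heavy": 52182, "write-heavy": 27419},
    "three-node": {"read-only": 1067570, "write-only": 125433, "read-heavy": 49386, "write-heavy": 26489},
}


def qps(tps: int, statements: int) -> int:
    """Queries per second for a script with `statements` statements per transaction."""
    return tps * statements


def loss_percent(baseline: float, candidate: float) -> float:
    """Throughput lost by `candidate` relative to `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


for workload, statements in STATEMENTS_PER_TX.items():
    two = TPS["two-node"][workload]
    three = TPS["three-node"][workload]
    print(f"{workload}: qps {qps(two, statements)} vs {qps(three, statements)}, "
          f"loss {loss_percent(two, three):.2f}%")
```

Computed this way the read-heavy loss comes out at about 5.36%, which the table lists as 5.3%; the other three columns match the table after rounding.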

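2. Average response time

| Average response time | Read-only | Write-only | Read-heavy | Write-heavy |
| --- | --- | --- | --- | --- |
| Two nodes (ms) | 0.060 | 0.466 | 1.226 | 2.334 |
| Three nodes (ms) | 0.060 | 0.510 | 1.296 | 2.416 |

3. Response jitter comparison (standard deviation)

Jitter is mainly related to how the SSD manages garbage collection (GC) and to the stability of the primary-standby network.

| Response time standard deviation | Read-only | Write-only | Read-heavy | Write-heavy |
| --- | --- | --- | --- | --- |
| Two nodes (ms) | 0.017 | 7.586 | 7.466 | 22.840 |
| Three nodes (ms) | 0.029 | 9.061 | 9.677 | 25.711 |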

4. Comparison charts

1. TPS

2. QPS

3. Transaction response time

4. Transaction response time jitter (standard deviation)


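Replication layer: differences between MySQL and PostgreSQL

The replication mechanism determines how the two products differ.

PostgreSQL synchronizes standbys through physical replication of the WAL. WAL is shipped as it is generated, without waiting for the transaction to finish. The standby's WAL lag behind the primary is therefore independent of transaction size and depends only on network bandwidth and network RT. At the end of every transaction, whatever its size, the primary only waits for the COMMIT RECORD ACK (the commit record has a fixed, very small size), so the commit delay is the same however large the transaction is.

MySQL synchronizes standbys by replicating the binlog. The binlog of a transaction that has not finished on the primary is not sent to the standby, so standby lag is directly tied to transaction size. The larger the transaction (that is, the more rows it affects), the more binlog it produces, the higher the commit RT, and the worse the lag. MySQL workloads should avoid large transactions wherever possible.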
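
The contrast between the two mechanisms can be sketched with a toy model of how long a commit waits for the standby. It is a deliberate simplification built only from the description above: it assumes the WAL has already been streamed by commit time, that the binlog of a transaction is shipped only at commit, and that transfer time is size divided by bandwidth. All function names and numbers are hypothetical.

```python
def wal_streaming_commit_wait_ms(tx_bytes: int, rtt_ms: float, bandwidth_mb_s: float) -> float:
    """WAL is shipped while the transaction runs, so commit waits about one
    round trip for the small commit record, whatever the transaction size."""
    return rtt_ms


def ship_at_commit_wait_ms(tx_bytes: int, rtt_ms: float, bandwidth_mb_s: float) -> float:
    """The whole change log is shipped at commit, so the wait grows with its size."""
    transfer_ms = tx_bytes / (bandwidth_mb_s * 1_000_000) * 1000.0
    return transfer_ms + rtt_ms


# Hypothetical link: 0.2 ms round trip, 1000 MB/s of usable bandwidth.
for tx_bytes in (1_000, 1_000_000, 1_000_000_000):
    pg = wal_streaming_commit_wait_ms(tx_bytes, 0.2, 1000.0)
    my = ship_at_commit_wait_ms(tx_bytes, 0.2, 1000.0)
    print(f"{tx_bytes:>13} bytes: streaming {pg:.3f} ms, ship-at-commit {my:.3f} ms")
```

In this model the streaming wait stays flat while the ship-at-commit wait grows linearly with the amount of change data, which is the reason given above for keeping MySQL transactions small.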

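Summary

The test results show that three-node write performance is only slightly lower than two-node (between 3.4% and 8.6% in these tests), in exchange for having it both ways: high availability and high reliability. Alibaba Cloud RDS (three-node edition) has become the best choice for financial users.

The performance impact of three nodes comes mainly from how long a committing transaction waits for its WAL or binlog to be sent to the standby and for the ACK to come back. The differences between PostgreSQL and MySQL are described above.

Across the different scenario types tested, the measured results match the expectations.

A single node always performs best (because it does not wait for a standby's WAL ACK), but testing it says little about availability and reliability. We can also draw the following inferences.

1. If two nodes are configured with asynchronous replication, performance should be on par with a single node.

2. If two nodes are configured with synchronous replication (with automatic degradation), performance is on par with three nodes. This matches the tests: three-node and two-node performance are close.

The performance difference between three nodes and two nodes is very small.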

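1. Read-only transactions

Read-only transactions perform the same regardless of the number of nodes.

2. Small write-only transactions

For small write-only transactions, the higher the protection level, the higher the RT; the larger the share of the transaction taken by RT, the greater the performance impact.

3. Mixed read/write transactions (read-heavy)

For read-heavy mixed transactions, the higher the protection level, the higher the RT. How much of the transaction the RT accounts for depends on how long the transaction itself takes: the shorter the transaction, the more visible the RT's effect on performance.

4. Mixed read/write transactions (write-heavy)

For write-heavy mixed transactions, the higher the protection level, the higher the RT. How much of the transaction the RT accounts for depends on how long the transaction itself takes: the shorter the transaction, the more visible the RT's effect on performance.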

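Articles in this series

"Alibaba Cloud RDS Financial Database (Three-Node Edition): Background"

"Alibaba Cloud RDS Financial Database (Three-Node Edition): Theory"

"Alibaba Cloud RDS Financial Database (Three-Node Edition): Performance"

"Alibaba Cloud RDS Financial Database (Three-Node Edition): Case Studies"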