PostgreSQL秒殺場景優化
秒殺場景的典型瓶頸在於對同一條記錄的多次更新請求,然後只有一個或者少量請求是成功的,其他請求是以失敗或更新不到告終。
例如,Iphone的1元秒殺,如果我只放出1臺Iphone,我們把它看成一條記錄,秒殺開始後,誰先搶到(更新這條記錄的鎖),誰就算秒殺成功。
例如:
使用一個標記位來表示這條記錄是否已經被更新,或者記錄更新的次數(幾臺Iphone)。
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5; -- 假設可以秒殺5臺
這種方法的弊端:
獲得鎖的使用者在處理這條記錄時,可能成功,也可能失敗,或者可能需要很長時間,(例如資料庫響應慢)在它結束事務前,其他會話只能等著。
等待是非常不科學的,因為對於沒有獲得鎖的使用者,等待是在浪費時間。
所以一般的優化處理方法是先使用for update nowait的方式來避免等待,即如果無法即可獲得鎖,那麼就不等待。
例如:
begin;
select 1 from tbl where id=pk for update nowait; -- 如果使用者無法即刻獲得鎖,則返回錯誤。從而這個事務回滾。
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5;
end;
這種方法可以減少使用者的等待時間,因為無法即刻獲得鎖後就直接返回了。
但是這種方法也存在一定的弊端,對於一個商品,如果可以秒殺多臺的話,我們用1條記錄來儲存多臺,降低了秒殺的併發性。
因為我們用的是行鎖。
解決這個問題辦法很多,最終就是要提高併發性,例如:
1. 分段秒殺,把商品數量打散,拆成多個段,從而提高併發處理能力。
總體來說,優化的思路是減少鎖等待時間,避免序列,儘量並行。
優化到這裡就結束了嗎?顯然沒有,以上方法任意資料庫都可以做到,如果就這樣結束怎麼體現PostgreSQL的特性呢?
PostgreSQL還提供了一個鎖型別,advisory鎖,這種鎖比行鎖更加輕量,支援會話級別和事務級別。(但是需要注意ID是全域性的,否則會相互干擾,也就是說,所有參與秒殺或者需要用到advisory lock的ID需要在單個庫內保持全域性唯一)
例子:
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5 and pg_try_advisory_xact_lock(:id);
最後必須要對比一下for update nowait和advisory lock的效能。
下面是在一臺本地虛擬機器上的測試。
新建一張秒殺表
postgres=# d t1
Table "public.t1"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
info | text |
Indexes:
"t1_pkey" PRIMARY KEY, btree (id)
只有一條記錄,不斷的被更新
postgres=# select * from t1;
id | info
----+-------------------------------
1 | 2015-09-14 09:47:04.703904+08
(1 row)
壓測for update nowait的方式:
CREATE OR REPLACE FUNCTION public.f1(i_id integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
declare
begin
perform 1 from t1 where id=i_id for update nowait;
update t1 set info=now()::text where id=i_id;
exception when others then
return;
end;
$function$;
postgres@digoal-> cat test1.sql
setrandom id 1 1
select f1(:id);
壓測advisory lock的方式:
postgres@digoal-> cat test.sql
setrandom id 1 1
update t1 set info=now()::text where id=:id and pg_try_advisory_xact_lock(:id);
清除壓測統計資料:
postgres=# select pg_stat_reset();
pg_stat_reset
---------------
(1 row)
postgres=# select * from pg_stat_all_tables where relname=`t1`;
-[ RECORD 1 ]-------+-------
relid | 184731
schemaname | public
relname | t1
seq_scan | 0
seq_tup_read | 0
idx_scan | 0
idx_tup_fetch | 0
n_tup_ins | 0
n_tup_upd | 0
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 0
n_dead_tup | 0
n_mod_since_analyze | 0
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
壓測結果:
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 20 -j 20 -T 60
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 792029
latency average: 1.505 ms
latency stddev: 4.275 ms
tps = 13196.542846 (including connections establishing)
tps = 13257.270709 (excluding connections establishing)
statement latencies in milliseconds:
0.002625 setrandom id 1 1
1.502420 select f1(:id);
postgres=# select * from pg_stat_all_tables where relname=`t1`;
-[ RECORD 1 ]-------+-------
relid | 184731
schemaname | public
relname | t1
seq_scan | 0
seq_tup_read | 0
idx_scan | 896963 // 大多數是無用功
idx_tup_fetch | 896963 // 大多數是無用功
n_tup_ins | 0
n_tup_upd | 41775
n_tup_del | 0
n_tup_hot_upd | 41400
n_live_tup | 0
n_dead_tup | 928
n_mod_since_analyze | 41774
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 20 -j 20 -T 60
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 1392372
latency average: 0.851 ms
latency stddev: 2.475 ms
tps = 23194.831054 (including connections establishing)
tps = 23400.411501 (excluding connections establishing)
statement latencies in milliseconds:
0.002594 setrandom id 1 1
0.848536 update t1 set info=now()::text where id=:id and pg_try_advisory_xact_lock(:id);
postgres=# select * from pg_stat_all_tables where relname=`t1`;
-[ RECORD 1 ]-------+--------
relid | 184731
schemaname | public
relname | t1
seq_scan | 0
seq_tup_read | 0
idx_scan | 1368933 // 大多數是無用功
idx_tup_fetch | 1368933 // 大多數是無用功
n_tup_ins | 0
n_tup_upd | 54957
n_tup_del | 0
n_tup_hot_upd | 54489
n_live_tup | 0
n_dead_tup | 1048
n_mod_since_analyze | 54957
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
我們注意到,不管用哪種方法,都會浪費掉很多次的無用功掃描。
為了解決無用掃描的問題,可以使用以下函式。(當然,還有更好的方法是對使用者透明。)
CREATE OR REPLACE FUNCTION public.f(i_id integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
declare
a_lock boolean := false;
begin
select pg_try_advisory_xact_lock(i_id) into a_lock;
if a_lock then
update t1 set info=now()::text where id=i_id;
end if;
exception when others then
return;
end;
$function$;
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 1217195
latency average: 0.973 ms
latency stddev: 3.563 ms
tps = 20283.314001 (including connections establishing)
tps = 20490.143363 (excluding connections establishing)
statement latencies in milliseconds:
0.002703 setrandom id 1 1
0.970209 select f(:id);
postgres=# select * from pg_stat_all_tables where relname=`t1`;
-[ RECORD 1 ]-------+-------
relid | 184731
schemaname | public
relname | t1
seq_scan | 0
seq_tup_read | 0
idx_scan | 75927
idx_tup_fetch | 75927
n_tup_ins | 0
n_tup_upd | 75927
n_tup_del | 0
n_tup_hot_upd | 75902
n_live_tup | 0
n_dead_tup | 962
n_mod_since_analyze | 75927
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
除了吞吐率的提升,我們其實還看到真實的處理數(更新次數)也有提升,所以不僅僅是降低了等待延遲,實際上也提升了處理能力。
最後提供一個物理機上的資料參考,使用128個併發連線,同時對一條記錄進行更新:
不做任何優化的併發處理能力:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 100 s
number of transactions actually processed: 285673
latency average: 44.806 ms
latency stddev: 45.751 ms
tps = 2855.547375 (including connections establishing)
tps = 2855.856976 (excluding connections establishing)
statement latencies in milliseconds:
0.002509 setrandom id 1 1
44.803299 update t1 set info=now()::text where id=:id;
使用for update nowait的併發處理能力:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 100 s
number of transactions actually processed: 6663253
latency average: 1.919 ms
latency stddev: 2.804 ms
tps = 66623.169445 (including connections establishing)
tps = 66630.307999 (excluding connections establishing)
statement latencies in milliseconds:
0.001934 setrandom id 1 1
1.917297 select f1(:id);
使用advisory lock後的併發處理能力:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 13883387
latency average: 0.344 ms
latency stddev: 0.535 ms
tps = 231197.323122 (including connections establishing)
tps = 231376.427515 (excluding connections establishing)
statement latencies in milliseconds:
0.344042 select f(1);
此時的perf top
PerfTop: 23883 irqs/sec kernel:32.2% exact: 0.0% [1000Hz cycles], (all, 32 CPUs)
--------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ____________________________ ______________________________________________
10645.00 3.5% GetSnapshotData /u02/digoal/soft_bak/pgsql9.5/bin/postgres
7963.00 2.6% AllocSetAlloc /u02/digoal/soft_bak/pgsql9.5/bin/postgres
4720.00 1.6% _int_malloc /lib64/libc-2.12.so
4270.00 1.4% __schedule [kernel.kallsyms]
4234.00 1.4% fmgr_info_cxt_security /u02/digoal/soft_bak/pgsql9.5/bin/postgres
4217.00 1.4% LWLockAcquire /u02/digoal/soft_bak/pgsql9.5/bin/postgres
3958.00 1.3% hash_search_with_hash_value /u02/digoal/soft_bak/pgsql9.5/bin/postgres
3656.00 1.2% __GI_vfprintf /lib64/libc-2.12.so
3572.00 1.2% update_blocked_averages [kernel.kallsyms]
3338.00 1.1% PostgresMain /u02/digoal/soft_bak/pgsql9.5/bin/postgres
3267.00 1.1% __switch_to [kernel.kallsyms]
3095.00 1.0% __strlen_sse42 /lib64/libc-2.12.so
2996.00 1.0% memcpy /lib64/libc-2.12.so
2930.00 1.0% _int_free /lib64/libc-2.12.so
2568.00 0.8% LWLockRelease /u02/digoal/soft_bak/pgsql9.5/bin/postgres
2446.00 0.8% SearchCatCache /u02/digoal/soft_bak/pgsql9.5/bin/postgres
2178.00 0.7% ExecInitExpr /u02/digoal/soft_bak/pgsql9.5/bin/postgres
2053.00 0.7% hash_any /u02/digoal/soft_bak/pgsql9.5/bin/postgres
2035.00 0.7% __GI___libc_malloc /lib64/libc-2.12.so
2009.00 0.7% _raw_spin_lock_irqsave [kernel.kallsyms]
1804.00 0.6% exec_stmt /u02/digoal/soft_bak/pgsql9.5/lib/plpgsql.so
1764.00 0.6% __memset_sse2 /lib64/libc-2.12.so
1717.00 0.6% pqParseInput3 /u02/digoal/soft_bak/pgsql9.5/lib/libpq.so.5.8
1696.00 0.6% do_select [kernel.kallsyms]
1686.00 0.6% __strcpy_ssse3 /lib64/libc-2.12.so
1685.00 0.6% update_curr [kernel.kallsyms]
1619.00 0.5% enqueue_entity [kernel.kallsyms]
1607.00 0.5% pfree/u02/digoal/soft_bak/pgsql9.5/bin/postgres
1598.00 0.5% doCustom /u02/digoal/soft_bak/pgsql9.5/bin/pgbench
1594.00 0.5% idle_cpu [kernel.kallsyms]
1589.00 0.5% update_cfs_rq_blocked_load [kernel.kallsyms]
1554.00 0.5% lapic_next_deadline [kernel.kallsyms]
1512.00 0.5% update_cfs_shares[kernel.kallsyms]
1491.00 0.5% MemoryContextCreate /u02/digoal/soft_bak/pgsql9.5/bin/postgres
1482.00 0.5% _raw_spin_lock [kernel.kallsyms]
1423.00 0.5% palloc /u02/digoal/soft_bak/pgsql9.5/bin/postgres
1419.00 0.5% __GI___sigsetjmp /lib64/libc-2.12.so
1412.00 0.5% __cfree /lib64/libc-2.12.so
1399.00 0.5% unix_stream_recvmsg [kernel.kallsyms]
1393.00 0.5% __fget_light [kernel.kallsyms]
1359.00 0.4% ResourceOwnerReleaseInternal /u02/digoal/soft_bak/pgsql9.5/bin/postgres
1351.00 0.4% AllocSetFree /u02/digoal/soft_bak/pgsql9.5/bin/postgres
1277.00 0.4% unix_stream_sendmsg [kernel.kallsyms]
1246.00 0.4% __memcmp_sse4_1 /lib64/libc-2.12.so
1240.00 0.4% plpgsql_exec_function /u02/digoal/soft_bak/pgsql9.5/lib/plpgsql.so
1225.00 0.4% expression_tree_walker /u02/digoal/soft_bak/pgsql9.5/bin/postgres
1160.00 0.4% exec_stmt_block /u02/digoal/soft_bak/pgsql9.5/lib/plpgsql.so
使用advisory lock,效能相比不做任何優化效能提升了約66倍,相比for update nowait效能提升了約1.8倍。
這種優化可以快速告訴使用者是否能秒殺到此類商品,而不需要等待其他使用者更新結束後才知道。所以大大降低了RT,提高了吞吐率。
最後提一下9.5的新特性, select ,,, for update ,,, skip locked.
http://blog.163.com/digoal@126/blog/static/163877040201551552017215/
如果能做到UPDATE語法裡面,就完美了,直接跳過無法獲得鎖的行。併發能力瞬間提升,也不用advisory了。
[參考]
1. http://www.postgresql.org/docs/9.5/static/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
相關文章
- 秒殺系統的場景特點
- 【Redis場景5】叢集秒殺最佳化-分散式鎖Redis分散式
- 秒殺系統前端優化前端優化
- RocketMQ實戰--高併發秒殺場景MQ
- 高頻面試題:秒殺場景設計面試題
- 秒殺業務架構優化之路架構優化
- 架構 秒殺系統優化思路架構優化
- 秒殺系統架構優化思路架構優化
- 【效能優化】秒殺系統效能優化初體驗優化
- 【Redis場景4】單機環境下秒殺問題Redis
- 高併發業務場景下的秒殺解決方案 (初探)
- 秒殺系統中流量削峰場景應該怎麼解決?
- Redis+Lua解決高併發場景搶購秒殺問題Redis
- 電商搶購秒殺系統的設計及應用場景分析
- 秒殺最佳化-基於阻塞佇列實現秒殺最佳化佇列
- (四)Java高併發秒殺API之高併發優化JavaAPI優化
- 雲端計算秒殺電商秒殺
- 儀表盤場景的前端優化前端優化
- 高頻寫入redis場景優化Redis優化
- 淘特 Flutter 流式場景的深度優化Flutter優化
- 用PostgreSQL找回618秒逝去的青春-遞迴收斂優化SQL遞迴優化
- go-zero微服務實戰系列(九、極致優化秒殺效能)Go微服務優化
- 高併發秒殺系統架構詳解,不是所有的秒殺都是秒殺!架構
- postgresql的failover場景處理SQLAI
- PostgreSQL秒殺4種方法-增加批量流式加減庫存方法SQL
- 秒殺流程圖流程圖
- 【高併發】秒殺系統架構解密,不是所有的秒殺都是秒殺(升級版)!!架構解密
- 【PostgreSQL 】PostgreSQL 15對distinct的優化SQL優化
- JavaScript 效能優化殺手JavaScript優化
- 同步秒殺實現:Redis在秒殺功能的實踐Redis
- 如何設計一個優秀的秒殺系統?
- 不走索引場景的一次分析優化索引優化
- 【Python秒殺指令碼】淘寶或京東等秒殺搶購Python指令碼
- 秒殺系統分析
- 前端效能優化--從 10 多秒到 1.05 秒前端優化
- 前端效能優化–從 10 多秒到 1.05 秒前端優化
- postgresql 優化與維護SQL優化
- 從京東618秒殺聊聊秒殺限流的多種實現