PostgreSQL 13支援增量排序(Incremental Sorting)
PostgreSQL 13一個重要的功能是支援增量排序,使用order by 時可以加速排序,SQL如下
select * from test order by a,b limit 10;
如果在欄位a上面建立了索引,需要對欄位a、b進行排序,如果一個結果已經按幾個前導鍵排序,這就允許對附加的b進行批量排序。
enable_incremental_sort
PostgreSQL新增了配置enable_incremental_sort用於控制是否開啟增量排序,此引數預設開啟
測試準備
在PostgreSQL 13中建立測試表進行測試
postgres=# create table test(id int,c1 int ,c2 int,info varchar(300),crt_time timestamp);
CREATE TABLE
postgres=# insert into test select t,t,2,'test',clock_timestamp() from generate_series(1,1000000)t;
INSERT 0 1000000
postgres=# create index i_test_id on test(id);
CREATE INDEX
--檢視資料如下
postgres=# select * from test order by id,c1 limit 10;
id | c1 | c2 | info | crt_time
----+----+----+------+----------------------------
1 | 1 | 2 | test | 2022-06-02 14:23:38.253289
2 | 2 | 2 | test | 2022-06-02 14:23:38.253777
3 | 3 | 2 | test | 2022-06-02 14:23:38.253785
4 | 4 | 2 | test | 2022-06-02 14:23:38.253787
5 | 5 | 2 | test | 2022-06-02 14:23:38.25379
6 | 6 | 2 | test | 2022-06-02 14:23:38.253791
7 | 7 | 2 | test | 2022-06-02 14:23:38.253793
8 | 8 | 2 | test | 2022-06-02 14:23:38.253795
9 | 9 | 2 | test | 2022-06-02 14:23:38.253809
10 | 10 | 2 | test | 2022-06-02 14:23:38.25381
(10 rows)
PostgreSQL 13 測試
- 這裡我是在pg14中做的測試,pg13這個引數名叫
enable_incrementalsort
postgres=# show enable_incremental_sort;
enable_incremental_sort
-------------------------
on
(1 row)
postgres=# explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.16 rows=10 width=25) (actual time=0.159..0.163 rows=10 loops=1)
-> Incremental Sort (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.157..0.159 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.103..0.106 rows=11 loops=1)
Planning Time: 0.427 ms
Execution Time: 0.265 ms
(8 rows)
- 可以看到
Incremental Sort
和Presorted Key: id
並且走了i_test_id
索引,SQL耗時0.265ms
關閉enable_incremental_sort
postgres=# set enable_incremental_sort=off;
SET
postgres=# explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=38962.64..38962.67 rows=10 width=25) (actual time=272.945..272.953 rows=10 loops=1)
-> Sort (cost=38962.64..41462.64 rows=1000000 width=25) (actual time=272.933..272.937 rows=10 loops=1)
Sort Key: id, c1
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on test (cost=0.00..17353.00 rows=1000000 width=25) (actual time=0.028..118.098 rows=1000000 loops=1)
Planning Time: 0.305 ms
Execution Time: 273.023 ms
(7 rows)
- 關閉增量排序後SQL耗時
273.023 ms
,效能差了幾個數量級
PostgreSQL 12 測試
- Abase 7.0基於PostgreSQL 12.3
同樣使用上面的建表語句,執行SQL如下
postgres=# explain analyze select * from test order by id,c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=38962.64..38962.67 rows=10 width=536) (actual time=288.847..288.851 rows=10 loops=1)
-> Sort (cost=38962.64..41462.64 rows=1000000 width=536) (actual time=288.839..288.840 rows=10 loops=1)
Sort Key: id, c1
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on test (cost=0.00..17353.00 rows=1000000 width=536) (actual time=0.078..173.460 rows=1000000 loops=1)
Planning Time: 24.726 ms
Execution Time: 289.135 ms
(7 rows)
PG 12中執行計劃和PG 14關閉enable_incremental_sort引數一樣,效能較低
當然這只是一個簡單的查詢,如果包含where,以及連表等情況是否也可以使用 Incremental Sort
帶條件
加上c1 > 100000,c1沒有建立索引
postgres=# explain analyze select * from test where c1 > 100000 order by id,c1 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.47..1.23 rows=10 width=25) (actual time=49.470..49.476 rows=10 loops=1)
-> Incremental Sort (cost=0.47..68345.40 rows=899386 width=25) (actual time=49.467..49.469 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=899386 width=25) (actual time=49.383..49.387 rows=11 loops=1)
Filter: (c1 > 100000)
Rows Removed by Filter: 100000
Planning Time: 0.879 ms
Execution Time: 49.594 ms
(10 rows)
加上 id > 100000,id有索引
postgres=# explain analyze select * from test where id > 100000 order by id,c1 limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.19 rows=10 width=25) (actual time=0.160..0.164 rows=10 loops=1)
-> Incremental Sort (cost=0.46..65542.05 rows=899386 width=25) (actual time=0.148..0.150 rows=10 loops=1)
Sort Key: id, c1
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25069.68 rows=899386 width=25) (actual time=0.115..0.119 rows=11 loops=1)
Index Cond: (id > 100000)
Planning Time: 0.408 ms
Execution Time: 0.258 ms
(9 rows)
可以看到即使where條件沒有索引,排序欄位有索引也可以使用增量排序功能,而且效果也還不錯。做了一個過濾操作 Filter: (c1 > 100000)
PG 13 多欄位排序
- 根據id,c1,c2進行排序,一樣可以走增量排序
postgres=# explain analyze select * from test order by id,c1,c2 limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.46..1.16 rows=10 width=25) (actual time=0.175..0.179 rows=10 loops=1)
-> Incremental Sort (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.172..0.174 rows=10 loops=1)
Sort Key: id, c1, c2
Presorted Key: id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.126..0.130 rows=11 loops=1)
Planning Time: 0.485 ms
Execution Time: 0.237 ms
(8 rows)
PG 13 join
- 複製一張test2
postgres=# create table test2 as select * from test;
SELECT 1000000
postgres=# create index i_test2_id on test2(id);
CREATE INDEX
- join連表查詢,並且排序欄位test.id,test.c1
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.93..3.04 rows=10 width=50) (actual time=0.089..0.092 rows=10 loops=1)
-> Incremental Sort (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.087..0.089 rows=10 loops=1)
Sort Key: test.id, test.c1
Presorted Key: test.id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.044..0.068 rows=11 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.022..0.036 rows=11 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.014..0.018 rows=11 loops=1)
Planning Time: 0.599 ms
Execution Time: 0.174 ms
(11 rows)
postgres=# set enable_incremental_sort=off ;
SET
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=87347.97..87347.99 rows=10 width=50) (actual time=1964.394..1964.407 rows=10 loops=1)
-> Sort (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1964.391..1964.402 rows=10 loops=1)
Sort Key: test.id, test.c1
Sort Method: top-N heapsort Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.070..1690.949 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.042..571.732 rows=1000000 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.017..585.722 rows=1000000 loops=1)
Planning Time: 1.292 ms
Execution Time: 1964.517 ms
(10 rows)
join後排序也可以走增量排序,使用增量排序耗時:0.174 ms,而關閉增量後耗時1964.517 ms
- 如果join後排序的欄位來自不同的表test.id,test2.c1
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.93..3.04 rows=10 width=50) (actual time=0.151..0.155 rows=10 loops=1)
-> Incremental Sort (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.149..0.151 rows=10 loops=1)
Sort Key: test.id, test2.c1
Presorted Key: test.id
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.075..0.088 rows=11 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.040..0.044 rows=11 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.025..0.028 rows=11 loops=1)
Planning Time: 0.778 ms
Execution Time: 0.230 ms
(11 rows)
postgres=# set enable_incremental_sort=off ;
SET
postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=87347.97..87347.99 rows=10 width=50) (actual time=1493.513..1493.519 rows=10 loops=1)
-> Sort (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1493.510..1493.513 rows=10 loops=1)
Sort Key: test.id, test2.c1
Sort Method: top-N heapsort Memory: 26kB
-> Merge Join (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.065..1228.403 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..318.044 rows=1000000 loops=1)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..390.231 rows=1000000 loops=1)
Planning Time: 0.761 ms
Execution Time: 1493.685 ms
(10 rows)
join後排序的欄位來自不同的表test.id,test2.c1,也可以走增量排序,開啟增量耗時:0.230,關閉後耗時:1493.685 ms
來看看一個比較慢的SQL:
- 這個SQL兩表關聯,而且使用了c2=2這一列全部為2,並且使用offset 100000
postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11325.58..11326.72 rows=10 width=50) (actual time=198.125..198.131 rows=10 loops=1)
-> Incremental Sort (cost=2.02..113237.64 rows=1000000 width=50) (actual time=0.127..193.661 rows=100010 loops=1)
Sort Key: test.id, test2.c1
Presorted Key: test.id
Full-sort Groups: 3126 Sort Method: quicksort Average Memory: 29kB Peak Memory: 29kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.052..152.908 rows=100011 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.026..46.138 rows=100011 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.020..51.088 rows=100011 loops=1)
Planning Time: 0.707 ms
Execution Time: 198.252 ms
(12 rows)
因為增量排序的緣故,查詢還是很快
- 如果我們關閉增量排序功能
postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=156536.56..156536.59 rows=10 width=50) (actual time=2496.085..2496.093 rows=10 loops=1)
-> Sort (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2469.643..2491.429 rows=100010 loops=1)
Sort Key: test.id, test2.c1
Sort Method: external merge Disk: 72432kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.082..1371.433 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.040..433.114 rows=1000000 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.033..401.784 rows=1000000 loops=1)
Planning Time: 0.807 ms
Execution Time: 2530.205 ms
(11 rows)
這個SQL耗時 2530.205 ms,和198.252 ms比增量排序提升還是很明顯
但是我們觀察到上面的SQL中使用id進行關聯,且用id排序的時候查詢效率較高,如果排序的欄位換成crt_time效果如何?
postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.crt_time,test2.c1 limit 10 offset 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=156536.56..156536.59 rows=10 width=50) (actual time=2702.107..2702.133 rows=10 loops=1)
-> Sort (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2667.324..2697.033 rows=100010 loops=1)
Sort Key: test.crt_time, test2.c1
Sort Method: external merge Disk: 72432kB
-> Merge Join (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.161..1524.794 rows=1000000 loops=1)
Merge Cond: (test.id = test2.id)
-> Index Scan using i_test_id on test (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.074..488.803 rows=1000000 loops=1)
Filter: (c2 = 2)
-> Index Scan using i_test2_id on test2 (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.073..487.688 rows=1000000 loops=1)
Planning Time: 1.835 ms
Execution Time: 2746.486 ms
(11 rows)
當join關聯的欄位和order by的欄位不一樣時,雖然order by的欄位有索引但也不能走,如果欄位一致那麼也能利用增量排序。
使用test.crt_time排序和上面關閉增量排序執行計劃一樣
總結
-
增量排序對於單表多欄位排序來說效率還是提升明顯
-
join連表查詢如果關聯的鍵和排序鍵一樣也能走增量排序,如果不一樣則不能走增量排序
參考資料: