PostgreSQL10.0preview效能增強-hash,nestloopjoin優化(聰明的優化器是這樣的)

德哥發表於2017-04-22

背景

兩張表JOIN時，如果內表的JOIN欄位確定是唯一的，那麼在巢狀迴圈時，如果外表有重複值，迴圈過程中，對於內表來說，一個VALUE只需要掃描一次。

hash join同樣適用。

例子

postgres=# create table intbl(id int);  
CREATE TABLE  
postgres=# create unique index idx_intbl on intbl(id);  
CREATE INDEX  
postgres=# insert into intbl select generate_series(1,1000000);     
INSERT 0 1000000  
postgres=# create table out(id int);  
CREATE TABLE  
postgres=# insert into out select 1 from generate_series(1,1000);   
-- 對於外表的1000個1, 內表scan一次命中後，同一個值不需要再次scan內表  
INSERT 0 1000  
postgres=# set enable_hashjoin =off;  
SET  
postgres=# set enable_mergejoin =off;  
SET  
postgres=# set enable_material =off;  
SET

9.6

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from out,intbl where out.id=intbl.id;  
                                                              QUERY PLAN                                                                
--------------------------------------------------------------------------------------------------------------------------------------  
 Nested Loop  (cost=0.42..2736.00 rows=1000 width=8) (actual time=0.033..1.965 rows=1000 loops=1)  
   Output: "out".id, intbl.id  
   Buffers: shared hit=4005  
   ->  Seq Scan on public."out"  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.013..0.101 rows=1000 loops=1)  
         Output: "out".id  
         Buffers: shared hit=5  
   ->  Index Only Scan using idx_intbl on public.intbl  (cost=0.42..2.71 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=1000)  
         Output: intbl.id  
         Index Cond: (intbl.id = "out".id)  
         Heap Fetches: 1000  
         Buffers: shared hit=4000  
 Planning time: 0.109 ms  
 Execution time: 2.048 ms  
(13 rows)

10.0

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from out,intbl where out.id=intbl.id;  
                                                              QUERY PLAN                                                                
--------------------------------------------------------------------------------------------------------------------------------------  
 Nested Loop  (cost=0.42..2202.50 rows=1000 width=8) (actual time=0.035..1.803 rows=1000 loops=1)  
   Output: "out".id, intbl.id  
   Inner Unique: true  
   Buffers: shared hit=4005  
   ->  Seq Scan on public."out"  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.013..0.106 rows=1000 loops=1)  
         Output: "out".id  
         Buffers: shared hit=5  
   ->  Index Only Scan using idx_intbl on public.intbl  (cost=0.42..2.19 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1000)  
         Output: intbl.id  
         Index Cond: (intbl.id = "out".id)  
         Heap Fetches: 1000  
         Buffers: shared hit=4000  
 Planning time: 0.122 ms  
 Execution time: 1.887 ms  
(14 rows)

patch如下

Optimize joins when the inner relation can be proven unique.  
  
author	Tom Lane <tgl@sss.pgh.pa.us>	  
Sat, 8 Apr 2017 10:20:03 +0800 (22:20 -0400)  
committer	Tom Lane <tgl@sss.pgh.pa.us>	  
Sat, 8 Apr 2017 10:20:13 +0800 (22:20 -0400)  
commit	9c7f5229ad68d7e0e4dd149e3f80257893e404d4  
tree	0a167d403952550f43941b01b24ed5e7526c5351	tree | snapshot  
parent	f13a9121f9822eafe05cc3178bf046155a248173	commit | diff  
Optimize joins when the inner relation can be proven unique.  
  
If there can certainly be no more than one matching inner row for a given  
outer row, then the executor can move on to the next outer row as soon as  
it`s found one match; there`s no need to continue scanning the inner  
relation for this outer row.  This saves useless scanning in nestloop  
and hash joins.  In merge joins, it offers the opportunity to skip  
mark/restore processing, because we know we have not advanced past the  
first possible match for the next outer row.  
  
Of course, the devil is in the details: the proof of uniqueness must  
depend only on joinquals (not otherquals), and if we want to skip  
mergejoin mark/restore then it must depend only on merge clauses.  
To avoid adding more planning overhead than absolutely necessary,  
the present patch errs in the conservative direction: there are cases  
where inner_unique or skip_mark_restore processing could be used, but  
it will not do so because it`s not sure that the uniqueness proof  
depended only on "safe" clauses.  This could be improved later.  
  
David Rowley, reviewed and rather heavily editorialized on by me  
  
Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com

這個patch的討論，詳見郵件組，本文末尾URL。

PostgreSQL社群的作風非常嚴謹，一個patch可能在郵件組中討論幾個月甚至幾年，根據大家的意見反覆的修正，patch合併到master已經非常成熟，所以PostgreSQL的穩定性也是遠近聞名的。

參考

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9c7f5229ad68d7e0e4dd149e3f80257893e404d4

這可能是最好的效能優化教程（三）
2019-02-28
優化
這可能是最好的效能優化教程（一）
2017-09-12
優化
這可能是最好的效能優化教程（二）
2017-09-13
優化
PostgreSQL11preview-優化器增強彙總
2018-07-28
SQLView優化
【效能優化】CBO優化器兩個內建的假設
2010-08-26
優化
【前端效能優化】vue效能優化
2019-04-30
前端優化Vue
Flutter的效能優化
2020-02-21
Flutter優化
CCSpriteBatchNode的優化效能
2013-07-12
BAT優化
Android效能優化——效能優化的難題總結
2021-08-12
Android優化
效能優化漫談之七：效能優化的誤區
2013-10-18
優化
⚡️ 省錢 90%！我是這樣優化網站圖片的
2022-01-13
優化網站
效能優化案例-SQL優化
2016-12-05
優化SQL
前端效能優化（JS/CSS優化，SEO優化）
2018-03-04
前端優化JSCSS
前端效能優化之桌面瀏覽器優化策略
2017-12-29
前端優化瀏覽器
前端效能優化——桌面瀏覽器前端優化策略
2018-01-14
前端優化瀏覽器
KVM虛擬機器的效能優化
2013-10-17
虛擬機優化
瀏覽器效能優化-渲染效能
2017-10-08
瀏覽器優化
Android效能優化----卡頓優化
2020-07-28
Android優化
前端效能優化 --- 圖片優化
2018-08-29
前端優化
[效能優化]DateFormatter深度優化探索
2018-12-20
優化ORM
MySQL 效能優化之索引優化
2015-04-01
MySql優化索引
Web效能優化：圖片優化
2014-12-10
Web優化
MySQL 效能優化之SQL優化
2015-04-08
MySql優化
對Hash Join的一次優化
2010-07-24
優化
sql server中的hash應用優化
2009-10-22
SQLServer優化
前端效能優化 —— 移動端瀏覽器優化策略
2018-03-20
前端優化瀏覽器
效能優化
2019-02-16
優化
前端效能優化的點
2020-10-19
前端優化
iOS 效能優化的探索
2018-03-08
iOS優化
PhotoKit相簿的效能優化
2017-02-12
優化
急性者的效能優化
2010-03-05
優化
CSS 選擇器效能優化
2019-01-14
CSS優化
【效能優化】ORACLE資料庫效能優化概述
2010-10-01
優化Oracle資料庫
「簡明效能優化」雙端開啟Gzip
2019-04-15
優化
前端效能優化之移動端瀏覽器優化策略
2017-12-29
前端優化瀏覽器
Oracle效能優化-SQL優化(案例一)
2020-02-17
Oracle優化SQL
Oracle效能優化-SQL優化(案例二)
2020-02-17
Oracle優化SQL
Oracle效能優化-SQL優化(案例三)
2020-02-18
Oracle優化SQL

PostgreSQL10.0preview效能增強-hash,nestloopjoin優化(聰明的優化器是這樣的)

標籤

背景

參考

相關文章