PostgreSQL10.0preview效能增強-hash,nestloopjoin優化(聰明的優化器是這樣的)
標籤
PostgreSQL , 10.0 , nestloop , hash join
背景
兩張表JOIN時,如果內表的JOIN欄位確定是唯一的,那麼在巢狀迴圈時,如果外表有重複值,迴圈過程中,對於內表來說,一個VALUE只需要掃描一次。
hash join同樣適用。
例子
postgres=# create table intbl(id int);
CREATE TABLE
postgres=# create unique index idx_intbl on intbl(id);
CREATE INDEX
postgres=# insert into intbl select generate_series(1,1000000);
INSERT 0 1000000
postgres=# create table out(id int);
CREATE TABLE
postgres=# insert into out select 1 from generate_series(1,1000);
-- 對於外表的1000個1, 內表scan一次命中後,同一個值不需要再次scan內表
INSERT 0 1000
postgres=# set enable_hashjoin =off;
SET
postgres=# set enable_mergejoin =off;
SET
postgres=# set enable_material =off;
SET
9.6
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from out,intbl where out.id=intbl.id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.42..2736.00 rows=1000 width=8) (actual time=0.033..1.965 rows=1000 loops=1)
Output: "out".id, intbl.id
Buffers: shared hit=4005
-> Seq Scan on public."out" (cost=0.00..15.00 rows=1000 width=4) (actual time=0.013..0.101 rows=1000 loops=1)
Output: "out".id
Buffers: shared hit=5
-> Index Only Scan using idx_intbl on public.intbl (cost=0.42..2.71 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=1000)
Output: intbl.id
Index Cond: (intbl.id = "out".id)
Heap Fetches: 1000
Buffers: shared hit=4000
Planning time: 0.109 ms
Execution time: 2.048 ms
(13 rows)
10.0
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from out,intbl where out.id=intbl.id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.42..2202.50 rows=1000 width=8) (actual time=0.035..1.803 rows=1000 loops=1)
Output: "out".id, intbl.id
Inner Unique: true
Buffers: shared hit=4005
-> Seq Scan on public."out" (cost=0.00..15.00 rows=1000 width=4) (actual time=0.013..0.106 rows=1000 loops=1)
Output: "out".id
Buffers: shared hit=5
-> Index Only Scan using idx_intbl on public.intbl (cost=0.42..2.19 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1000)
Output: intbl.id
Index Cond: (intbl.id = "out".id)
Heap Fetches: 1000
Buffers: shared hit=4000
Planning time: 0.122 ms
Execution time: 1.887 ms
(14 rows)
patch如下
Optimize joins when the inner relation can be proven unique.
author Tom Lane <tgl@sss.pgh.pa.us>
Sat, 8 Apr 2017 10:20:03 +0800 (22:20 -0400)
committer Tom Lane <tgl@sss.pgh.pa.us>
Sat, 8 Apr 2017 10:20:13 +0800 (22:20 -0400)
commit 9c7f5229ad68d7e0e4dd149e3f80257893e404d4
tree 0a167d403952550f43941b01b24ed5e7526c5351 tree | snapshot
parent f13a9121f9822eafe05cc3178bf046155a248173 commit | diff
Optimize joins when the inner relation can be proven unique.
If there can certainly be no more than one matching inner row for a given
outer row, then the executor can move on to the next outer row as soon as
it`s found one match; there`s no need to continue scanning the inner
relation for this outer row. This saves useless scanning in nestloop
and hash joins. In merge joins, it offers the opportunity to skip
mark/restore processing, because we know we have not advanced past the
first possible match for the next outer row.
Of course, the devil is in the details: the proof of uniqueness must
depend only on joinquals (not otherquals), and if we want to skip
mergejoin mark/restore then it must depend only on merge clauses.
To avoid adding more planning overhead than absolutely necessary,
the present patch errs in the conservative direction: there are cases
where inner_unique or skip_mark_restore processing could be used, but
it will not do so because it`s not sure that the uniqueness proof
depended only on "safe" clauses. This could be improved later.
David Rowley, reviewed and rather heavily editorialized on by me
Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
這個patch的討論,詳見郵件組,本文末尾URL。
PostgreSQL社群的作風非常嚴謹,一個patch可能在郵件組中討論幾個月甚至幾年,根據大家的意見反覆的修正,patch合併到master已經非常成熟,所以PostgreSQL的穩定性也是遠近聞名的。
參考
相關文章
- 這可能是最好的效能優化教程(三)優化
- 這可能是最好的效能優化教程(一)優化
- 這可能是最好的效能優化教程(二)優化
- PostgreSQL11preview-優化器增強彙總SQLView優化
- 【效能優化】CBO優化器兩個內建的假設優化
- 【前端效能優化】vue效能優化前端優化Vue
- Flutter的效能優化Flutter優化
- CCSpriteBatchNode的優化效能BAT優化
- Android效能優化——效能優化的難題總結Android優化
- 效能優化漫談之七:效能優化的誤區優化
- ⚡️ 省錢 90%!我是這樣優化網站圖片的優化網站
- 效能優化案例-SQL優化優化SQL
- 前端效能優化(JS/CSS優化,SEO優化)前端優化JSCSS
- 前端效能優化之桌面瀏覽器優化策略前端優化瀏覽器
- 前端效能優化——桌面瀏覽器前端優化策略前端優化瀏覽器
- KVM虛擬機器的效能優化虛擬機優化
- 瀏覽器效能優化-渲染效能瀏覽器優化
- Android效能優化----卡頓優化Android優化
- 前端效能優化 --- 圖片優化前端優化
- [效能優化]DateFormatter深度優化探索優化ORM
- MySQL 效能優化之索引優化MySql優化索引
- Web效能優化:圖片優化Web優化
- MySQL 效能優化之SQL優化MySql優化
- 對Hash Join的一次優化優化
- sql server中的hash應用優化SQLServer優化
- 前端效能優化 —— 移動端瀏覽器優化策略前端優化瀏覽器
- 效能優化優化
- 前端效能優化的點前端優化
- iOS 效能優化的探索iOS優化
- PhotoKit相簿的效能優化優化
- 急性者的效能優化優化
- CSS 選擇器效能優化CSS優化
- 【效能優化】ORACLE資料庫效能優化概述優化Oracle資料庫
- 「簡明效能優化」雙端開啟Gzip優化
- 前端效能優化之移動端瀏覽器優化策略前端優化瀏覽器
- Oracle效能優化-SQL優化(案例一)Oracle優化SQL
- Oracle效能優化-SQL優化(案例二)Oracle優化SQL
- Oracle效能優化-SQL優化(案例三)Oracle優化SQL