PostgreSQL DBA(6) - SeqScan vs IndexScan vs Bit...
本節介紹了PostgreSQL中資料表的三種掃描型別,分別是順序掃描SeqScan、索引掃描IndexScan和點陣圖堆掃描BitmapHeapScan。
一、簡介
選擇率=條件過濾後的元組數/條件過濾前的元組數
順序掃描SeqScan
直接對資料表堆資料(Heap Data)進行順序掃描,適用於選擇率較高的場景.
索引掃描IndexScan
透過訪問索引獲得元組位置指標後再訪問堆資料,適用於選擇率較低的場景.
點陣圖堆掃描BitmapHeapScan
點陣圖堆掃描需要首先透過BitmapIndexScan(點陣圖索引掃描)把符合條件的元組所在的Page(Block) ID儲存在Bitmap中,然後再透過Bitmap訪問堆資料,適用於選擇率不高不低的場景,介於上面兩種掃描方式之間.
2018.10.01 修正,索引適用於選擇率低的情況,順序掃描適用於選擇率高的情況
值得注意的地方:
1."選擇率較高"是一種定性的表述,實際上PG是根據Cost計算來確定使用哪種掃描方式.通常情況下,索引掃描主要執行的操作是隨機訪問儲存裝置,在PG的初始化引數配置中,隨機訪問的Cost是4,而順序訪問的Cost是1,很粗略的估算,如果透過索引訪問的Index Blocks + Heap Blocks超過順序訪問的Heap Blocks的1/4,那麼PG會選擇使用順序掃描而不是索引掃描.
2.IndexScan的掃描方式是訪問索引,如符合條件則馬上根據索引中的元組位置指標訪問堆資料從而獲取元組,而BitmapIndexScan(點陣圖索引掃描)是訪問索引,把符合條件的Block ID儲存在Bitmap中,這時候不涉及掃描堆資料,最終獲取元組的操作透過BitmapHeapScan掃描完成.
這兩者的不同,下面這段話總結得非常到位:
A plain Index Scan fetches one tuple-pointer at a time from the index, and immediately visits that tuple in the table. A bitmap scan fetches all the tuple-pointers from the index in one go, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-location order.
下面透過樣例指令碼直觀感受這幾種方式的不同.
測試資料表,t_dwxx,10000行資料,在dwbh上建立PK
testdb=# select count(*) from t_dwxx;
count
-------
10000
(1 row)
二、SeqScan
測試指令碼:
testdb=# explain verbose select t1.* from t_dwxx t1 where dwbh > '1000';
QUERY PLAN
----------------------------------------------------------------------
Seq Scan on public.t_dwxx t1 (cost=0.00..189.00 rows=9999 width=20)
Output: dwmc, dwbh, dwdz
Filter: ((t1.dwbh)::text > '1000'::text)
(3 rows)
查詢條件為dwbh > '1000',選擇率較低,PG選擇了順序掃描SeqScan,成本189.00,該成本如何計算,有興趣的可參照原始碼解讀(53),透過gdb跟蹤分析.
三、IndexScan
測試指令碼:
testdb=# explain verbose select t1.* from t_dwxx t1 where dwbh = '10000';
QUERY PLAN
-------------------------------------------------------------------------------------
Index Scan using t_dwxx_pkey on public.t_dwxx t1 (cost=0.29..8.30 rows=1 width=20)
Output: dwmc, dwbh, dwdz
Index Cond: ((t1.dwbh)::text = '10000'::text)
(3 rows)
查詢條件為dwbh = '10000',選擇率很高,只有1條記錄,選擇索引掃描.
總成本8.30=啟動成本 + 一次Index Block訪問 + 一次Heap Block訪問=0.29 + 4 + 4≈8.30
四、BitmapHeapScan
測試指令碼:
testdb=# explain verbose select t1.* from t_dwxx t1 where dwbh > '1000' and dwbh < '3000';
QUERY PLAN
---------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.t_dwxx t1 (cost=51.07..148.42 rows=2223 width=20)
Output: dwmc, dwbh, dwdz
Recheck Cond: (((t1.dwbh)::text > '1000'::text) AND ((t1.dwbh)::text < '3000'::text))
-> Bitmap Index Scan on t_dwxx_pkey (cost=0.00..50.52 rows=2223 width=0)
Index Cond: (((t1.dwbh)::text > '1000'::text) AND ((t1.dwbh)::text < '3000'::text))
(5 rows)
查詢條件為dwbh > '1000' and dwbh < '3000',選擇率不高不低,PG選擇了BitmapHeapScan,啟動成本為51.07,總成本為148.42,該成本如何計算,後續的原始碼解讀會跟蹤分析.
值得注意的是在BitmapIndexScan後有一步:Recheck,這是因為點陣圖索引掃描只是把Heap Block ID找出來,並沒有把符合條件的元組找出來,因此出現了Recheck這一步.
五、參考資料
PostgreSQL indexing: Index scan vs. Bitmap scan vs. Sequential scan
Bitmap indexes
What is a “Bitmap heap scan” in a query plan?
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6906/viewspace-2374851/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- PostgreSQL DBA(131) - Develop(numeric vs float vs int)SQLdev
- PostgreSQL DBA(175) - Cost EST(SeqScan)SQL
- PostgreSQL DBA(121) - pgAdmin(HA:PAF vs repmgr vs Patroni)SQL
- PostgreSQL DBA(34) - (HashAggregate vs GroupAggregate)SQL
- Oracle vs PostgreSQL DBA(21)- Oracle VPDOracleSQL
- PostgreSQL DBA(89) - Linux(Buffer vs Cache)SQLLinux
- PostgreSQL DBA(169) - Develop(Distinct vs Group by)SQLdev
- PostgreSQL DBA(177) - Serializability Isolation(Index vs NonIndex)SQLIndex
- Oracle vs PostgreSQL DBA(22)- Oracle VPD#2OracleSQL
- PostgreSQL DBA(107) - pgAdmin(Don't do this:NOT IN vs NOT EXISTS)SQL
- Oracle vs PostgreSQL DBA(13)- 拆分(split)分割槽OracleSQL
- PostgreSQL DBA(119) - pgAdmin(LIMIT:Index Scan vs Bitmap Index Scan)SQLMITIndex
- PostgreSQL DBA(146) - pgAdmin(pg_dumpall vs pg_dump)SQL
- PostgreSQL DBA(88) - Linux(CPU使用率 vs 平均負載)SQLLinux負載
- windows vs 編譯postgresqlWindows編譯SQL
- Oracle vs PostgreSQL Develop(27) - PLSQL vs PLPGSQL(輸出引數)OracleSQLdev
- Oracle vs PostgreSQL Develop(17) - ARRAYOracleSQLdev
- Oracle vs PostgreSQL Develop(15) - DISTINCT ONOracleSQLdev
- Oracle vs PostgreSQL Develop(18) - BooleanOracleSQLdevBoolean
- Oracle vs PostgreSQL,研發注意事項(6)- 事務處理OracleSQL
- Oracle vs PostgreSQL Develop(25) - plsql vs plpgsql(語法嚴謹性)OracleSQLdev
- Oracle vs PostgreSQL Develop(29) - COMMIT in PLSQLOracleSQLdevMIT
- Oracle vs PostgreSQL Develop(16) - Prepared StatementOracleSQLdev
- Oracle vs PostgreSQL Develop(20) - Materialized ViewOracleSQLdevZedView
- Oracle vs PostgreSQL Develop(19) - PIPE ROWOracleSQLdev
- Oracle vs PostgreSQL Develop(28) - function overloadOracleSQLdevFunction
- Oracle vs PostgreSQL Develop(26) - ASCII 0OracleSQLdevASCII
- PostgreSQL的MVCC vs InnoDB的MVCCSQLMVC
- Kunlun-Storage vs PostgreSQL OLTP 測試SQL
- Oracle vs PostgreSQL Develop(31) - Index Only ScanOracleSQLdevIndex
- Oracle vs PostgreSQL Develop(24) - subquery with agg functionOracleSQLdevFunction
- Playwright VS Selenium VS Puppeteer VS Cypress
- Oracle vs PostgreSQL Develop(30) - Index&Case whenOracleSQLdevIndex
- vs 2017 vs code
- Airflow vs. Luigi vs. Argo vs. MLFlow vs. KubeFlowAIUIGo
- PostgreSQL DBA(6) - PG 11 New Features#3SQL
- Axum vs Actix vs Rocket
- RDBMS VS XML VS NoSQLXMLSQL