oracle 一個or語句因欄位資料分佈不均並缺少直方圖引起的全表掃描優化分析
以下SQL 走了全表掃描,效率下降,而SQL中謂詞欄位選擇性非常低,通過直方圖,並從btree轉bitmap後效能提供,於是對此過程進行分析。
Select Count(*) From pmc.DesignXXXXX t Where 1=1 and OrganId='C00000220'And CategoryCode=2 and IsEnable=1 and isdelete=0 or (PublicStatus=1 and isdelete=0 ); COUNT(*) ---------- 1845
較差的執行計劃:通過掃描表方式,邏輯讀需要844525:
=====================================================
Execution Plan ---------------------------------------------------------- Plan hash value: 527126818 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1| 19 | 229K (1)| 00:45:58 | | 1 | SORT AGGREGATE | | 1| 19 | | | |* 2 | TABLE ACCESS FULL| DESIGNXXXXX | 4744K| 85M| 229K (1)| 00:45:58 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("ISDELETE"=0 AND ("PUBLICSTATUS"=1 OR "ORGANID"='C00000220' AND "CATEGORYCODE"=2 AND "ISENABLE"=1)) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 844525 consistent gets 842418 physical reads 0 redo size 527 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
該SQL是如何選擇的執行計劃(通過10053進行追蹤):
oracle進行了次以下幾種方式的cost 比較:
1.評估通過全表掃描需要的cost是229760.92.
Access Path: TableScan Cost: 229760.92 Resp: 229760.92 Degree: 0 Cost_io: 229075.00 Cost_cpu: 25302994949 Resp_io: 229075.00 Resp_cpu: 25302994949
2.評估通過點陣圖索引的方式cost是741028,這裡是已經同時用bitmap方式將or兩邊進行聯結的消耗。
****** trying bitmap/domain indexes ******
....
Bitmap nodes:
Used IND_DESIGNXXXXX_ISENABLE_ORG
Cost = 35.099036, sel = 0.000494
Used IND_DESIGNXXXXX_CATEGORYCODE
Cost = 1281.621955, sel = 0.034894
Bitmap nodes:
Used IND_PUBLICSTATUS
Cost = 17275.447942, sel = 0.471383
Used bitmap node
Bitmap nodes:
Used bitmap node
Access path: Bitmap index - accepted
Cost: 741028.481879 Cost_io: 740534.527080 Cost_cpu: 18221443693.247154 Sel: 0.471392
因為該語句中存在or ,即分別計算or左右的訪問路徑消耗,再來進行組合。
3.or右邊通過IND_PUBLICSTATUS索引範圍掃描 cost是429957
Access Path: index (AllEqRange)
Index: IND_PUBLICSTATUS
resc_io: 429587.00 resc_cpu: 13681713060
ix_sel: 0.477347 ix_sel_with_filters: 0.477347
Cost: 429957.89 Resp: 429957.89 Degree: 1
4.or左邊分別計算使用以下索引的的消耗
1)DESIGNXXXXX_TIME_ORGANID的消耗是88778。
Access Path: index (SkipScan)
Index: DESIGNXXXXX_TIME_ORGANID
resc_io: 88761.00 resc_cpu: 643271006
ix_sel: 0.000509 ix_sel_with_filters: 0.000509
Cost: 88778.44 Resp: 88778.44 Degree: 1
2)IND_DESIGNXXXXX_CATEGORYCODE的消耗是32961.
Access Path: index (AllEqRange)
Index: IND_DESIGNXXXXX_CATEGORYCODE
resc_io: 32934.00 resc_cpu: 1020893102
ix_sel: 0.036885 ix_sel_with_filters: 0.036885
Cost: 32961.67 Resp: 32961.67 Degree: 1
ColGroup Usage:: PredCnt: 2 Matches Full: #2 Partial: Sel: 0.0005
ColGroup Usage:: PredCnt: 2 Matches Full: #2 Partial: Sel: 0.0005
3)IND_DESIGNXXXXX_CATEGORYCODE的消耗是32961.
Access Path: index (AllEqRange)
Index: IND_DESIGNXXXXX_ISENABLE_ORG
resc_io: 6499.00 resc_cpu: 57845156
ix_sel: 0.000494 ix_sel_with_filters: 0.000494
Cost: 6500.57 Resp: 6500.57 Degree: 1
4)單獨 IND_DESIGNXXXXX_ISENABLE_ORG和IND_DESIGNXXXXX_CATEGORYCODE轉bitmap 的消耗是1406。
Bitmap nodes:
Used IND_DESIGNXXXXX_ISENABLE_ORG
Cost = 35.099036, sel = 0.000494
Used IND_DESIGNXXXXX_CATEGORYCODE
Cost = 1281.621955, sel = 0.034894
Access path: Bitmap index - accepted
Cost: 1406.374238 Cost_io: 1399.626467 Cost_cpu: 248917754.369408 Sel: 0.000017
這裡需要注意的是將or左右兩邊分別拿出來計算,最終合併需要統計計算兩邊的消耗,因此以上的所有消耗評估是:
全表掃描(Cost: 229760.92)< IND_PUBLICSTATUS索引(Cost: 429957.89)+任意左邊任意一種訪問路徑方式 <兩邊直接轉點陣圖聯結的方式(Cost: 741028)
於是自然而然選擇了全表掃描:
Final cost for query block SEL$1 (#0) - All Rows Plan:
Best join order: 1
Cost: 229760.9246 Degree: 1 Card: 1845.0000 Bytes: 35055
Resc: 229760.9246 Resc_io: 229075.0000 Resc_cpu: 25302994949
Resp: 229760.9246 Resp_io: 229075.0000 Resc_cpu: 25302994949
我們要知道以上都只是oracle CBO評估的結果,而在日常應用中CBO如果獲取的表資訊不夠準確便為導致評估結果不一定是正確,而我們有時無法控制的是SQL每次硬解析時獲取資訊是否足夠準確,這也是因此偶爾會出現執行計劃突變的狀況。
以上SQL 通過收集直方圖後便可暫時得到解決。
這是收集直方圖後,較優的執行計劃:分別通過btree索引轉成BITMAP索引方式,邏輯讀需要 2196
================================================================
Execution Plan ---------------------------------------------------------- Plan hash value: 4067119963 -------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 19 | 647 (1)| 00:00:08 | | 1 | SORT AGGREGATE | | 1 | 19 | | | |* 2 | TABLE ACCESS BY INDEX ROWID | DESIGNXXXXX | 1901 | 36119 | 647 (1)| 00:00:08 | | 3 | BITMAP CONVERSION TO ROWIDS | | | | | | | 4 | BITMAP OR | | | | | | | 5 | BITMAP CONVERSION FROM ROWIDS | | | | | | |* 6 | INDEX RANGE SCAN | IND_PUBLICSTATUS | | | 6 (0)| 00:00:01 | | 7 | BITMAP AND | | | | | | | 8 | BITMAP CONVERSION FROM ROWIDS| | | | | | |* 9 | INDEX RANGE SCAN | IND_DESIGNXXXXx_ISENABLE_ORG | | | 3 (0)| 00:00:01 | | 10 | BITMAP CONVERSION FROM ROWIDS| | | | | | |* 11 | INDEX RANGE SCAN | IND_DESIGNXXXXXX_CATEGORYCODE | | | 102 (0)| 00:00:02 | -------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("ISDELETE"=0) 6 - access("PUBLICSTATUS"=1) 9 - access("ISENABLE"=1 AND "ORGANID"='C00000220') 11 - access("CATEGORYCODE"=2) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2196 consistent gets 0 physical reads 0 redo size 527 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
以上 BITMAP CONVERSION的順序過程:
步驟1.sql通過IND_PUBLICSTATUS索引到表中獲取符合條件的行,然後從獲取的行中的rowid轉換成bitmap,這一步是BITMAP CONVERSION FROM ROWIDS。
步驟2.sql通過IND_DESIGNXXXXX_CATEGORYCODE索引到表中獲取符合條件的行,然後同樣從獲取的行中的rowid轉換成bitmap,這一步是BITMAP CONVERSION FROM ROWIDS。
步驟3.sql通過IND_DESIGNXXXXX_ISENABLE_ORG索引到表中獲取符合條件的行,然後同樣從獲取的行中的rowid轉換成bitmap,這一步是BITMAP CONVERSION FROM ROWIDS。
步驟4.sql 將步驟2和步驟3所得bitmap資料通過BITMAP AND 方式取交集。
步驟5.sql 將步驟1所得bitmaps資料與步驟4通過BITMAP OR方式取並集。
步驟6.sql 將步驟5最終獲取的並集bitmap資料轉換成ROWIDS,這一步是BITMAP CONVERSION TO ROWIDS。
步驟7.sql 將步驟6獲取的rowid通過回表方式到表中獲取所需要的欄位資料,這一步是ABLE ACCESS BY INDEX ROWID 。
為什麼會這樣:
當對錶中的唯一度不高的列建立了index,oracle就有可能選擇轉為bitmap來執行。檢視sql中where條件後欄位都是選擇性非常的低。
相應欄位選擇性:
COLUMN_NAME NUM_ROWS CARDINALITY SELECTIVITY ------------------------------ ---------- ----------- ----------- ORGANID 21095783 2070 .01 CATEGORYCODE 21095783 29 0 ISENABLE 21095783 2 0 ISDELETE 21095783 2 0 PUBLISHSTATE 21095783 1 0 對應索引: INDEX_NAME INDEX_COL INDEX_TYPE -------------------------------- ---------------------- ---------------------- PMC.IND_DESIGNXXXXX_CATEGORYCODE CATEGORYCODE NORMAL-NONUNIQUE PMC.IND_DESIGNXXXXX_ISENABLE_ORG ISENABLE,ORGANID NORMAL-NONUNIQUE PMC.IND_PUBLICSTATUS PUBLICSTATUS NORMAL-NONUNIQUE
同樣使用10053追蹤增加直方圖後SQL執行,此時CBO為什麼可以選擇到轉點陣圖的執行計劃,發現增加直方圖之後評估消耗只需要647,而在此之前所需消耗要高達741028。
增加直方圖後的評估:
Access path: Bitmap index - accepted
Cost: 647.047103 Cost_io: 646.348285 Cost_cpu: 25778603.541021 Sel: 0.000103
對比未增加直方圖之前的評估:
Access path: Bitmap index - accepted
Cost: 741028.481879 Cost_io: 740534.527080 Cost_cpu: 18221443693.247154 Sel: 0.471392
為什麼收集直方圖後評估的消耗可以這麼低?
在oracle CBO 計算cost主要是IO成本+CPU成本,在計算成本之前,CBO會收集以下統計資訊:
列中不同值的數量也就是NDV
列中的最小值/最大值
列中null值的數量
資料分佈
直方圖資訊(前提是收集直方圖)
對比收集直方圖前後的欄位資訊:
收集直方圖之前的欄位資訊:
Column (#4): ORGANID(
AvgLen: 10 NDV: 2023 Nulls: 4717 Density: 0.000494
Column (#29): CATEGORYCODE(
AvgLen: 2 NDV: 27 Nulls: 1164044 Density: 0.037037 Min: 0 Max: 66
Column (#38): ISENABLE(
AvgLen: 2 NDV: 2 Nulls: 1151627 Density: 0.500000 Min: 0 Max: 1
Column (#14): ISDELETE(
AvgLen: 3 NDV: 2 Nulls: 0 Density: 0.500000 Min: 0 Max: 1
Column (#32): PUBLICSTATUS(
AvgLen: 2 NDV: 2 Nulls: 1151554 Density: 0.500000 Min: 0 Max: 1
收集直方圖之後的欄位資訊:
Single Table Cardinality Estimation for DESIGNXXXXX[T]
Column (#14):
NewDensity:0.041803, OldDensity:0.000000 BktCnt:6033548, PopBktCnt:6033548, PopValCnt:2, NDV:2
Column (#14): ISDELETE(
AvgLen: 3 NDV: 2 Nulls: 0 Density: 0.041803 Min: 0 Max: 1
Histogram: Freq #Bkts: 2 UncompBkts: 6033548 EndPtVals: 2
Column (#4):
NewDensity:0.000185, OldDensity:0.001779 BktCnt:254, PopBktCnt:160, PopValCnt:25, NDV:2027
Column (#4): ORGANID(
AvgLen: 10 NDV: 2027 Nulls: 4830 Density: 0.000185
Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 120
Column (#29):
NewDensity:0.000000, OldDensity:0.000000 BktCnt:5680066, PopBktCnt:5680055, PopValCnt:16, NDV:27
Column (#29): CATEGORYCODE(
AvgLen: 2 NDV: 27 Nulls: 1162620 Density: 0.000000 Min: 0 Max: 66
Histogram: Freq #Bkts: 27 UncompBkts: 5680066 EndPtVals: 27
Column (#38):
NewDensity:0.000943, OldDensity:0.000000 BktCnt:5687407, PopBktCnt:5687407, PopValCnt:2, NDV:2
Column (#38): ISENABLE(
AvgLen: 2 NDV: 2 Nulls: 1150490 Density: 0.000943 Min: 0 Max: 1
Histogram: Freq #Bkts: 2 UncompBkts: 5687407 EndPtVals: 2
ColGroup (#2, Index) IND_DESIGNXXXXX_ISENABLE_ORG
Col#: 4 38 CorStregth: 2.00
ColGroup (#3, Index) IND_DESIGNXXXXX_AUTHOR_TIME
Col#: 6 7 CorStregth: -1.00
ColGroup (#1, Index) DESIGNXXXXX_TIME_ORGANID
Col#: 4 7 CorStregth: -1.00
ColGroup Usage:: PredCnt: 3 Matches Full: Partial:
Column (#32):
NewDensity:0.000055, OldDensity:0.000000 BktCnt:5688611, PopBktCnt:5688611, PopValCnt:2, NDV:2
Column (#32): PUBLICSTATUS(
AvgLen: 2 NDV: 2 Nulls: 1150387 Density: 0.000055 Min: 0 Max: 1
Histogram: Freq #Bkts: 2 UncompBkts: 5688611 EndPtVals: 2
在沒有收集直方圖之前,發現有部分欄位的Density都是0.5,這個值是從1/NDV(基數)得到的,這是因為CBO有時無法正確的統計到表的資料分佈,但當收集直方圖後該值就改變了,因為在一個表中,不一定所有的資料都能分配平均,直方圖的作用就是能找出這種不平均,
那PUBLICSTATUS欄位來說,我們看到NDV是2,即是說全表之後兩個值,這兩個值是0或1,在沒有收集直方圖之前CBO可能會認為0和1的分佈是各一半,此時他去評估訪問該欄位的路徑可能是全表掃描比較好,
而實際上,表中PUBLICSTATUS=1 的資料量非常少。
sys@LVDB SQL>Select Count(*) From pmc.DesignXXXXX t where PublicStatus=1 and isdelete=0 ;
COUNT(*)
----------
1845
但直到PUBLICSTATUS的資料分佈後,CBO評估通過IND_PUBLICSTATUS索引訪問cost只需要6。這也是為什麼收集直方圖後能更加準確的評估訪問表的消耗了。
Access Path: index (AllEqRange)
Index: IND_PUBLICSTATUS
resc_io: 6.00 resc_cpu: 457729
ix_sel: 0.000112 ix_sel_with_filters: 0.000112
Cost: 6.01 Resp: 6.01 Degree: 0
然後該種0或1的情況選擇了轉換成bitmap索引的模式。
其實如果不選擇btree 轉換bitmap方式,直接使用btree索引回表效率也是沒問題的,只是需要將sql中的or拆成union語句
Execution Plan ---------------------------------------------------------- Plan hash value: 3766559296 ------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 13 | 105 (2)| 00:00:02 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | VIEW | | 2 | 26 | 105 (2)| 00:00:02 | | 3 | SORT UNIQUE | | 2 | 22 | 105 (2)| 00:00:02 | | 4 | UNION-ALL | | | | | | | 5 | SORT AGGREGATE | | 1 | 17 | 9 (12)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID| DESIGXXXXXXX | 1 | 17 | 8 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | IND_DESIGNXXXXXX_ISENABLE_ORG | 6 | | 3 (0)| 00:00:01 | | 8 | SORT AGGREGATE | | 1 | 5 | 96 (2)| 00:00:02 | |* 9 | TABLE ACCESS BY INDEX ROWID| DESIGNXXXXXXX | 1874 | 9370 | 95 (0)| 00:00:02 | |* 10 | INDEX RANGE SCAN | IND_PUBLICSTATUS | 2046 | | 6 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 6 - filter("CATEGORYCODE"=2 AND "ISDELETE"=0) 7 - access("ISENABLE"=1 AND "ORGANID"='C00000281') 9 - filter("ISDELETE"=0) 10 - access("PUBLICSTATUS"=1) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2114 consistent gets 0 physical reads 0 redo size 527 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed
對於開啟直方圖和btree轉Bitma都各自存在某些bug,有時甚至可能引發異常的效能問題,這點是需要重點注意的。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29863023/viewspace-2284608/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 21_Oracle資料庫全表掃描詳解(一)Oracle資料庫
- 24_Oracle資料庫全表掃描詳解(四)_全表掃描生產最佳化案例三則Oracle資料庫
- Oracle優化案例-單表分頁語句的優化(八)Oracle優化
- 23_Oracle資料庫全表掃描詳解(三)Oracle資料庫
- 22_Oracle資料庫全表掃描詳解(二)Oracle資料庫
- 全表掃描和全索引掃描索引
- oracle是如何進行全表掃描的Oracle
- [20210205]警惕toad下優化直方圖相關sql語句.txt優化直方圖SQL
- [20210205]警惕toad下優化直方圖相關sql語句3.txt優化直方圖SQL
- MySQL中的全表掃描和索引樹掃描MySql索引
- SQL優化案例-單表分頁語句的優化(八)SQL優化
- Oracle優化案例-分頁語句返回資料順序不一致(十一)Oracle優化
- 怎麼解決因全表掃描帶來的 Buffer Pool 汙染
- 理解資料庫掃描方法-利用掃描方法對資料儲存進行優化資料庫優化
- 全表掃描和全索引掃描繼續(PG-TiDB)索引TiDB
- Oracle-新增一個欄位並設定日期Oracle
- SQL Server中獲取資料庫名、表名、欄位名和欄位註釋的SQL語句SQLServer資料庫
- python 資料視覺化:直方圖、核密度估計圖、箱線圖、累積分佈函式圖Python視覺化直方圖函式
- Oracle資料遷移後由列的直方圖統計資訊引起的執行計劃異常Oracle直方圖
- 關係型資料庫全表掃描分片詳解資料庫
- mysql資料表按照某個欄位分類輸出MySql
- mongodb資料庫範圍分片資料分佈不均勻MongoDB資料庫
- Sql查詢 一個表中某欄位的資料在另一個表中某欄位中不存在的SQL
- MySQL 更新一個表裡的欄位等於另一個表某欄位的值MySql
- PostgreSQL模擬兩個update語句死鎖-利用掃描方法SQL
- MySQL中修改一個資料庫下包含有某個相同欄位的所有表的欄位長度MySql資料庫
- oracle資料庫檢視鎖表的sql語句整理Oracle資料庫SQL
- sql語句修改欄位型別和增加欄位SQL型別
- sqlserver採集欄位的sql語句SQLServer
- C# 掃描並讀取圖片中的文字C#
- 直方圖均衡化直方圖
- 【TUNE_ORACLE】列出NL(NESTED LOOPS)被驅動表走了全表掃描的SQL參考OracleOOPSQL
- 基於MySQL自增ID欄位增量掃描研究MySql
- 【Oracle】 索引的掃描方式Oracle索引
- 資料庫全表查詢之-分頁查詢優化資料庫優化
- oracle 修改表欄位的長度Oracle
- 資料庫效能優化之冗餘欄位的作用資料庫優化
- 【SQL】16 SQL CREATE INDEX 語句、 撤銷索引、撤銷表以及撤銷資料庫、ALTER TABLE 語句、AUTO INCREMENT 欄位SQLIndex索引資料庫REM