Oracle當number型別超過一定長度直方圖限制導致SQL執行計劃錯誤
Oracle當number型別超過一定長度直方圖限制
背景
生產系統,監控巡檢發現某個SQL邏輯讀非常高,透過檢視執行計劃,存在三個執行計劃,最高cost 9w多,較低的兩個位100左右。
經過分析,把具體分析以及發現雖然查詢where條件列擁有直方圖,且存在資料傾斜,但是CBO依然無法判斷資料分佈情況透過
模擬測試如下。
1. 11.2.0.4構造測試環境
1.1 建立測試表,包含兩列,均為number
create table test_hist (id1 number(30) primary key,id2 number(20) not null);
1.2 插入10000條資料,均為15位加1~10000數字構成,也就是兩列的distinct均與行數相同
begin for i in 1..10000 loop insert into test_hist values(111021111112345||i,111021111112345||i); end loop; commit; end; /
1.3 再次插入1w條資料
第一列在上面1w的基礎上遞增,第二列與上面中第一行資料相同,也就是第二列的分佈已經傾斜,1110211111123451值在2w總資料中有10001條。
begin for i in 1..10000 loop insert into test_hist values(1110211111123451000||i,1110211111123451); end loop; commit; end; / SCOTT@honor1 > select count(*) from test_hist; COUNT(*) ---------------------------------------- 20000 SCOTT@honor1 > select count(*) from test_hist where id2=1110211111123451; COUNT(*) ---------------------------------------- 10001
1.4 在第二列建立普通索引
create index idx_test_hist_id2 on test_hist(id2);
1.5 收集統計資訊,不收集直方圖
exec dbms_stats.gather_table_stats(user,'TEST_HIST',method_opt=>'for all columns size 1',estimate_percent=>100,cascade=>true,no_invalidate=>false);
1.6 檢視資料分佈
set lines 200 pages 200 col table_name for a15 col num_distinct for 9999999 col density for 999.9999999 col num_nulls for 9999 col num_buckets for 9999 col low_value for a25 col high_value for a25 select table_name,column_name,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM from user_tab_col_statistics where table_name='TEST_HIST'; TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM ---------- -------------------- ------------ ------------------------------ ------------------------- ------------ --------- ----------- --------------- TEST_HIST ID1 20000 C80C0B160C0C0D2334 CC0C0B160C0C0D23340102 .0000500 0 1 NONE TEST_HIST ID2 10000 C80C0B160C0C0D2334 CA0C0B160C0C0D2334 .0001000 0 1 NONE col ENDPOINT_ACTUAL_VALUE for a25 col ENDPOINT_ACTUAL_VALUE_RAW for a25 select * from user_tab_histograms where table_name='TEST_HIST'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU ----------- ---------------------------- ------------------------- -------------------- TEST_HIST ID1 0 1110211111123450 TEST_HIST ID2 0 1110211111123450 TEST_HIST ID1 1 111021111112345000000000 TEST_HIST ID2 1 11102111111234500000
2. 測試SQL語 句
2.1 檢視執行計劃
SCOTT@honor1 > set autotrace traceonly SCOTT@honor1 > select /*+ gather_plan_statistics */ id2 from test_hist where id2=1110211111123451; 10001 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1411376830 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 22 | 1 (0)| 00:00:01 | |* 1 | INDEX RANGE SCAN| IDX_TEST_HIST_ID2 | 2 | 22 | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------
2.2 檢視真實執行計劃
SYS@honor1 > select sql_id,sql_text,child_number,plan_hash_value from v$sql where sql_text like 'select /*+ gather_plan_statistics */ id2 from test_hist%'; SYS@honor1 > select * from table(dbms_xplan.display_cursor('02v238rmgtfg8',0,'allstats last')); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------------- SQL_ID 02v238rmgtfg8, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ id2 from test_hist where id2=1110211111123451 Plan hash value: 1411376830 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 10001 |00:00:00.01 | 696 | 31 | |* 1 | INDEX RANGE SCAN| IDX_TEST_HIST_ID2 | 1 | 2 | 10001 |00:00:00.01 | 696 | 31 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("ID2"=1110211111123451) 19 rows selected.
# 可以看到由於CBO最佳化器不知道ID2存在傾斜,導致estimate-rows與actual-rows存在非常大偏差,導致選擇了錯誤執行計劃,走了索引。
2.3 收集直方圖
# 11g最大直方圖buckets為254,所以按照最大收集直方圖。
exec dbms_stats.gather_table_stats(user,'TEST_HIST',method_opt=>'for all columns size 254', estimate_percent=>100,cascade=>true,no_invalidate=>false); set lines 200 pages 200 col table_name for a15 col num_distinct for 9999999 col density for 999.9999999 col num_nulls for 9999 col num_buckets for 9999 col low_value for a25 col high_value for a25 select table_name,column_name,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM from user_tab_col_statistics where table_name='TEST_HIST' TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM --------------- -------------------- ------------ ------------------------- ------------------------- ------------ --------- ----------- --------------- TEST_HIST ID1 20000 C80C0B160C0C0D2334 CC0C0B160C0C0D23340102 .0000500 0 254 HEIGHT BALANCED TEST_HIST ID2 10000 C80C0B160C0C0D2334 CA0C0B160C0C0D2334 .0000500 0 254 HEIGHT BALANCED SCOTT@honor1 > select table_name,column_name,endpoint_number,to_char(ENDPOINT_VALUE),ENDPOINT_ACTUAL_VALUE from user_tab_histograms where table_name='TEST_HIST' order by column_name,endpoint_number; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER TO_CHAR(ENDPOINT_VALUE) ENDPOINT_ACTUAL_VALU --------------- -------------------- --------------- ---------------------------------------- -------------------- TEST_HIST ID2 126 1110211111123450 TEST_HIST ID2 127 11102111111234500 ...
2.4 透過上述列直方圖統計資訊以及測試可以得出如下情況:
# 取15位後,第十五位四捨五入,其餘位補0後,那麼實際查詢時,傳入值如果為具體值,不會參考具體值,驗證如下:
# 驗證之前ID2列直方圖資訊如下:
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER TO_CHAR(ENDPOINT_VALUE) ------------------------------ ---------------------------- ------------------------ TEST_HIST ID2 0 1110211111123450 TEST_HIST ID2 112 1110211111123450 TEST_HIST ID2 124 1110211111123460 TEST_HIST ID2 125 11102111111234500 TEST_HIST ID2 126 11102111111234500 TEST_HIST ID2 127 11102111111234500 TEST_HIST ID2 128 11102111111234600 TEST_HIST ID2 129 11102111111234600 TEST_HIST ID2 130 11102111111234600 TEST_HIST ID2 131 11102111111234600 TEST_HIST ID2 132 111021111112345000 TEST_HIST ID2 133 111021111112345000 TEST_HIST ID2 134 111021111112345000 TEST_HIST ID2 135 111021111112345000 TEST_HIST ID2 136 111021111112345000 TEST_HIST ID2 137 111021111112345000 TEST_HIST ID2 138 111021111112345000 TEST_HIST ID2 139 111021111112345000 TEST_HIST ID2 140 111021111112345000 TEST_HIST ID2 141 111021111112345000 TEST_HIST ID2 142 111021111112345000 TEST_HIST ID2 143 111021111112346000 TEST_HIST ID2 144 111021111112346000 TEST_HIST ID2 145 111021111112346000 TEST_HIST ID2 146 111021111112346000 TEST_HIST ID2 147 111021111112346000 TEST_HIST ID2 148 111021111112346000 TEST_HIST ID2 149 111021111112346000 TEST_HIST ID2 150 111021111112346000 TEST_HIST ID2 151 111021111112346000 TEST_HIST ID2 152 1110211111123450000 TEST_HIST ID2 153 1110211111123450000 TEST_HIST ID2 154 1110211111123450000 TEST_HIST ID2 155 1110211111123450000 TEST_HIST ID2 156 1110211111123450000 TEST_HIST ID2 157 1110211111123450000 TEST_HIST ID2 158 1110211111123450000 TEST_HIST ID2 159 1110211111123450000 TEST_HIST ID2 160 1110211111123450000 TEST_HIST ID2 161 1110211111123450000 TEST_HIST ID2 162 1110211111123450000 TEST_HIST ID2 163 1110211111123450000 TEST_HIST ID2 164 1110211111123450000 TEST_HIST ID2 165 1110211111123450000 TEST_HIST ID2 166 1110211111123450000 TEST_HIST ID2 167 1110211111123450000 TEST_HIST ID2 168 1110211111123450000 TEST_HIST ID2 169 1110211111123450000 TEST_HIST ID2 170 1110211111123450000 TEST_HIST ID2 171 1110211111123450000 TEST_HIST ID2 172 1110211111123450000 TEST_HIST ID2 173 1110211111123450000 TEST_HIST ID2 174 1110211111123450000 TEST_HIST ID2 175 1110211111123450000 TEST_HIST ID2 176 1110211111123450000 TEST_HIST ID2 177 1110211111123450000 TEST_HIST ID2 178 1110211111123450000 TEST_HIST ID2 179 1110211111123450000 TEST_HIST ID2 180 1110211111123450000 TEST_HIST ID2 181 1110211111123450000 TEST_HIST ID2 182 1110211111123450000 TEST_HIST ID2 183 1110211111123450000 TEST_HIST ID2 184 1110211111123450000 TEST_HIST ID2 185 1110211111123450000 TEST_HIST ID2 186 1110211111123450000 TEST_HIST ID2 187 1110211111123450000 TEST_HIST ID2 188 1110211111123450000 TEST_HIST ID2 189 1110211111123450000 TEST_HIST ID2 190 1110211111123450000 TEST_HIST ID2 191 1110211111123450000 TEST_HIST ID2 192 1110211111123450000 TEST_HIST ID2 193 1110211111123450000 TEST_HIST ID2 194 1110211111123450000 TEST_HIST ID2 195 1110211111123450000 TEST_HIST ID2 196 1110211111123450000 TEST_HIST ID2 197 1110211111123450000 TEST_HIST ID2 198 1110211111123460000 TEST_HIST ID2 199 1110211111123460000 TEST_HIST ID2 200 1110211111123460000 TEST_HIST ID2 201 1110211111123460000 TEST_HIST ID2 202 1110211111123460000 TEST_HIST ID2 203 1110211111123460000 TEST_HIST ID2 204 1110211111123460000 TEST_HIST ID2 205 1110211111123460000 TEST_HIST ID2 206 1110211111123460000 TEST_HIST ID2 207 1110211111123460000 TEST_HIST ID2 208 1110211111123460000 TEST_HIST ID2 209 1110211111123460000 TEST_HIST ID2 210 1110211111123460000 TEST_HIST ID2 211 1110211111123460000 TEST_HIST ID2 212 1110211111123460000 TEST_HIST ID2 213 1110211111123460000 TEST_HIST ID2 214 1110211111123460000 TEST_HIST ID2 215 1110211111123460000 TEST_HIST ID2 216 1110211111123460000 TEST_HIST ID2 217 1110211111123460000 TEST_HIST ID2 218 1110211111123460000 TEST_HIST ID2 219 1110211111123460000 TEST_HIST ID2 220 1110211111123460000 TEST_HIST ID2 221 1110211111123460000 TEST_HIST ID2 222 1110211111123460000 TEST_HIST ID2 223 1110211111123460000 TEST_HIST ID2 224 1110211111123460000 TEST_HIST ID2 225 1110211111123460000 TEST_HIST ID2 226 1110211111123460000 TEST_HIST ID2 227 1110211111123460000 TEST_HIST ID2 228 1110211111123460000 TEST_HIST ID2 229 1110211111123460000 TEST_HIST ID2 230 1110211111123460000 TEST_HIST ID2 231 1110211111123460000 TEST_HIST ID2 232 1110211111123460000 TEST_HIST ID2 233 1110211111123460000 TEST_HIST ID2 234 1110211111123460000 TEST_HIST ID2 235 1110211111123460000 TEST_HIST ID2 236 1110211111123460000 TEST_HIST ID2 237 1110211111123460000 TEST_HIST ID2 238 1110211111123460000 TEST_HIST ID2 239 1110211111123460000 TEST_HIST ID2 240 1110211111123460000 TEST_HIST ID2 241 1110211111123460000 TEST_HIST ID2 242 1110211111123460000 TEST_HIST ID2 243 1110211111123460000 TEST_HIST ID2 244 1110211111123460000 TEST_HIST ID2 245 1110211111123460000 TEST_HIST ID2 246 1110211111123460000 TEST_HIST ID2 247 1110211111123460000 TEST_HIST ID2 248 1110211111123460000 TEST_HIST ID2 249 1110211111123460000 TEST_HIST ID2 250 1110211111123460000 TEST_HIST ID2 251 1110211111123460000 TEST_HIST ID2 252 1110211111123460000 TEST_HIST ID2 253 1110211111123460000 TEST_HIST ID2 254 11102111111234500000
# 使用直方圖顯示的為popular value實際只有6行的1110211111123450值驗證:
# 可以看到執行計劃依然根據popular value公式,走了全表掃描。
# cardinality計算公式如下:
cardinality = NUM_ROWS * selectivity selectivity = (Buckets_this_popular_value) / Buckets_total) * Null_Adjust Null_Adjust = (NUM_ROWS - NUM_NULLS) / NUM_ROWS
註釋:這裡Buckets_this_popular_value為popular value所佔buckets總數,Buckets_Total為buckets總數。
將上述統計資訊代入上述公式
cardinality = 22449 * (112 / 254) * 1 ≈ 9921 符合下面計算公式:
SCOTT@honor1 > select id1,id2 from test_hist where id2=1110211111123450; 6 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3529380458 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9921 | 232K| 27 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_HIST | 9921 | 232K| 27 (0)| 00:00:01 | -------------------------------------------------------------------------------
# 使用真正的popular value但是在直方圖中為non-popular value反而走了不應該走的索引掃描:
SCOTT@honor1 > select id1,id2 from test_hist where id2=1110211111123451; 10007 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2004230999 ------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 24 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST_HIST | 1 | 24 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST_HIST_ID2 | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------
2.5 根據上述直方圖資料分佈以及測試得出如下情況
(1)直方圖收集number型別時,字符集為AL32UTF8時,取number前15位,其餘位有多少位補多少0,第十五位是否出現四捨五入情況,取決於資料分佈:
hight balanced直方圖含義為每個bucket中平均分佈所有not null值,當第十六位為5、6、7、8、9的總和無法放在取前十五位第十六位補0的值所在bucket中時,
那麼會再出現下一個bucket,這個bucket的endpoint_value將會為第十五位加一,這個情況可以透過測試得到驗證。
(2)ID2值1110211111123450從126 bucket開始表示該值為popular value,0~127個bucket的endpoint_value相同。
endpoint_number為0的user_tab_col_statistics行,表示該endpoint_value行數不足一個bucket,endpoint_number為累計值bucket number。
(3)多個endpoint_value相同的為popular value,可以看出當1110211111123451為最小值且行數大於1個bucket後,則bucket分佈如下,不會包含endpoint_number為0的資料分佈:
SCOTT@honor1 > select table_name,column_name,endpoint_number,to_char(ENDPOINT_VALUE),ENDPOINT_ACTUAL_VALUE from user_tab_histograms where table_name='TEST_HIST' order by column_name,endpoint_number; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER TO_CHAR(ENDPOINT_VALUE) ENDPOINT_ACTUAL_VALU --------------- -------------------- --------------- ---------------------------------------- -------------------- TEST_HIST ID2 126 1110211111123450 TEST_HIST ID2 127 11102111111234500
當最小值不滿足一個bucket時,則會有直方圖中endpoint_number為0的行數,但是並不是一個bucket,只是代表該endpoint_value並不滿足一個bucket,如下:
# 下面資料為再次插入了1110211111123450值與1110211111123450~1110211111123459部分值重新收集的統計資訊
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU ------------------------------ -------------------- ---------------------------------------- ---------------------------------------- -------------------- TEST_HIST ID2 0 1110211111123450 TEST_HIST ID2 112 1110211111123450 ...
(4)取15位後,第十五位四捨五入,其餘位補0後,那麼實際查詢時,傳入值如果為具體值,不會參考具體值,上面測試
已經驗證。
3.19c情況會有所改善嗎?
3.1 構造如下資料分佈的測試表
CZH@czhpdb > select count(*) from test_hist; COUNT(*) ---------------------------------------- 27200
CZH@czhpdb > select id2,count(*) from test_hist group by id2 having count(*) > 2 order by 1; ID2 COUNT(*) ---------------------------------------- ---------------------------------------- 1110211111123450 900 1110211111123451 10001 1110211111123452 901 1110211111123453 901 1110211111123454 901 1110211111123455 901 1110211111123456 901 1110211111123457 901 1110211111123458 901 其餘值均為distinct值,均只有1個。
3.2 收集直方圖
CZH@czhpdb > select * from user_tab_histograms where table_name='TEST_HIST' and column_name='ID2' order by endpoint_number; TABLE_NAME COLUM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE ENDPOINT_ACTUAL_VALUE_RAW ENDPOINT_REPEAT_COUNT SCOPE --------------- ----- --------------- ---------------------------------------- ------------------------- ------------------------- --------------------- ------- TEST_HIST ID2 69 1110211111123450 1110211111123450 C80C0B160C0C0D2333 0 SHARED TEST_HIST ID2 838 1110211111123450 1110211111123451 C80C0B160C0C0D2334 0 SHARED TEST_HIST ID2 907 1110211111123450 1110211111123452 C80C0B160C0C0D2335 0 SHARED TEST_HIST ID2 974 1110211111123450 1110211111123453 C80C0B160C0C0D2336 0 SHARED TEST_HIST ID2 1040 1110211111123450 1110211111123454 C80C0B160C0C0D2337 0 SHARED TEST_HIST ID2 1107 1110211111123450 1110211111123455 C80C0B160C0C0D2338 0 SHARED TEST_HIST ID2 1174 1110211111123460 1110211111123456 C80C0B160C0C0D2339 0 SHARED TEST_HIST ID2 1241 1110211111123460 1110211111123457 C80C0B160C0C0D233A 0 SHARED TEST_HIST ID2 1307 1110211111123460 1110211111123458 C80C0B160C0C0D233B 0 SHARED TEST_HIST ID2 1308 11102111111234500 11102111111234510 C9020C030C0C0C182E0B 0 SHARED TEST_HIST ID2 1309 11102111111234500 11102111111234524 C9020C030C0C0C182E19 0 SHARED TEST_HIST ID2 1310 11102111111234500 11102111111234537 C9020C030C0C0C182E26 0 SHARED TEST_HIST ID2 1311 11102111111234600 11102111111234551 C9020C030C0C0C182E34 0 SHARED TEST_HIST ID2 1312 11102111111234600 11102111111234564 C9020C030C0C0C182E41 0 SHARED TEST_HIST ID2 1313 11102111111234600 11102111111234578 C9020C030C0C0C182E4F 0 SHARED TEST_HIST ID2 1314 11102111111234600 11102111111234591 C9020C030C0C0C182E5C 0 SHARED TEST_HIST ID2 1315 111021111112345000 111021111112345105 C90C0B160C0C0D233406 0 SHARED TEST_HIST ID2 1316 111021111112345000 111021111112345118 C90C0B160C0C0D233413 0 SHARED ...
# 省略其餘bucket直到2048
可以看到Oracle 19c對直方圖完善非常明顯,可以執行與11.2.0.4中無法正確選擇執行計劃的SQL,測試如下:
# 分別測試兩個non-popular與 一個真正popular值
CZH@czhpdb > set autot traceonly CZH@czhpdb > select id1,id2 from test_hist where id2=1110211111123451; 10001 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3529380458 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10213 | 229K| 33 (4)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_HIST | 10213 | 229K| 33 (4)| 00:00:01 | -------------------------------------------------------------------------------
CZH@czhpdb > select id1,id2 from test_hist where id2=1110211111123450; 900 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2521703107 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 916 | 21068 | 9 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST_HIST | 916 | 21068 | 9 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST_HIST_ID2 | 916 | | 4 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------
CZH@czhpdb > select id1,id2 from test_hist where id2=1110211111123453; 901 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2521703107 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 890 | 20470 | 9 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST_HIST | 890 | 20470 | 9 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_TEST_HIST_ID2 | 890 | | 4 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------
# 可以看到19c雖然endpoint_value在AL32UTF8字符集下也會將16位變成0,但是會正確顯示ENDPOINT_ACTUAL_VALUE,這就會給CBO提供非常充足資訊
# 可以看到CBO有了充足資訊,可以正確選擇執行計劃,19c相比11.2.0.4進步非常明顯。
4. 在11g中這種情況有辦法改善嗎?
有辦法改善,但是不同情況 有不同辦法,針對本文上面資料情況,可以採用函式索引辦法,如下:
# 可透過substr函式索引改善上文中資料傾斜,直方圖無法解決問題,測試如下:
# s ubstr函式會隱式使用to_char轉換number建立索引,查詢時,需要用單引號括起來,否則會發生隱式轉換,將不走函式索引。
SCOTT@honor1 > create index idx_func_id2 on test_hist(substr(id2,16,5)); SCOTT@honor1 > exec dbms_stats.gather_table_stats(user,'TEST_HIST',method_opt=>'for all columns size 254',estimate_percent=>100,cascade=>true,no_invalidate=>false); SCOTT@honor1 > select id1,id2 from test_hist where substr(id2,16,5)='1'; # 需要單引號括起來,否則不走索引 10001 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3529380458 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9921 | 261K| 23 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_HIST | 9921 | 261K| 23 (0)| 00:00:01 | -------------------------------------------------------------------------------
SCOTT@honor1 > select id1,id2 from test_hist where substr(id2,16,5)='2'; Execution Plan ---------------------------------------------------------- Plan hash value: 3835702174 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 27 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST_HIST | 1 | 27 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_FUNC_ID2 | 1 | | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------------
# 可以看到已經可以正確選擇執行計劃。
# 這種方法僅適用於資料前多少位完全一致,如果資料所有位均有可能發生變化,這種情況比較難以穩定執行計劃。
綜上:
(1)11.2.0.4 number取前15,其餘位根據情況補0.
(2)19c中bucket數量擴充套件至2048,已經可以正確處理上文中資料傾斜情況。
(3)所以直方圖並不是因為傾斜才需要收集,而是直方圖能真實反應列統計資訊才可以收集,才真正有幫助,上面例子就是列存在傾斜,但是由於直方圖11.2.0.4直方圖限制
導致無法反映真實統計資訊,導致傾斜值存在大量行數,但是Oracle會認為是non popular值,cardinality會計算很低,導致走索引掃描或者由於計算cardinality較低,會導致一旦跟其他表關聯時
走錯誤nest loop,將造成SQL執行效率低下。
直方圖其他情況(11.2.0.4為例):
詳情參見文章:
1、如果目標列的distinct值的數量和目標表的資料量相同,即使該目標列在SYS.COL_USAGE$中有使用記錄,
Oracle在自動收集直方圖統計資訊的時候也不會對該列收集直方圖統計資訊;
2、在手工收集直方圖統計資訊的時候,如果我手工指定的bucket的數量等於目標列的distinct值的數量,
且這個值是小於等於254的話,那麼Oracle此時收集的直方圖統計資訊的型別應該是FREQUENCY——這個結論成立的前提條件是該列的資料分佈是傾斜的。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/31439444/viewspace-2747700/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- autotrace 和explain plan for可能導致執行計劃錯誤AI
- 執行計劃錯誤導致系統負載高負載
- 【效能優化】執行計劃與直方圖優化直方圖
- 統計資訊不正確導致執行計劃的錯誤選擇
- 11G的SORT GROUP BY NOSORT導致錯誤執行計劃
- Mysql索引型別建立錯誤導致SQL查詢緩慢MySql索引型別
- 採用直方圖改善SQL執行效能直方圖SQL
- MySQL 因資料型別轉換導致執行計劃使用低效索引MySql資料型別索引
- 優化由直方圖資訊導致的sql效能問題優化直方圖SQL
- Oracle sql執行計劃OracleSQL
- sql最佳化-錯誤強制型別轉換導致索引失效SQL型別索引
- 一次ORACLE SQL謂詞跨界導致的執行計劃不準OracleSQL
- Oracle直方圖 32位元組限制Oracle直方圖
- CHAR型別函式索引導致結果錯誤型別函式索引
- 非索引列直方圖的丟失導致sql效能急劇下降索引直方圖SQL
- 統計資訊過舊導致SQL無法執行出來SQL
- Oracle資料遷移後由列的直方圖統計資訊引起的執行計劃異常Oracle直方圖
- 索引失效系列——統計量過期引起執行計劃錯誤索引
- Grant許可權導致執行計劃失效
- ORACLE analyse table方式收集表統計資訊導致SQL執行計劃不準確而效能下降OracleSQL
- oracle 9i 獲取sql執行計劃(書寫長的sql)OracleSQL
- oracle 直方圖Oracle直方圖
- 交流(1)-- 執行計劃錯誤問題
- Oracle中檢視已執行sql的執行計劃OracleSQL
- 完美的執行計劃導致的效能問題
- 執行計劃的偏差導致的效能問題
- 由於統計量失真造成SQL執行計劃錯誤一例SQL
- Oracle 獲取SQL執行計劃方法OracleSQL
- Oracle 檢視SQL的執行計劃OracleSQL
- Oracle手動固定SQL執行計劃OracleSQL
- oracle統計資訊和直方圖Oracle直方圖
- sql 執行計劃SQL
- 關於mysql varchar 型別的最大長度限制MySql型別
- MySQL5.6執行計劃錯誤案例分析MySql
- oracle直方圖使用Oracle直方圖
- Oracle直方圖解析Oracle直方圖圖解
- date列統計資訊陳舊導致sql沒有選擇最優執行計劃SQL
- 動態建立 @ViewChild 導致執行時錯誤的原因分析View