oracle收集直方圖時 method_opt中引數 auto跟skewonly的區別

531968912發表於2016-09-14

我們先來看官方文件中對這兩個引數的解釋
- AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
- SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns
從字面上解讀,AUTO和SKEWONLY的區別就在於收集histograms時ORACLE的選擇條件不同。
AUTO會根據column資料分佈情況以及column的負載情況進行判斷是不是收集這個column的histogram.
而SKEWONLY只會根據column的資料分佈情況決定是否收集histogram。
這裡,我會強調這個“只”字,因為在SKEWONLY時,只要Oracle覺得資料分佈可能不均,就會收集列的柱狀圖。
而AUTO時,即使某一個column它的資料分佈不均,但只要這個column的workload很低,或者說沒有workload,那麼Oracle就不會收集這個column的histogram
Oracle自行判斷的依據就來源於col_usage$字典基表,若表上的某一列曾在硬解析(hard parse)過的SQL語句中充當過predicate(通俗的說就是where後的condition)的話,我們認為此列上有收集柱狀圖的必要,那麼col_usage$上就會被加入該列曾充當predicate的記錄。當DBMS_STATS.GATHER_TABLE_STATS儲存過程以’SIZE AUTO’模式執行時,收集程式會檢查col_usage$基表以判斷哪些列之前曾充當過predicate,若充當過則說明該列有收集柱狀圖的價值。
現在我們就來進行測試
06:34:25 > create table test2 as select * from dba_objects;

Table created.

Elapsed: 00:00:01.12
07:14:17 > select count(*) from dba_objects;

     COUNT(*)
-------------
        86277

Elapsed: 00:00:00.18
07:14:30 > update test2 set object_id=100 where rownum<=86200;

86200 rows updated.

Elapsed: 00:00:05.33
07:15:23 > commit;

Commit complete.
07:15:26 > create index idx_test2 on test2(object_id);

Index created.

我們此時檢視test2這張表object_id列的使用情況
07:35:25 > select * from sys.COL_USAGE$ where OBJ# in (select object_id from dba_objects where object_name='TEST2');

no rows selected

先來看此時表TEST2的OBJECT_ID列是否有柱狀圖
select ENDPOINT_NUMBER,ENDPOINT_VALUE from user_histograms where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';
no rows selected

select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      NONE

此時我們以auto的方式來收集統計資訊
07:52:22 > exec dbms_stats.gather_table_stats('KIWI','TEST2',method_opt=>'for columns OBJECT_ID size auto');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.56
07:53:28 > select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      NONE

由此可見,雖然此時object_id列上面資料分佈是極不均衡的,但由於該列沒有predicate,所以oracle並不會去收集他的柱狀圖
如果此時我們進行基於object_id的查詢
07:53:35 > select * from test2 where object_id=57089;

07:51:32 > exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO();

PL/SQL procedure successfully completed.

07:57:52 > select * from sys.COL_USAGE$ where OBJ# in (select object_id from dba_objects where object_name='TEST2');

         OBJ#       INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS   RANGE_PREDS    LIKE_PREDS    NULL_PREDS TIMESTAMP
------------- ------------- -------------- -------------- ----------------- ------------- ------------- ------------- -------------------
        87423             4              1              0                 0             0             0             0 2014-12-21 07:57:35
如果此時我們執行
exec dbms_stats.gather_table_stats('KIWI','TEST2',method_opt=>'for columns OBJECT_ID size auto');
select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';
COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      FREQUENCY
可以看到已經生成了基於頻率的直方圖
下面我們刪掉這個直方圖,並且刪除掉sys.COL_USAGE$的列,使用skewonly的方式來生成直方圖
exec dbms_stats.delete_column_stats(user,'TEST2','OBJECT_ID')exec dbms_stats.delete_column_stats(user,'TEST2','OBJECT_ID')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
08:14:52 > select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      NONE

08:16:16 > delete from sys.COL_USAGE$ where OBJ#=87423;

1 row deleted.

Elapsed: 00:00:00.01
08:16:18 > commit;

Commit complete.

Elapsed: 00:00:00.01
08:16:22 > select * from sys.COL_USAGE$ where OBJ# in (select object_id from dba_objects where object_name='TEST2');

no rows selected

刪除這些資訊以後我們再來看使用auto的方式來進行收集
08:15:02 > exec dbms_stats.gather_table_stats('KIWI','TEST2',method_opt=>'for columns OBJECT_ID size auto');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.43
08:17:14 > select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      NONE
是不成功的,現在我們使用skewonly的方式來進行收集
08:17:19 > exec dbms_stats.gather_table_stats('KIWI','TEST2',method_opt=>'for columns OBJECT_ID size skewonly');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.30
08:20:26 >  select COLUMN_NAME,HISTOGRAM from user_tab_columns where TABLE_NAME='TEST2' and COLUMN_NAME='OBJECT_ID';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
OBJECT_ID                      FREQUENCY

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25462274/viewspace-2124976/,如需轉載,請註明出處,否則將追究法律責任。

相關文章