col_usage$與直方圖的收集

myownstars發表於2013-06-11

DB11202,兩節點RAC

 

建表指令碼,id列分佈極度不均

create table t1(id number(10), name varchar2(20)) nologging tablespace IBMDBA

 begin

  for i in 1.. 99999 loop

    insert /*+ append */ into t1 values(999,lpad('I',20));

  end loop;

  for i in 1.. 100 loop

    insert /*+ append */into t1 values(i,lpad('I',20));

  end loop;

  commit;

end;

/

 

收集統計資訊指令碼,如無特殊說明,以下試驗中均用此引數收集統計資訊

exec dbms_stats.gather_table_stats('SYS','T1');

SQL> select dbms_stats.get_prefs('method_opt','SYS','T1') as pref from dual;

 

PREF

--------------------------------------------------------------------------------

FOR ALL COLUMNS SIZE AUTO

SQL> select dbms_stats.get_prefs('estimate_percent','SYS','T1') from dual;

 

DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT','SYS','T1')

--------------------------------------------------------------------------------

DBMS_STATS.AUTO_SAMPLE_SIZE

 

新建表後的資訊,其統計資訊和直方圖均無

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANAL SAMPLE_SIZE HISTOGRAM

----- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- --------- ----------- ----------

NAME                                                                                                                                              NONE

ID                                                                                                                                                NONE

SQL> select num_rows,sample_size,last_analyzed from dba_tables where table_name='T1' and wner='SYS';

 

  NUM_ROWS SAMPLE_SIZE LAST_ANALYZED

---------- ----------- -------------------

 

 

以上是建立後的基本資訊,統計資訊為空;

 

 

1

沒有執行任何查詢,收集統計資訊

SQL> select num_rows,sample_size,last_analyzed from dba_tables where table_name='T1' and wner='SYS';

 

  NUM_ROWS SAMPLE_SIZE LAST_ANALYZED

---------- ----------- -------------------

100099      100099 2013-03-21 23:22:14

SQL> select COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from dba_tab_histograms where TABLE_NAME='T1' and wner='SYS' order by 1;

 

COLUM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU

----- --------------- -------------- --------------------

ID                  1            999

ID                  0              1

NAME                1     1.6681E+35

NAME                0     1.6681E+35

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

----- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME             1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049          1           1 2013-03-21 23:22:14      100099 NONE

ID             101 C102                                     C20A64                                    .00990099           1 2013-03-21 23:22:14      100099 NONE

 

SQL> select * from col_usage$ where OBJ#=(select object_id from dba_objects where object_name ='T1' and wner='SYS');

 

no rows selected

 

沒有收集直方圖,col_usage$也沒有記錄任何資訊

 

 

2

執行以表列為where條件的sql(必須為硬解析),再次收集統計資訊

 

SQL>  select count(*) from t1 where name='I';

 

  COUNT(*)

----------

         0

收集統計資訊

SQL>  select * from col_usage$ where OBJ#=(select object_id from dba_objects where object_name ='T1' and wner='SYS');

 

      OBJ#    INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP

---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- -------------------

    132928          2              1              0                 0           0          0          0 2013-03-21 23:24:54

SQL> select num_rows,sample_size,last_analyzed from dba_tables where table_name='T1' and wner='SYS';

 

  NUM_ROWS SAMPLE_SIZE LAST_ANALYZED

---------- ----------- -------------------

100099      100099 2013-03-21 23:24:54

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

----- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME             1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049 5.1103E-06           1 2013-03-21 23:24:54        5376 FREQUENCY

ID             101 C102                                     C20A64                                    .00990099           1 2013-03-21 23:24:54      100099 NONE

 

SQL> select COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from dba_tab_histograms where TABLE_NAME='T1' and wner='SYS' order by 1;

 

COLUM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU

----- --------------- -------------- --------------------

ID                  0              1

ID                  1            999

NAME             5376     1.6681E+35

 

 

針對id列進行查詢

SQL> select count(*) from t1 where id=999;

 

  COUNT(*)

----------

     99999

收集統計資訊

SQL> select num_rows,sample_size,last_analyzed from dba_tables where table_name='T1' and wner='SYS';

 

  NUM_ROWS SAMPLE_SIZE LAST_ANALYZED

---------- ----------- -------------------

    100099      100099 2013-03-21 23:28:02

 

SQL> select * from col_usage$ where OBJ#=132928;

 

      OBJ#    INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP

---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- -------------------

    132928          1              1              0                 0           0          0          0 2013-03-21 23:28:02

    132928          2              1              0                 0           0          0          0 2013-03-21 23:24:54

 

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

----- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME             1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049 4.9842E-06           1 2013-03-21 23:28:02        5512 FREQUENCY

ID             101 C102                                     C20A64                                   4.9842E-06           3 2013-03-21 23:28:02        5512 FREQUENCY

SQL> select COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from dba_tab_histograms where TABLE_NAME='T1' and wner='SYS' order by 1;

 

COLUM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU

----- --------------- -------------- --------------------

ID                  1             15

ID                  2             31

ID               5512            999

NAME             5512     1.6681E+35

 

id列也收集了直方圖,bucket3個;

結論:

column出現在where子句時,其會被col_usage$記錄並在下次收集統計資訊時嘗試收集其直方圖;

 

 

3

Col_usage$的作用

修改一下案例2

執行select count(*) from t1 where id=999;

收集統計資訊

刪除col_usage$中的記錄

重新收集統計資訊

SQL>  select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUMN_NAM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

---------- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME                  1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049          1           1 2013-03-22 02:34:17      100099 NONE

ID                  101 C102                                     C20A64                                    .00990099           1 2013-03-22 02:34:17      100099 NONE

此時id列沒有直方圖,因為col_usage$沒有相應記錄的緣故

 

注:不要手工更改col_usage$,否則很容易產生不可預知的問題

 

 

4

收集直方圖型別

重建該表,其id列有101distinct值,在不執行任何sql的情況下收集統計資訊

SQL> select * from col_usage$ where OBJ#=(select object_id from dba_objects where object_name ='T1' and wner='SYS');

 

no rows selected

 

exec dbms_stats.gather_table_stats('SYS','T1',method_opt => 'for all columns size 100');

其顯示如下

 

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUMN_NAM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

---------- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME                  1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049 4.9969E-06           1 2013-03-22 03:05:44        5498 FREQUENCY

ID                  101 C102                                     C20A64                                   9.9938E-06         100 2013-03-22 03:05:44        5498 HEIGHT BAL

                                                                                                                                                                 ANCED

 

bucket調大至101

exec dbms_stats.gather_table_stats('SYS','T1',method_opt => 'for all columns size 101');

反覆執行多次,idbucket數量會波動

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUMN_NAM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

---------- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME                  1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049 4.9208E-06           1 2013-03-22 03:06:13        5583 FREQUENCY

ID                  101 C102                                     C20A64                                   4.9208E-06           9 2013-03-22 03:06:13        5583 FREQUENCY

SQL> select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_BUCKETS, LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM from dba_tab_columns where table_name='T1' and column_name in('ID','NAME');

 

COLUMN_NAM NUM_DISTINCT LOW_VALUE                                HIGH_VALUE                                  DENSITY NUM_BUCKETS LAST_ANALYZED       SAMPLE_SIZE HISTOGRAM

---------- ------------ ---------------------------------------- ---------------------------------------- ---------- ----------- ------------------- ----------- ----------

NAME                  1 2020202020202020202020202020202020202049 2020202020202020202020202020202020202049 4.9923E-06           1 2013-03-22 03:05:21        5503 FREQUENCY

ID                  101 C102                                     C20A64                                   4.9923E-06           7 2013-03-22 03:05:21        5503 FREQUENCY

 

 

當列的distinct數量小於或等於指定的bucket數目時,會採用等頻直方圖,否則採用使用高度直方圖

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-763737/,如需轉載,請註明出處,否則將追究法律責任。

相關文章