Oracle 11g新特性:多列統計資訊(MultiColumn Statistics)

shytodear發表於2015-05-15

where子句後指定了一個表的多個列條件時,最佳化器通常會將多個列的選擇性(selectivity)相乘得到where語句的選擇性,導致最佳化器做出錯誤判斷!

Oracle 11g引入了多列統計資訊概念,如果上面情況列關聯性很好,我們可以做多列統計資訊收集,讓最佳化器做出正確判斷。

 

例如:

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA';

 

  COUNT(*)

----------

      3341

 

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

 

  COUNT(*)

----------

      3341

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52775;

 

  COUNT(*)

----------

          0

可以看出表customers的列CUST_STATE_PROVINCEcountry_id是有關係的,只有多列統計資訊才能讓最佳化器知道他們的關係,從而有一個更精確的選擇性(selectiveity)。

 

 

1、建立Column Group

DECLARE

  cg_name varchar2(30);

BEGIN

  cg_name := dbms_stats.create_extended_stats(null,'customers', 

             '(cust_state_province,country_id)');

END;

/

 

2、檢視column group name:

select sys.dbms_stats.show_extended_stats_name('sh','customers',

       '(cust_state_province,country_id)') col_group_name

from dual;

 

COL_GROUP_NAME

--------------------------------------------------------------------------------

SYS_STU#S#WF25Z#QAHIHE#MOFFMM_

 

3、刪除Column Group

exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');

 

4、監控Column Groups

--查詢多列統計資訊

Select extension_name, extension

from user_stat_extensions

where table_name='CUSTOMERS';

 

EXTENSION_NAME                EXTENSION

------------------------------ ----------------------------------------

SYS_STU#S#WF25Z#QAHIHE#MOFFMM_   ("CUST_STATE_PROVINCE","COUNTRY_ID")

 

--檢視distinct數和柱狀圖使用情況

select e.extension col_group, t.num_distinct, t.histogram

  from user_stat_extensions e, user_tab_col_statistics t

 where e.extension_name = t.column_name

   and e.table_name = t.table_name

   and t.table_name = 'CUSTOMERS';

COL_GROUP                              NUM_DISTINCT  HISTOGRAM

---------------------------------------- ------------ ---------------

("CUST_STATE_PROVINCE","COUNTRY_ID")           145  FREQUENCY

 

5、收集多列統計資訊

1)收集已存在的列組統計資訊

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE AUTO');

 

2)收集新指定的列組統計資訊

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

 

 

 

實驗:

1)當不使用多列統計資訊時,真實結果是3341,執行計劃是1107.

SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');

 

PL/SQL procedure successfully completed.

 

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

 

  COUNT(*)

----------

      3341

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 296924608

 

--------------------------------------------------------------------------------

| Id  | Operation      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |        |     1 |    16 |   406      (1)| 00:00:05 |

|   1 |  SORT AGGREGATE    |               |     1 |    16 |      |              |

|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  1107 | 17712 |   406 (1)| 00:00:05 |

--------------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

 

Statistics

----------------------------------------------------------

         120  recursive calls

           0  db block gets

       1588  consistent gets

       1454  physical reads

           0  redo size

         423  bytes sent via SQL*Net to client

         420  bytes received via SQL*Net from client

           2  SQL*Net roundtrips to/from client

           7  sorts (memory)

           0  sorts (disk)

           1  rows processed

 

2)當使用多列統計資訊時,真實結果是3341,執行計劃是3294.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

 

PL/SQL procedure successfully completed.

 

SQL>

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

 

  COUNT(*)

----------

      3341

 

 

Execution Plan

----------------------------------------------------------

Plan hash value: 296924608

 

--------------------------------------------------------------------------------

| Id  | Operation      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |        |     1 |    16 |   406      (1)| 00:00:05 |

|   1 |  SORT AGGREGATE    |               |     1 |    16 |      |              |

|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  3294 | 52704 |   406        (1)| 00:00:05 |

--------------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

 

 

Statistics

----------------------------------------------------------

           8  recursive calls

           0  db block gets

       1458  consistent gets

       1454  physical reads

           0  redo size

         423  bytes sent via SQL*Net to client

         420  bytes received via SQL*Net from client

           2  SQL*Net roundtrips to/from client

           0  sorts (memory)

           0  sorts (disk)

           1  rows processed

3)即以上情況,使用多列統計資訊能讓最佳化器得到更準確的判斷!

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/27126919/viewspace-1655727/,如需轉載,請註明出處,否則將追究法律責任。

相關文章