優化由直方圖資訊導致的sql效能問題

下午在生產庫上捕獲一條sql，查詢了系統的幾個主要大表,cost有34776，並且在某列上使用了繫結變數，需要對其進行優化
sql大致內容如下
select si.id1,si.id2,si.id13
  from (select jusint_it.id1,jusint_it.id2,jusint_it.id3
      from justin, jusint_it
      where justin.id = jusint_it.order_id
         and justin.jusint_id  in (:1)
      union all
      select jusitin_it_hist.id1,jusitin_it_hist.id2,jusitin_it_hist.id3
      from justin_hist, jusitin_it_hist
      where justin_hist.order_id = jusitin_it_hist.order_id
         and justin_hist.jusint_id in (:2)) si
執行計劃
| SELECT STATEMENT                |                            |    |    |  34776 |    |    |
| VIEW                            |                            | 163K| 90M|  34776 |    |    |
|  UNION-ALL                      |                            |    |    |       |    |    |
| HASH JOIN                      |                            | 163K| 10M|  34723 |    |    |
| TABLE ACCESS BY INDEX ROWID    |justin_it                   |    6 |  366 |    3 |    |    |
| NESTED LOOPS                |                            | 125 | 8K|    54 |    |    |
| TABLE ACCESS BY INDEX ROWID |justin_hist                | 23K|  209K| 2053 |    |    |
| TABLE ACCESS FULL             |justin_it_hist             |    5M|  306M|  32607 |    |    |
|    INDEX RANGE SCAN             |idx_justin_order             |    6 |    |    2 |    |    |
|    INDEX RANGE SCAN             |idx_hist_justin_id          | 23K|    | 102 |    |    |
|    TABLE ACCESS BY INDEX ROWID |justin                      | 21 |  189 |    7 |    |    |
|    INDEX RANGE SCAN          |idx_so_justin_id             | 21 |    |    3 |    |    |
------------------------------------------------------------------------------------------------------------

經檢視，繫結變數列有直方圖，資訊如下
SQL> select table_name,num_distinct,density,num_nulls,num_buckets,last_analyzed from user_tab_col_statistics t where t.table_name in ('justin','justin_hist') and t.column_name='justin_id';

TABLE_NAME                   NUM_DISTINCT DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANALYZED
------------------------------ ------------ ---------- ---------- ----------- -------------
justin                               9570 7.90613223    27923       254 2011-1-19 上午
justin_hist                            114  1.83563894    86491       114 2011-1-14 上午
注意density列，當沒有直方圖的時候，density=1/num_distinct;其值越接近0，表示該列的選擇性越高，而這裡兩個值都超過了1

檢視繫結變數justin_id在表justin和justin_hist中的分佈狀況
SQL> select count(*),count(distinct justin_id) from justin_hist;

  COUNT(*) COUNT(DISTINCTjustin_id)
---------- ---------------------------
2887588                      32146

SQL> select justin_id,count(*) from justin_hist group by justin_id having count(*) >1000;

   justin_id COUNT(*)
------------------- ----------
                     99300
               0 2729587

SQL> select count(*),count(distinct justin_id) from justin;

  COUNT(*) COUNT(DISTINCT justin_id)
---------- ---------------------------
239715                      9733

SQL> select justin_id,count(*) from justin group by justin_id having count(*) >1000;

   justin_id COUNT(*)
------------------- ----------
                  27934
            0       198284

可以看到justin_id的分佈情況及其不均勻，其中justin_id=0佔了全部值的80%還要多

SQL> select 2729587/2887588,198284/239715 from dual;

2729587/2887588 198284/239715
--------------- -------------
0.9452827065356 0.82716559247

將繫結變數替換成常量，重新觀察執行計劃
1. Null
----------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Bytes | Cost (%CPU)| Time    |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |                | 754K| 417M| 93401 (2)| 00:18:41 |
| 1 |  VIEW             |                | 754K| 417M| 93401 (2)| 00:18:41 |
| 2 | UNION-ALL       |                |    |    |          |       |
|*  3 | HASH JOIN       |                | 162K| 10M|  9149 (1)| 00:01:50 |
|*  4 |    TABLE ACCESS FULL| justin       | 27923 | 245K|  6767 (1)| 00:01:22 |
| 5 |    TABLE ACCESS FULL| justin_it    | 295K| 17M|  2378 (2)| 00:00:29 |
|*  6 | HASH JOIN       |                | 592K| 38M| 84251 (2)| 00:16:52 |
|*  7 |    TABLE ACCESS FULL| justin_hist    | 86491 | 760K| 51580 (1)| 00:10:19 |
| 8 |    TABLE ACCESS FULL| justin_it_hist  |  5354K| 306M| 32607 (2)| 00:06:32 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("justin"."ID"="justin_it"."ORDER_ID")
4 - filter("justin"."justin_id" IS NULL)
6 - access("justin_hist"."ORDER_ID"="justin_it_hist"."ORDER_ID")
7 - filter("justin_hist"."justin_id" IS NULL)

23 rows selected.

2. 0
------------------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time    |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |                |  5649K|  3124M|    | 115K  (2)| 00:23:08 |
| 1 |  VIEW             |                |  5649K|  3124M|    | 115K  (2)| 00:23:08 |
| 2 | UNION-ALL       |                |    |    |    |          |       |
|*  3 | HASH JOIN       |                | 295K| 19M|  3936K| 10368 (1)| 00:02:05 |
|*  4 |    TABLE ACCESS FULL| justin       | 191K|  1686K|    |  6768 (1)| 00:01:22 |
| 5 |    TABLE ACCESS FULL| justin_it    | 295K| 17M|    |  2378 (2)| 00:00:29 |
|*  6 | HASH JOIN       |                |  5354K| 352M| 53M| 105K  (2)| 00:21:04 |
|*  7 |    TABLE ACCESS FULL| justin_hist    |  2671K| 22M|    | 51590 (1)| 00:10:20 |
| 8 |    TABLE ACCESS FULL| justin_it_hist  |  5354K| 306M|    | 32607 (2)| 00:06:32 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("justin"."ID"="justin_it"."ORDER_ID")
4 - filter("justin"."justin_id"=0)
6 - access("justin_hist"."ORDER_ID"="justin_it_hist"."ORDER_ID")
7 - filter("justin_hist"."justin_id"=0)

23 rows selected.

3. 除了null和0外的任意一個值
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                            |  1408 | 797K| 492 (0)| 00:00:06 |
| 1 |  VIEW                         |                            |  1408 | 797K| 492 (0)| 00:00:06 |
| 2 | UNION-ALL                   |                            |    |    |          |       |
| 3 | TABLE ACCESS BY INDEX ROWID  | justin_it                |    6 | 366 |    3 (0)| 00:00:01 |
| 4 |    NESTED LOOPS             |                            |    8 | 560 |    7 (0)| 00:00:01 |
| 5 |    TABLE ACCESS BY INDEX ROWID| justin                   |    1 |    9 |    4 (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN       | idx_so_justin_id          |    1 |    |    3 (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN          | idx_justin_order          |    6 |    |    2 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY INDEX ROWID  | justin_it_hist             |    7 | 420 |    3 (0)| 00:00:01 |
| 9 |    NESTED LOOPS             |                            |  1400 | 96600 | 485 (0)| 00:00:06 |
|  10 |    TABLE ACCESS BY INDEX ROWID| justin_hist                | 204 |  1836 | 20 (0)| 00:00:01 |
|* 11 |    INDEX RANGE SCAN       | idx_hist_justin_id       | 204 |    |    3 (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN          | idx_hist_justin_order_id |    7 |    |    2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

6 - access("justin"."justin_id"=864477)
7 - access("justin"."ID"="justin_it"."ORDER_ID")
  11 - access("justin_hist"."justin_id"=864477)
  12 - access("justin_hist"."ORDER_ID"="justin_it_hist"."ORDER_ID")

27 rows selected.

解決方法：
去掉該列上的直方圖資訊，使其採用穩定的執行計劃
因為這兩個表比較大，在主庫收集統計資訊會造成一定的壓力，所以考慮在邏輯備庫收集，然後將統計資訊匯入主庫.
邏輯備庫上上重新收集這兩個表，去掉所有列的直方圖,在11g，可以通過呼叫dbms_stats.delete_column_stats直接刪除，同時收集完成後lock其統計資訊，以避免再出現直方圖統計

收集統計資訊
begin
  dbms_stats.gather_table_stats(ownname => 'JUSTIN',tabname => 'justin_hist',method_opt => 'for all columns size 1',estimate_percent => 10);
end;

begin
  dbms_stats.gather_table_stats(ownname => 'JUSTIN',tabname => 'justin',method_opt => 'for all columns size 1',estimate_percent => 10);
end;

將統計資訊匯出至資料庫表
begin
  dbms_stats.create_stat_table(ownname => 'JUSTIN',stattab => 'temp_justin_stats');
end;

begin
  dbms_stats.export_table_stats(ownname => 'JUSTIN',tabname => 'JUSTIN',stattab => 'temp_justin_stats');
end;

begin
  dbms_stats.create_stat_table(ownname => 'JUSTIN',stattab => 'temp_justin_hist_stats');
end;

begin
  dbms_stats.export_table_stats(ownname => 'JUSTIN',tabname => 'JUSTIN_HIST',stattab => 'temp_justin_hist_stats');
end;

採用exp匯出並傳輸至主庫

[oracle@std ~]$ exp  JUSTIN/xxxxx tables=temp_justin_stats file=temp_justin_stats.dmp

[oracle@std ~]$ scp temp_justin_stats.dmp 192.168.132.21:/home/oracle

[oracle@racdg1 ~]$ imp JUSTIN/xxxxx fromuser=JUSTIN touser=JUSTIN file=temp_justin_stats.dmp

在主庫上將統計資訊匯入
begin
  dbms_stats.import_table_stats(ownname => 'JUSTIN',tabname => 'justin',stattab => 'temp_justin_stats');
end;

在主庫上檢視其直方圖資訊，bucket為1,已經刪掉了直方圖資訊
SQL> select table_name,num_distinct,density,num_nulls,num_buckets,last_analyzed from user_tab_col_statistics t where t.table_name in ('justin','justin_hist') and t.column_name='justin_id';

TABLE_NAME                   NUM_DISTINCT DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANALYZED
------------------------------ ------------ ---------- ---------- ----------- -------------
justin                               1321 0.00075700    27650       1 2011-1-19 下午
justin_hist                         5441 0.00018378    100300          1 2011-1-19 下午

然後鎖住統計資訊
SQL> begin
  2 dbms_stats.lock_table_stats(ownname => 'JUSTIN',tabname => 'justin');
  3  end;
  4  /

PL/SQL procedure successfully completed

SQL> begin
  2 dbms_stats.lock_table_stats(ownname => 'JUSTIN',tabname => 'justin_hist');
  3  end;
  4  /

PL/SQL procedure successfully completed

檢驗
SQL> select table_name,u.STATTYPE_LOCKED from user_tab_statistics u where u.STATTYPE_LOCKED is not null;

TABLE_NAME                   STATTYPE_LOCKED
------------------------------ ---------------
justin                         ALL
justin_hist                   ALL

此時檢視該sql執行計劃，
| Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |                            |  4445 |  2517K|  1596 (1)| 00:00:20 |
| 1 |  VIEW                         |                            |  4445 |  2517K|  1596 (1)| 00:00:20 |
| 2 | UNION-ALL                   |                            |    |    |          |       |
| 3 | TABLE ACCESS BY INDEX ROWID  | JUSTIN_IT                |    6 | 366 |    3 (0)| 00:00:01 |
| 4 |    NESTED LOOPS             |                            | 945 | 66150 | 387 (0)| 00:00:05 |
| 5 |    TABLE ACCESS BY INDEX ROWID| JUSTIN                   | 162 |  1458 | 29 (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN       | IDX_JUSTIN_HIST_ID       | 162 |    |    3 (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN          | IDX_JUSTIN_ORDER          |    6 |    |    2 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY INDEX ROWID  | JUSTIN_IT_HIST             |    7 | 420 |    3 (0)| 00:00:01 |
| 9 |    NESTED LOOPS             |                            |  3500 | 235K|  1209 (1)| 00:00:15 |
|  10 |    TABLE ACCESS BY INDEX ROWID| JUSTIN_HIST                | 511 |  4599 | 47 (0)| 00:00:01 |
|* 11 |    INDEX RANGE SCAN       | IDX_JUSTIN_HIST_ID       | 511 |    |    5 (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN          | IDX_JUSTIN_ORDER          |    7 |    |    2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/15480802/viewspace-688363/，如需轉載，請註明出處，否則將追究法律責任。

優化由直方圖資訊導致的sql效能問題

相關文章