關於索引聚簇因子(CLUSTERING_FACTOR)引起的sql優化過程

YallonKing發表於2012-02-07

以前的一個關於索引聚簇因子引起的sql優化過程

 
索引聚簇因子/聚集因子(CLUSTERING_FACTOR):按照索引列值進行了排序的索引行序和對應表中的資料行序的相似程度。
主要影響物件是堆表。
 
問題sql
SQL> select * from stock.stockinfo  where  tradingdate >=to_date('20110801','yyyymmdd');
開發反映此語句執行異常緩慢(和之前類似的語句相差時間過長)

其執行計劃如下:
SQL> select * from stock.stockinfo  where  tradingdate >=to_date('20110801','yyy
ymmdd');
41443 rows selected.
Elapsed: 00:01:13.69
Execution Plan
----------------------------------------------------------
Plan hash value: 4179235280
--------------------------------------------------------------------------------
-------------------------
| Id  | Operation                   | Name                      | Rows  | Bytes
| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
-------------------------
|   0 | SELECT STATEMENT            |                           | 23474 |  5020K
| 24879   (1)| 00:04:59 |
|   1 |  TABLE ACCESS BY INDEX ROWID| STOCKINFO                 | 23474 |  5020K
| 24879   (1)| 00:04:59 |
|*  2 |   INDEX RANGE SCAN          | STOCKSTOCKINFOTRADINGDATE | 24982 |
|    69   (0)| 00:00:01 |
--------------------------------------------------------------------------------
-------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      43563  consistent gets
      25012  physical reads
          0  redo size
    9774561  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed

可見其走了B樹索引
檢視錶分析時間
SQL> select to_char(LAST_ANALYZED,'yyyy/mm/dd hh24:mi:ss') from user_tables wher
e table_name='STOCKINFO';
TO_CHAR(LAST_ANALYZ
-------------------
2011/12/06 22:03:29

表結構如下:
SQL> desc stock.stockinfo;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 TRADINGDATE                                        DATE
 DAYID                                              NUMBER
 WEEKID1                                            NUMBER
 WEEKDAYID1                                         NUMBER
 WEEKID2                                            NUMBER
 WEEKDAYID2                                         NUMBER
 WEEKID3                                            NUMBER
 WEEKDAYID3                                         NUMBER
 WEEKID4                                            NUMBER
 WEEKDAYID4                                         NUMBER
 WEEKID5                                            NUMBER
 WEEKDAYID5                                         NUMBER
 INNERCODE                                          NUMBER(10)
 SECUCODE                                           VARCHAR2(10)
 OPENPRICE                                          NUMBER(10,4)
 HIGHPRICE                                          NUMBER(10,4)
 LOWPRICE                                           NUMBER(10,4)
 CLOSEPRICE                                         NUMBER(10,4)
 CPRICE_A_DAY                                       NUMBER(10,4)
 OPRICE_A_DAY                                       NUMBER
 HPRICE_A_DAY                                       NUMBER
 LPRICE_A_DAY                                       NUMBER
 TURNOVERVOLUME                                     NUMBER(20)
 TURNOVERVALUE                                      NUMBER(19,4)
 NONRESTRICTEDSHARES                                NUMBER(16)
 AFLOATS                                            NUMBER
 ASHARES                                            NUMBER
 N_A_F_VALUE                                        NUMBER
 A_F_VALUE                                          NUMBER
 A_T_VALUE                                          NUMBER
 DAILYRATIO                                         NUMBER(20,8)
 SW3                                                VARCHAR2(10)
 NORMAL                                             NUMBER(20)
 ST                                                 NUMBER(20)
 STST                                               NUMBER(20)
 PT                                                 NUMBER(20)
 DELISTING                                          NUMBER(20)
 NTRADINGDAY                                        NUMBER(20)
 IPO                                                NUMBER(20)
 LOCALUPTIME                                        DATE

表相關資料量如下:
SQL> select count(*) from stock.stockinfo;
  COUNT(*)
----------
   4070420
SQL> select count(*) from stock.stockinfo  where  tradingdate >=to_date('2011080
1','yyyymmdd');
  COUNT(*)
----------
     41443
 
嘗試表全表掃描:
SQL> select /*+ no_index(stockinfo stockstockinfotradingdate) */ * from stock.st
ockinfo  where  tradingdate >=to_date('20110801','yyyymmdd');
41443 rows selected.
Elapsed: 00:00:11.94
Execution Plan
----------------------------------------------------------
Plan hash value: 580256601
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 23474 |  5020K| 34800   (1)| 00:06:58 |
|*  1 |  TABLE ACCESS FULL| STOCKINFO | 23474 |  5020K| 34800   (1)| 00:06:58 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     130248  consistent gets
     127560  physical reads
          0  redo size
    8721445  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
 
結論:全表掃描效率高於B樹索引
問題:根據相關資料量來說,走索引應該比全表掃描效率會高些
 
設定10053事件抓取詳細trace
部分內容如下:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: STOCKINFO  Alias: STOCKINFO
    #Rows: 4070420  #Blks:  128167  AvgRowLen:  219.00
Index Stats::
  Index: STOCKSTOCKINFOTRADINGDATE  Col#: 1
    LVLS: 2  #LB: 11460  #DK: 2815  LB/K: 4.00  DB/K: 1527.00  CLUF: 4300220.00
Access path analysis for STOCKINFO
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STOCKINFO[STOCKINFO]
  Table: STOCKINFO  Alias: STOCKINFO
    Card: Original: 4070420.000000  Rounded: 23474  Computed: 23474.13  Non Adjusted: 23474.13
  Access Path: TableScan
    Cost:  34800.38  Resp: 34800.38  Degree: 0
      Cost_io: 34714.00  Cost_cpu: 1745127320
      Resp_io: 34714.00  Resp_cpu: 1745127320
  Access Path: index (RangeScan)
    Index: STOCKSTOCKINFOTRADINGDATE
    resc_io: 24869.00  resc_cpu: 205332751
    ix_sel: 0.005767  ix_sel_with_filters: 0.005767
    Cost: 24879.16  Resp: 24879.16  Degree: 1
  Best:: AccessPath: IndexRange
  Index: STOCKSTOCKINFOTRADINGDATE
         Cost: 24879.16  Degree: 1  Resp: 24879.16  Card: 23474.13  Bytes: 0
***************************************
可見  #Rows: 4070420 CLUF: 4300220.00  其中索引聚合因子和記錄行數基本一樣
具體如下
SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from us
er_indexes where index_name ='STOCKSTOCKINFOTRADINGDATE';
TABLE_NAME INDEX_NAME                       NUM_ROWS LEAF_BLOCKS
---------- ------------------------------ ---------- -----------
CLUSTERING_FACTOR
-----------------
STOCKINFO  STOCKSTOCKINFOTRADINGDATE         4331799       11460
          4300220
 
最好情況應該是索引塊數和索引聚簇因子基本相同。

嘗試按索引列重建該表及相關索引,即對比表進行重構。
SQL> select * from stock.stockinfo_bak  where  tradingdate >=to_date('20110801',
'yyyymmdd');
41443 rows selected.
Elapsed: 00:00:15.10
Execution Plan
----------------------------------------------------------
Plan hash value: 2415850697
--------------------------------------------------------------------------------
---
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time
  |
--------------------------------------------------------------------------------
---
|   0 | SELECT STATEMENT  |               |  1401 |   684K| 34588   (1)| 00:06:5
6 |
|*  1 |  TABLE ACCESS FULL| STOCKINFO_BAK |  1401 |   684K| 34588   (1)| 00:06:5
6 |
--------------------------------------------------------------------------------
---

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     256930  consistent gets
     127173  physical reads
          0  redo size
    7165405  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
在tradingdate列建立點陣圖索引
SQL> select * from stock.stockinfo_bak  where  tradingdate >=to_date('20110801',
'yyyymmdd');
41443 rows selected.
Elapsed: 00:00:04.15
Execution Plan
----------------------------------------------------------
Plan hash value: 3810964835
--------------------------------------------------------------------------------
------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost
(%CPU)| Time     |
--------------------------------------------------------------------------------
------------------
|   0 | SELECT STATEMENT             |                   |  2500 |  1220K| 29007
   (1)| 00:05:49 |
|   1 |  TABLE ACCESS BY INDEX ROWID | STOCKINFO_BAK     |  2500 |  1220K| 29007
   (1)| 00:05:49 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                   |       |       |
      |          |
|*  3 |    BITMAP INDEX RANGE SCAN   | IDX_BIT_STOCKINFO |       |       |
      |          |
--------------------------------------------------------------------------------
------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))
       filter("TRADINGDATE">=TO_DATE(' 2011-08-01 00:00:00', 'syyyy-mm-dd hh24:m
i:ss'))

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5340  consistent gets
         77  physical reads
      94840  redo size
    9774561  bytes sent via SQL*Net to client
      30797  bytes received via SQL*Net from client
       2764  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      41443  rows processed
 
再次檢視索引聚集因子值
SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from us
er_indexes where table_name ='STOCKINFO_BAK';
TABLE_NAME INDEX_NAME                       NUM_ROWS LEAF_BLOCKS
---------- ------------------------------ ---------- -----------
CLUSTERING_FACTOR
-----------------
STOCKINFO_ IDX_BIT_STOCKINFO                    2815         133
BAK
             2815
sql得以優化,問題得以解決!
心得:10g以後的cbo即使按正確的路徑進行,也不一定最優,索引聚簇因子對sql的執行效率影響比較大。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26143577/viewspace-715765/,如需轉載,請註明出處,否則將追究法律責任。

相關文章