在生產環境出現一件怪事，一個查詢，按照日期生成查詢，而且，查詢的結果集佔總資料量的比重也很小，但這個查詢確不走索引掃描，見下例：

SQL> select count(*) from ttm_temp_order t;

COUNT(*)
----------
729465

總行數為70多萬條

SQL> select count(*) from ttm_temp_order t where t.order_time>=trunc(sysdate)-2;

COUNT(*)
----------
6076

查詢兩天內的資料只有6000多條，理應走索引掃描

SQL> exec dbms_stats.gather_table_stats(user,'TTM_TEMP_ORDER',null,null,method_opt => 'for all indexed columns size 254',cascade => true,degree => 4);

PL/SQL procedure successfully completed.

統計資訊保證是最新的

SQL> set autot trace exp stat
SQL> select * from ttm_temp_order t where t.order_time>=trunc(sysdate)-2;

6076 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 256603396

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7116 | 1660K| 5386 (1)| 00:01:05 |
|* 1 | TABLE ACCESS FULL| TTM_TEMP_ORDER | 7116 | 1660K| 5386 (1)| 00:01:05 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T"."ORDER_TIME">=TRUNC()

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
24680 consistent gets
0 physical reads
0 redo size
1214061 bytes sent via SQL*Net to client
4924 bytes received via SQL*Net from client
407 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6076 rows processed

居然走的是全表掃描。強制索引掃描，看看邏輯讀

SQL> select /*+ index(t) */ * from ttm_temp_order t where t.order_time>=trunc(sysdate)-2;

6076 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 760559715

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7116 | 1660K| 6745 (1)| 00:01:21 |
| 1 | TABLE ACCESS BY INDEX ROWID| TTM_TEMP_ORDER | 7116 | 1660K| 6745 (1)| 00:01:21 |
|* 2 | INDEX RANGE SCAN | TTM_TEMP_ORDER_IDX1 | 7116 | | 20 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T"."ORDER_TIME">=TRUNC()

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6445 consistent gets
0 physical reads
0 redo size
1543059 bytes sent via SQL*Net to client
4924 bytes received via SQL*Net from client
407 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6076 rows processed

比全表掃描少了不少。那為什麼Oracle不選擇更好的索引掃描呢？這裡有一個cluster factor的概念

有興趣的朋友可以看一下官方文件：

SQL> select t.index_name,t.num_rows,t.clustering_factor from user_indexes t where t.index_name='TTM_TEMP_ORDER_IDX1';

INDEX_NAME NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- -----------------
TTM_TEMP_ORDER_IDX1 729467 689050

cluster factor確實不小，這就是Oracle不選擇索引掃描的原因。那為什麼這裡的cluster factor這麼高呢？原因在於這個欄位儲存的是詳細的時間資訊，精確至秒，因此，按照cluster factor的計算方法，cluster factor確實比較高。

那麼有什麼辦法能夠讓Oracle走索引掃描呢？根據這個案例的特性，我們可以建立一個函式索引，將時間部分去除，只保留日期部分，這樣，應該可以降低cluster factor。建立索引過程略

SQL> select * from ttm_temp_order t where to_char(t.order_time,'YYYY-MM-DD')>=to_char(sysdate-2,'YYYY-MM-DD');

6076 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2246974986

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6073 | 1417K| 388 (0)| 00:00:05 |
| 1 | TABLE ACCESS BY INDEX ROWID| TTM_TEMP_ORDER | 6073 | 1417K| 388 (0)| 00:00:05 |
|* 2 | INDEX RANGE SCAN | TTM_TEMP_ORDER_IDX3 | 6073 | | 20 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(TO_CHAR(INTERNAL_FUNCTION("ORDER_TIME"),'YYYY-MM-DD')>=TO_CHAR(
YYY-MM-DD'))

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1335 consistent gets
0 physical reads
0 redo size
1212195 bytes sent via SQL*Net to client
4924 bytes received via SQL*Net from client
407 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6076 rows processed

Oracle正確的選擇了索引掃描，並且，因為cluster factor的降低，連帶著consistent gets也有明顯的降低。

[@more@]

利用函式索引，最佳化因cluster factor過高導致不走索引一例

相關文章