有索引卻走全表掃描的實驗分析

liuhaimiao發表於2014-08-12

最近在查詢某範圍的資料時,返回結果僅佔整個表資料的0.05%,在相應欄位上有對應索引,然而這部分資料並沒有走索引,
將根據如下實驗證明此種情況的原因
一 構建環境
1建立測試表 有兩個欄位,主鍵ID和手機號,手機號上有B樹索引
drop table TEST_HIS
-- Create table
create table TEST_HIS
(
  id    number,
  phone varchar2(32)
)
;
-- Add comments to the table
comment on table TEST_HIS
  is '測試直方圖';
-- Add comments to the columns
comment on column TEST_HIS.phone
  is '手機號,用於測試上面的直方圖是否影響走索引';
2 建立主鍵和索引
-- Create/Recreate indexes
create index IDX_PHONE on TEST_HIS (phone);
-- Create/Recreate primary, unique and foreign key constraints
alter table TEST_HIS
  add constraint PK_ID primary key (ID);
3 構造測試資料,根據如下指令碼,可以手工修改迴圈次數,向表中插入13開頭的電話號碼11000個,15開頭的電話號碼10個,
以及18開頭的電話號碼10個,null的空號碼11010個 
declare
 v_phone varchar2(32);
begin
 v_phone :='136818978';
 for i in 1..10000
 loop
  insert into test_his
    (id, phone)
  values
    (SEQ_TEST_HIST.NEXTVAL, v_phone||i);
    --(SEQ_TEST_HIST.NEXTVAL,null);
 end loop;
end;
4 收集統計資訊
SELECT t.NUM_ROWS,t.AVG_ROW_LEN,t.BLOCKS,t.TABLE_NAME,T.LAST_ANALYZED,t.EMPTY_BLOCKS FROM USER_TABLES T WHERE TABLE_NAME ='TEST_HIS';

begin
  dbms_stats.gather_table_stats(ownname => 'TEST',tabname => 'TEST_HIS',cascade => TRUE);
end;

SELECT t.NUM_ROWS,t.AVG_ROW_LEN,t.BLOCKS,t.TABLE_NAME,T.LAST_ANALYZED,t.EMPTY_BLOCKS FROM USER_TABLES T WHERE TABLE_NAME ='TEST_HIS';

二 查詢13開頭和15開頭的電話號碼的sql執行計劃

select  * from test.test_his H where h.phone like '13%';
執行計劃
----------------------------------------------------------
Plan hash value: 2828532586

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |  1754 | 17540 |    14   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_HIS |  1754 | 17540 |    14   (8)| 00:00:01 |
------------------------------------------------------------------------------

select  * from test.test_his H where h.phone like '15%';
執行計劃
----------------------------------------------------------
Plan hash value: 2828532586

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |  2227 | 22270 |    14   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_HIS |  2227 | 22270 |    14   (8)| 00:00:01 |
------------------------------------------------------------------------------
根據執行計劃我們看到查詢15開頭的資料返回的rows是2227,這和實際資料相差較大,於是開啟10053檢視相關資訊
三 檢視10053
alter system flush buffer_cache;
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';
explain plan for select * from test.test_his where phone like '15%';
alter session set events  '10053 trace name context off';
觀察10053中針對表的訪問路徑資訊
SINGLE TABLE ACCESS PATH
  Column (#2): PHONE(VARCHAR2)
    AvgLen: 6.00 NDV: 10867 Nulls: 11107 Density: 9.2022e-005
  Table: TEST_HIS  Alias: TEST_HIS    
    Card: Original: 22235  Rounded: 2227  Computed: 2226.62  Non Adjusted: 2226.62
  Access Path: TableScan
    Cost:  14.09  Resp: 14.09  Degree: 0
      Cost_io: 13.00  Cost_cpu: 6359522
      Resp_io: 13.00  Resp_cpu: 6359522
kkofmx: index filter:"TEST_HIS"."PHONE" LIKE '15%'
  Access Path: index (RangeScan)
    Index: IDX_PHONE
    resc_io: 604.00  resc_cpu: 5125340
    ix_sel: 0.20203  ix_sel_with_filters: 0.20203
    Cost: 604.88  Resp: 604.88  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 14.09  Degree: 1  Resp: 14.09  Card: 2226.62  Bytes: 0
TEST_HIS中 NDV: 高達10867個,然而sql中要查詢的是以15開頭的電話號碼,15開頭,13開頭,18開頭的資料只有三種
因此統計資訊和實際sql要查詢的語義也不相符,因此雖然phone上有索引,但是sql的查詢條件是查詢某號碼開頭的資料時,已有索引
就不能準確描述15開頭的資料分佈情況,因此即便15開頭的資料只有10個,最佳化器也會按照普通索引上的統計資訊NDV的值計算
返回rows,而這個返回rows遠超過實際資料的分佈情況,因此建立直方圖收集正確的分佈資訊
四 建立直方圖
analyze table test.test_his compute statistics for columns phone;

alter system flush buffer_cache;
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';
explain plan for select * from test.test_his where phone like '15%';
alter session set events  '10053 trace name context off';

SINGLE TABLE ACCESS PATH
  Column (#2): PHONE(VARCHAR2)
    AvgLen: 5.00 NDV: 10021 Nulls: 11010 Density: 1.0720e-004
    Histogram: HtBal  #Bkts: 75  UncompBkts: 75  EndPtVals: 76
  Table: TEST_HIS  Alias: TEST_HIS    
    Card: Original: 22235  Rounded: 30  Computed: 29.93  Non Adjusted: 29.93
  Access Path: TableScan
    Cost:  14.09  Resp: 14.09  Degree: 0
      Cost_io: 13.00  Cost_cpu: 6359522
      Resp_io: 13.00  Resp_cpu: 6359522
kkofmx: index filter:"TEST_HIS"."PHONE" LIKE '15%'
  Access Path: index (RangeScan)
    Index: IDX_PHONE
    resc_io: 10.00  resc_cpu: 82314
    ix_sel: 0.002716  ix_sel_with_filters: 0.002716
    Cost: 10.01  Resp: 10.01  Degree: 1
******** Begin index join costing ********
  ****** trying bitmap/domain indexes ******
  Access Path: index (IndexOnly)
    Index: IDX_PHONE
    resc_io: 2.00  resc_cpu: 20243
    ix_sel: 0.002716  ix_sel_with_filters: 0.002716
    Cost: 2.00  Resp: 2.00  Degree: 0
    SORT resource      Sort statistics
      Sort width:         477 Area size:      417792 Max Area size:    83676160
      Degree:               1
      Blocks to Sort:       1 Row size:           21 Total Rows:             30
      Initial runs:         1 Merge passes:        0 IO Cost / pass:          0
      Total IO sort cost: 0      Total CPU sort cost: 5826332
      Total Temp space used: 0
  Access Path: index (FullScan)
    Index: PK_ID
    resc_io: 43.00  resc_cpu: 4712422
    ix_sel: 1  ix_sel_with_filters: 1
    Cost: 43.81  Resp: 43.81  Degree: 0
******** Cost index join ********
Index join: Considering index join to index IDX_PHONE
Index join: Joining index PK_ID
Ix HA Join
  Outer table:
    resc: 2.00  card 29.93  bytes: 15  deg: 1  resp: 2.00
  Inner table:
    resc: 54.76  card: 22235.00  bytes: 15  deg: 1  resp: 54.76
    using dmeth: 2  #groups: 1
    Cost per ptn: 0.88  #ptns: 1
    hash_area: 0 (max=0)   Hash join: Resc: 57.65  Resp: 57.65  [multiMatchCost=0.00]
******** Index join cost ********
Cost: 57.65 
******** End index join costing ********
  Best:: AccessPath: IndexRange  Index: IDX_PHONE
         Cost: 10.01  Degree: 1  Resp: 10.01  Card: 29.93  Bytes: 0

檢視對應執行計劃
---------

| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| T
ime     |

--------------------------------------------------------------------------------
---------

|   0 | SELECT STATEMENT            |           |    30 |   300 |    10   (0)| 0
0:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST_HIS  |    30 |   300 |    10   (0)| 0
0:00:01 |

|*  2 |   INDEX RANGE SCAN          | IDX_PHONE |    30 |       |     2   (0)| 0
0:00:01 |

--------------------------------------------------------------------------------

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/223653/viewspace-1249861/,如需轉載,請註明出處,否則將追究法律責任。

相關文章