建立聯合函式索引解決top sql效能問題

myownstars發表於2011-05-12

今天捕獲到一條top sql,其大致內容如下
select count(*) from archive partition (1105) where done = 2 and name like '%@qq.com';
其中archive為分割槽表,按月分割槽,分割槽鍵為create_time列;其總資料為6千萬,分割槽1105資料有600多萬,且每日遞增;分割槽1104資料有1千萬,預計到月底1105資料也要有千萬
再來看兩個查詢列,done的選擇性極差,其distinct值只有3個;而name的distinct值有20多萬,其實就是使用者的郵箱地址;兩個列上目前都沒有索引;
目前該sql的執行計劃如下,邏輯讀有60多萬
---------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |               |     1 |    23 |   164K  (1)| 00:32:59 |       |       |
|   1 |  SORT AGGREGATE         |               |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|               |   243K|  5460K|   164K  (1)| 00:32:59 |     6 |     6 |
|*  3 |    TABLE ACCESS FULL    | archive       |   243K|  5460K|   164K  (1)| 00:32:59 |     6 |     6 |
---------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
     605310  consistent gets
     530101  physical reads
          0  redo size
        517  bytes sent via SQL*Net to client
        487  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)

sql結構極為簡單,在無法修改應用邏輯的前提下,為了降低cost,只能考慮建立索引,而done列肯定是不合適的;
name列的選擇性倒是還可以,可是使用的是模糊查詢,而且%位於前面,直接建立索引顯然是不合適的,前面提過name是使用者的郵箱地址,而郵箱地址的格式是有規律的,
為*****@.**;我們可以透過instr獲取@的位置,聯合substr可以得到@.**部分的內容,即substr(name,instr(name,'@'),於是name like '%@qq.com'可以改寫為
substr(name,instr(name,'@')='@qq.com',現在則可以基於substr(name,instr(name,'@')建立函式索引

SQL> Create index ind_func_name on  archive(substr(name,instr(name,'@'))) online;

Index created


建立後的執行計劃
--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |     1 |    23 |   112K  (1)| 00:22:34 |       |       |
|   1 |  SORT AGGREGATE                     |                          |     1 |    23 |            |          |       |       |
|*  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| archive                  | 48626 |  1092K|   112K  (1)| 00:22:34 |     6 |     6 |
|*  3 |    INDEX RANGE SCAN                 | ind_func_name            |   247K|       |     8  (25)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
     192078  consistent gets
          0  physical reads
          0  redo size
        517  bytes sent via SQL*Net to client
        487  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

執行代價有所降低,但是此時索引不是local的,執行計劃中也沒有出現partition字樣,重新建立一個local索引

SQL> Create index ind_func_name on  archive(substr(name,instr(name,'@'))) online local;
再看執行計劃
--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |     1 |    23 | 12490   (1)| 00:02:30 |       |       |
|   1 |  SORT AGGREGATE                     |                          |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE            |                          | 48626 |  1092K| 12490   (1)| 00:02:30 |     6 |     6 |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID| archive                  | 48626 |  1092K| 12490   (1)| 00:02:30 |     6 |     6 |
|*  4 |     INDEX RANGE SCAN                | ind_func_name            | 25856 |       |     3   (0)| 00:00:01 |     6 |     6 |
--------------------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
        106  recursive calls
          0  db block gets
     171296  consistent gets
       1485  physical reads
          0  redo size
        517  bytes sent via SQL*Net to client
        487  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

可以看到consistent gets建立索引前為605310,建立函式索引後下降為192078,函式索引修改成分割槽索引後變為171296;
雖然效能得到了提升,但是執行代價仍然有點大,此時可以考慮組合索引了
SQL> create index ind_func_name_done on archive(substr(name,instr(name,'@')),done) online local;

Index created.
再看一把執行計劃
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                         |     1 |    23 |     3   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE         |                         |     1 |    23 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                         | 48626 |  1092K|     3   (0)| 00:00:01 |     6 |     6 |
|*  3 |    INDEX RANGE SCAN     | ind_func_name_done      | 48626 |  1092K|     3   (0)| 00:00:01 |     6 |     6 |
-------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
         15  recursive calls
          0  db block gets
       1186  consistent gets
       1181  physical reads
          0  redo size
        517  bytes sent via SQL*Net to client
        487  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
這樣一來,sql的consistent gets驟降到1186,而它也從top sql的佇列中消失了

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-695166/,如需轉載,請註明出處,否則將追究法律責任。

相關文章