Oracle多層級查詢相容的效能問題

regonly1發表於2011-05-19
我們在寫plsql的時候,由於外部條件的複雜性,在呼叫介面時,需要考慮到不同層次查詢的需求。以購買基金為例,有賬戶編號、份額類別、基金程式碼,其中一個賬戶下有多個份額類別的基金。

假設有兩個表:
accoinfo:  賬戶資訊表。有客戶編號和基金帳號
tradeinfo: 交易資訊表。有基金帳號、份額類別、基金程式碼、交易日期、交易金額、備註欄位
其中,在賬戶資訊表的客戶編號欄位和交易資訊的基金帳號欄位上建有索引。

現要求可以根據不同的輸入引數,得到不同的統計值,高一級的必填。
比如要統計某個客戶下,前收費下所有的基金程式碼的交易金額。則必須輸入客戶編號、基金帳號和份額類別。

下面例子是以賬戶來統計交易次數,其中提供了客戶編號和賬戶編號:
declare
    v_custno    varchar2(12) := '990000181565';
    v_accono    varchar2(12) := '880001491234';
    v_sharetype varchar2(6)  := '';
    v_fundcode  varchar2(12) := '';
    v_count     pls_integer  := 0;
begin
    select count(1) into v_count
      from tradeinfo i, accoinfo c
     where i.account_no = c.account_no
       and c.customer_no = v_custno
       and i.account_no = nvl(v_accono, i.account_no)
       and i.sharetype = nvl(v_sharetype, i.sharetype);
end;

這種編寫方式比較簡潔,但實際使用中可能會發現執行速度比想象中的要慢很多。觀察執行計劃發現是下面這個樣子的:
SQL> var v_custno varchar2(12);
SQL> var v_accono varchar2(12);
SQL> var v_sharetype varchar2(1);
SQL> var v_fundcode varchar2(6);
SQL> var v_count number;
SQL> exec :v_custno := '990000181565';
SQL> exec :v_accono := '880001491234';


SQL> select count(1)
  2    from tradeinfo i, accoinfo c
  3   where i.account_no = c.account_no
  4     and c.customer_no = :v_custno
  5     and i.account_no = nvl(:v_accono, i.account_no)
  6     and i.sharetype = nvl(:v_sharetype, i.sharetype)
  7  /

執行計劃:
-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |    26 |     9  (12)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |    26 |            |          |
|   2 |   CONCATENATION       |           |       |       |            |          |
|*  3 |    FILTER             |           |       |       |            |          |
|*  4 |     HASH JOIN         |           |     1 |    26 |     5  (20)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| TRADEINFO |     1 |    10 |     2   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| ACCOINFO  |     1 |    16 |     2   (0)| 00:00:01 |
|*  7 |    FILTER             |           |       |       |            |          |
|*  8 |     HASH JOIN         |           |     1 |    26 |     5  (20)| 00:00:01 |
|*  9 |      TABLE ACCESS FULL| TRADEINFO |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      TABLE ACCESS FULL| ACCOINFO  |     1 |    16 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

下面是謂詞邏輯:
---------------------------------------------------
   3 - filter(:v_accono IS NULL)
   4 - access("I"."ACCOUNT_NO"="C"."ACCOUNT_NO")
   5 - filter("I"."SHARETYPE"=NVL(:V_SHARETYPE,"I"."SHARETYPE") AND
              "I"."ACCOUNT_NO" IS NOT NULL)
   6 - filter("C"."CUSTOMER_NO"=:v_custno)
   7 - filter(:v_accono IS NOT NULL)
   8 - access("I"."ACCOUNT_NO"="C"."ACCOUNT_NO")
   9 - filter("I"."ACCOUNT_NO"=:v_accono AND
              "I"."SHARETYPE"=NVL(:V_SHARETYPE,"I"."SHARETYPE"))
  10 - filter("C"."CUSTOMER_NO"=:v_custno)


很奇怪的方式,兩個表關聯,明明可以透過索引來查詢,卻以一個複雜的計劃掃描全表來實現。使得一行的記錄需要大量的一致性讀和物理讀來實現:
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     360297  consistent gets
     342956  physical reads
          1  rows processed

但是,如果嘗試不用繫結變數而是用實際值去執行該SQL語句,卻會是另一種情況:
SQL> select count(1)
  2    from tradeinfo i, accoinfo c
  3   where i.account_no = c.account_no
  4     and c.customer_no = '990000181565'
  5     and i.account_no = nvl('880001491234', i.account_no)
  6     and i.sharetype = nvl('', i.sharetype)
  7  /

執行計劃:
--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |     1 |    26 |    11   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                |           |     1 |    26 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID  | TRADEINFO |     1 |    10 |     3   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |           |     1 |    26 |    11   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| ACCOINFO  |     2 |    32 |     5   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | ICUSTNO   |     2 |       |     3   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN           | IACCONO   |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

謂詞邏輯:
---------------------------------------------------
   2 - filter("I"."SHARETYPE"=NVL('',"I"."SHARETYPE"))
   5 - access("C"."CUSTOMER_NO"='990000181565')
   6 - access("I"."ACCOUNT_NO"="C"."ACCOUNT_NO")
       filter("I"."ACCOUNT_NO"=NVL('880001491234',"I"."ACCOUNT_NO"))

代價統計資訊:
----------------------------------------------------------
         13  recursive calls
          0  db block gets
        443  consistent gets
          4  physical reads
          1  rows processed

可以看到,不使用繫結變數後,"consistent gets"和"physical reads"都從
     360297  consistent gets
     342956  physical reads
下降到:     
        443  consistent gets
          4  physical reads
帶來的效益是很明顯的。執行速度上,也從原來的幾分鐘變成了不到1秒。那是什麼影響了它的執行計劃呢?之前,為了重現這個案例,我也嘗試過很多其他的資料方式,但是最終只有在這個案例上重現了。從這個案例中,可以猜測出這種問題可能是其中兩個級別的可選擇性比較差,重複值很高引起的。那如果對錶進行了分析,又會是什麼情況呢?
對錶做30%的抽樣分析:
call dbms_stats.gather_table_stats(user, 'ACCOINFO', null, 30);
call dbms_stats.gather_table_stats(user, 'TRADEINFO', null, 30);

檢視分析後的執行計劃:
----------------------------------------------------------------------------------------------
| Id  | Operation                        | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |           |     1 |    41 |    25   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                  |           |     1 |    41 |            |          |
|   2 |   CONCATENATION                  |           |       |       |            |          |
|*  3 |    FILTER                        |           |       |       |            |          |
|*  4 |     TABLE ACCESS BY INDEX ROWID  | TRADEINFO |     1 |    15 |     5   (0)| 00:00:01 |
|   5 |      NESTED LOOPS                |           |     3 |   123 |    15   (0)| 00:00:01 |
|   6 |       TABLE ACCESS BY INDEX ROWID| ACCOINFO  |     2 |    52 |     5   (0)| 00:00:01 |
|*  7 |        INDEX RANGE SCAN          | ICUSTNO   |     2 |       |     3   (0)| 00:00:01 |
|*  8 |       INDEX RANGE SCAN           | IACCONO   |     2 |       |     2   (0)| 00:00:01 |
|*  9 |    FILTER                        |           |       |       |            |          |
|* 10 |     TABLE ACCESS BY INDEX ROWID  | ACCOINFO  |     1 |    26 |     4   (0)| 00:00:01 |
|  11 |      NESTED LOOPS                |           |     1 |    41 |    10   (0)| 00:00:01 |
|* 12 |       TABLE ACCESS BY INDEX ROWID| TRADEINFO |     1 |    15 |     6   (0)| 00:00:01 |
|* 13 |        INDEX RANGE SCAN          | IACCONO   |     2 |       |     3   (0)| 00:00:01 |
|* 14 |       INDEX RANGE SCAN           | ICUSTNO   |     2 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
以上計劃,相對來說,還是比較複雜,但是在執行速度上已經快了很多,基本上符合按索引取資料的基本要求。

統計資訊:
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          1  rows processed

在統計資訊上也比前兩個少了很多。但是由於客戶環境的問題,是不允許收集統計資訊的(客戶的奇怪規定)。所以要最佳化語句,只能從語句本身來著手了。按照之前的語句,我們只要排除它因為自連線帶來的執行計劃誤導問題,就可以解決此問題了,因此,可以按如下方式修改:
select count(1)
  from tradeinfo i, accoinfo c
 where i.account_no = c.account_no
   and c.customer_no = :v_custno
   and (i.account_no = :v_accono or :v_accono is null)
   and (i.sharetype = :v_sharetype or :v_sharetype is null)


表結構與資料如下:
drop table tradeinfo;
drop table accoinfo;
create table tradeinfo(
    account_no varchar2(12),
    sharetype  varchar2(1),
    fundcode varchar2(6),
    trade_date date,
    trade_amount number,    
    remark char(1000)
);
create table accoinfo(customer_no varchar2(12), account_no varchar2(12));
--測試資料
--truncate table tradeinfo
insert into tradeinfo(account_no, sharetype, fundcode, trade_date, trade_amount, remark)
select '88' || lpad(mod(rownum, trunc(dbms_random.value(7,9))*100000), 10, '0'),
       chr(round(dbms_random.value(65,66))),
       trunc(dbms_random.value(200000, 200100)),
       date'2000-1-1' + numtodsinterval(trunc(dbms_random.value(100,200)),'day') + numtoyminterval(trunc(dbms_random.value(1,10)),'year'),
       round(dbms_random.value(1000,10000),4), rownum
  from dual
connect by rownum <= 2140000
order by dbms_random.normal

--truncate table accoinfo
insert into accoinfo(customer_no, account_no)
select '99' || lpad(mod(rownum, trunc(dbms_random.value(1,4))*250000), 10, '0'), account_no
  from (select account_no
          from tradeinfo
         where account_no is not null
         group by account_no)
order by dbms_random.normal

--建立索引
create index iaccono on tradeinfo(account_no) nologging tablespace fund_index;
create index icustno on accoinfo(customer_no) nologging tablespace fund_index;

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/12932950/viewspace-695792/,如需轉載,請註明出處,否則將追究法律責任。

相關文章