exists和連線方式

westzq1984發表於2009-12-17

這是一個開發提出的問題,他問下面2種那個快?
select *
  from tbas.d_term_type a
 where exists (SELECT 1
          from channel_chengdu.o_user partition(p200910) b
         where a.term_type_id = b.term_type_id)
   and term_type_id in (1, 75)

SELECT *
  FROM TBAS.D_TERM_TYPE A
 WHERE EXISTS (SELECT 1
   FROM CHANNEL_CHENGDU.O_USER PARTITION(P200910) B
  WHERE A.TERM_TYPE_ID = B.TERM_TYPE_ID
    AND ROWNUM < 2)
   AND TERM_TYPE_ID IN (1, 75)  

偶給出的答案是一樣塊,因為EXISTS是找到資料立刻返回

但是測試的結果是,SQL2超級快,SQL1等了1分鐘也沒返回

看一看SQL1的執行計劃
--------------------------------------------------------------------------------------------------------
| Id  | Operation                     |  Name                  | Rows  | Bytes | Cost  | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                        |     2 |   510 | 20003 |       |       |
|*  1 |  HASH JOIN SEMI               |                        |     2 |   510 | 20003 |       |       |
|   2 |   INLIST ITERATOR             |                        |       |       |       |       |       |
|   3 |    TABLE ACCESS BY INDEX ROWID| D_TERM_TYPE            |     2 |   484 |     1 |       |       |
|*  4 |     INDEX RANGE SCAN          | INX_TERM_TYPE_TYPE_ID  |     1 |       |     2 |       |       |
|*  5 |   TABLE ACCESS FULL           | O_USER                 |   423K|  5370K| 20001 |    10 |    10 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("A"."TERM_TYPE_ID"="B"."TERM_TYPE_ID")
   4 - access("A"."TERM_TYPE_ID"=1 OR "A"."TERM_TYPE_ID"=75)
   5 - filter("B"."TERM_TYPE_ID"=1 OR "B"."TERM_TYPE_ID"=75)

發現其使用了HASH JOIN SEMI。這個資料量,明顯應該是用NESTED LOOP才對
SELECT *
  FROM TBAS.D_TERM_TYPE A
 WHERE EXISTS (SELECT /*+nl_sj */1
   FROM CHANNEL_CHENGDU.O_USER PARTITION(P200910) B
  WHERE A.TERM_TYPE_ID = B.TERM_TYPE_ID)
   AND TERM_TYPE_ID IN (1, 75)
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                     |  Name                  | Rows  | Bytes | Cost  | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                        |     2 |   510 | 40003 |       |       |
|   1 |  NESTED LOOPS SEMI            |                        |     2 |   510 | 40003 |       |       |
|   2 |   INLIST ITERATOR             |                        |       |       |       |       |       |
|   3 |    TABLE ACCESS BY INDEX ROWID| D_TERM_TYPE            |     2 |   484 |     1 |       |       |
|*  4 |     INDEX RANGE SCAN          | INX_TERM_TYPE_TYPE_ID  |     1 |       |     2 |       |       |
|*  5 |   TABLE ACCESS FULL           | O_USER                 |   423K|  5370K| 20001 |    10 |    10 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("A"."TERM_TYPE_ID"=1 OR "A"."TERM_TYPE_ID"=75)
   5 - filter("A"."TERM_TYPE_ID"="B"."TERM_TYPE_ID" AND ("B"."TERM_TYPE_ID"=1 OR "B"."TERM_TYPE_ID"=75))
 
這樣的話,SQL1就能很快執行,看來EXISTS是否立刻返回結果,和使用的連線方式是否是NESTED LOOP有關。HASH JOIN會忽略掉EXISTS的行為
而且,在COST計算時,NESTED 的成本也比 HASH高了20000,接近於一個全表掃描的成本
還有,以前的執行計劃不選擇NESTED LOOP的更本原因是O_USER表在連線列上無索引,CBO無法估算取得資料的時間

SQL> SELECT *
  2    FROM TBAS.D_TERM_TYPE A
  3   WHERE EXISTS (SELECT /*+nl_sj */1
  4       FROM CHANNEL_CHENGDU.O_USER PARTITION(P200910) B
         WHERE A.TERM_TYPE_ID = B.TERM_TYPE_ID)
  5    6     AND TERM_TYPE_ID IN (1, 75);


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT ptimizer=CHOOSE (Cost=40003 Card=2 Bytes=5
          10)

   1    0   NESTED LOOPS (SEMI) (Cost=40003 Card=2 Bytes=510)
   2    1     INLIST ITERATOR
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'D_TERM_TYPE' (Cost=1 Card=2 Bytes=484)
   4    3         INDEX (RANGE SCAN) OF 'INX_TERM_TYPE_TYPE_ID' (NON-UNIQUE) (Cost=2 Card=1)
   5    1     TABLE ACCESS (FULL) OF 'O_USER' (Cost=20001 Card=423008 Bytes=5499104)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
       1908  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

SQL> SELECT *
  2    FROM TBAS.D_TERM_TYPE A
  3   WHERE EXISTS (SELECT 1
  4       FROM CHANNEL_CHENGDU.O_USER PARTITION(P200910) B
  5      WHERE A.TERM_TYPE_ID = B.TERM_TYPE_ID
           AND ROWNUM < 2)
  6    7     AND TERM_TYPE_ID IN (1, 75)  ;


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT ptimizer=CHOOSE (Cost=1 Card=1 Bytes=242)
   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'D_TERM_TYPE' (Cost=1 Card=1 Bytes=242)
   3    2       INDEX (RANGE SCAN) OF 'INX_TERM_TYPE_TYPE_ID' (NON-UNIQUE) (Cost=2 Card=1)
   4    3         COUNT (STOPKEY)
   5    4           TABLE ACCESS (FULL) OF 'O_USER' (Cost=20001 Card=423008 Bytes=5499104)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
       1908  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

邏輯讀也是一樣的。
 
-------------------------------------------------------------------------------------------------------
| Id  | Operation                    |  Name                  | Rows  | Bytes | Cost  | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                        |     1 |   242 |     1 |       |       |
|   1 |  INLIST ITERATOR             |                        |       |       |       |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| D_TERM_TYPE            |     1 |   242 |     1 |       |       |
|*  3 |    INDEX RANGE SCAN          | INX_TERM_TYPE_TYPE_ID  |     1 |       |     2 |       |       |
|*  4 |     COUNT STOPKEY            |                        |       |       |       |       |       |
|*  5 |      TABLE ACCESS FULL       | O_USER                 |   423K|  5370K| 20001 |    10 |    10 |
-------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("SYS_ALIAS_1"."TERM_TYPE_ID"=1 OR "SYS_ALIAS_1"."TERM_TYPE_ID"=75)
       filter( EXISTS (SELECT /*+ */ 0 FROM "CHANNEL_CHENGDU"."O_USER" "B" WHERE ROWNUM<2 AND
              "B"."TERM_TYPE_ID"=:B1))
   4 - filter(ROWNUM<2)
   5 - filter("B"."TERM_TYPE_ID"=:B1)

這個執行計劃是首先進行了謂詞傳遞,用B.TERM_TYPE_ID IN (1, 75)來訪問O_USER表,結合COUNT STOPKEY,有結果立刻返回
INDEX RANGE SCAN進行對D_TERM_TYPE的訪問,但是注意下面的filter,其將完成一個過濾動作。

還有這個cost的計算也很神奇,不過偶重來不看那一列,也不熟悉這個,先不管了

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/8242091/viewspace-622689/,如需轉載,請註明出處,否則將追究法律責任。

相關文章