關於Oracle full outer join 的bug問題分析及處理

regonly1發表於2013-01-15
full (outer) join是用來全連線兩個表的語法。即希望將A表和B表關聯,能夠得到A表中有而B表中沒有的記錄,或者B表中有而A表中沒有的記錄。
如何判斷是否有該記錄,則透過on子句來關聯。
下面是一個例子:
SQL> with
  2  A as(select 1 a, 2 b from dual),
  3  B as(select 2 a, 3 b from dual)
  4  select * from A full join B
  5      on A.a=B.a
  6  /
 
         A          B          A          B
---------- ---------- ---------- ----------
         1          2           
                                  2          3
 
瞭解了以上基本原理後,我們應該知道,理論上講,A表和B表的在from子句中的順序是沒有關係的,也就是不影響結果。但是,實際上,卻出現了這樣的問題,下面是對這種情況的描述:
--------------------------------------------------------------------------------------------
| Id  | Operation                | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                 | 12791 |   349K|    82   (3)| 00:00:01 |
|   1 |  VIEW                    |                 | 12791 |   349K|    82   (3)| 00:00:01 |
|   2 |   UNION-ALL              |                 |       |       |            |          |
|*  3 |    FILTER                |                 |       |       |            |          |
|*  4 |     HASH JOIN RIGHT OUTER|                 | 12790 |  1124K|    41   (3)| 00:00:01 |
|   5 |      TABLE ACCESS FULL   | JXC_RISHARESUM  |  1735 | 78075 |     7   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL   | JXC_ALLTRADEDAY | 12790 |   562K|    33   (0)| 00:00:01 |
|*  7 |    HASH JOIN ANTI        |                 |     1 |    76 |    41   (3)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL    | JXC_RISHARESUM  |     1 |    45 |     7   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL    | JXC_ALLTRADEDAY | 12790 |   387K|    33   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
從以上執行計劃來看,在第四步驟,使用的是hash join rigth outer連線方式。而透過改變兩表的擺放順序,得到如下的執行計劃:
 
-----------------------------------------------------------------------------------------
| Id  | Operation             | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                 |  1876 | 52528 |    82   (3)| 00:00:01 |
|   1 |  VIEW                 |                 |  1876 | 52528 |    82   (3)| 00:00:01 |
|   2 |   UNION-ALL           |                 |       |       |            |          |
|*  3 |    FILTER             |                 |       |       |            |          |
|*  4 |     HASH JOIN OUTER   |                 |  1874 |   164K|    41   (3)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| JXC_RISHARESUM  |  1735 | 78075 |     7   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| JXC_ALLTRADEDAY | 12790 |   562K|    33   (0)| 00:00:01 |
|*  7 |    HASH JOIN ANTI     |                 |     2 |   152 |    41   (3)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL | JXC_ALLTRADEDAY |     2 |    90 |    33   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL | JXC_RISHARESUM  |  1735 | 53785 |     7   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
注意,此時,執行計劃中的第四個步驟,變成了:hash join outer方式。這個才是我們所期望的方式。那究竟是什麼導致了這個變化呢?檢視他們的謂詞連線邏輯:
hash join right outer的:
   3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
   4 - access("T2"."D_TRADEDATE"="T1"."D_TRADEDATE"(+) AND
              "T2"."D_REGDATE"="T1"."D_REGDATE"(+) AND "T2"."C_FUNDCODE"="T1"."C_FUNDCODE"(+) AND
              "T2"."C_FUNDACCO"="T1"."C_FUNDACCO"(+))
   7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
              "T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
              "T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
   8 - filter(NVL("T1"."C_SHARENO",NULL)='26200703200004969020')
 
hash join outer的:
   3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
   4 - access("T2"."D_TRADEDATE"(+)="T1"."D_TRADEDATE" AND
              "T2"."D_REGDATE"(+)="T1"."D_REGDATE" AND "T2"."C_FUNDCODE"(+)="T1"."C_FUNDCODE"
              AND "T2"."C_FUNDACCO"(+)="T1"."C_FUNDACCO")
   7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
              "T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
              "T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
   8 - filter("T2"."C_SHARENO"='26200703200004969020')
還是沒有發現明顯的區別。但是實際卻導致了結果的不同。
還原到原始的表連線順序,然後對兩表進行分析,再檢視執行計劃:
SQL> call dbms_stats.gather_table_stats(user, 'JXC_ALLTRADEDAY');
呼叫完成。
SQL> call dbms_stats.gather_table_stats(user, 'JXC_RISHARESUM');
.....
-----------------------------------------------------------------------------------------
| Id  | Operation             | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                 |  1738 | 48664 |    82   (3)| 00:00:01 |
|   1 |  VIEW                 |                 |  1738 | 48664 |    82   (3)| 00:00:01 |
|   2 |   UNION-ALL           |                 |       |       |            |          |
|*  3 |    FILTER             |                 |       |       |            |          |
|*  4 |     HASH JOIN OUTER   |                 |  1735 |   191K|    41   (3)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| JXC_RISHARESUM  |  1735 | 98895 |     7   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| JXC_ALLTRADEDAY | 12775 |   698K|    33   (0)| 00:00:01 |
|*  7 |    HASH JOIN ANTI     |                 |     3 |   276 |    41   (3)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL | JXC_ALLTRADEDAY |     3 |   168 |    33   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL | JXC_RISHARESUM  |  1735 | 62460 |     7   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(NVL("T1"."C_SHARENO","T2"."C_SHARENO")='26200703200004969020')
   4 - access("T2"."D_TRADEDATE"(+)="T1"."D_TRADEDATE" AND
              "T2"."D_REGDATE"(+)="T1"."D_REGDATE" AND "T2"."C_FUNDCODE"(+)="T1"."C_FUNDCODE"
              AND "T2"."C_FUNDACCO"(+)="T1"."C_FUNDACCO")
   7 - access("T2"."C_FUNDACCO"="T1"."C_FUNDACCO" AND
              "T2"."C_FUNDCODE"="T1"."C_FUNDCODE" AND "T2"."D_REGDATE"="T1"."D_REGDATE" AND
              "T2"."D_TRADEDATE"="T1"."D_TRADEDATE")
   8 - filter("T2"."C_SHARENO"='26200703200004969020')
 
可發現,這時原來連線方式的錯誤執行計劃被修正了,改為hash join outer連線。
 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/12932950/viewspace-752670/,如需轉載,請註明出處,否則將追究法律責任。

相關文章