關聯查詢時使用樹狀查詢要小心

yangtingkun發表於2007-10-02

很多人為了方便經常將幾個表的關聯查詢和樹狀查詢合在一起,這樣做很可能產生一些並不需要的記錄。


如果在查詢的時候兩張或兩張以上的表進行關聯查詢,同時需要對其中一張執行CONNECT BY操作,那麼建議要不然將CONNECT BY放到查詢最裡層,要不然將CONNECT BY放到查詢最外層,而不要將CONNECT BY與多表連線放在一起。

透過一個例子來說明這個問題吧:

SQL> CREATE TABLE T (ID NUMBER, FID NUMBER, NAME VARCHAR2(30));

表已建立。

SQL> ALTER TABLE T ADD CONSTRAINT PK_T PRIMARY KEY (ID);

表已更改。

SQL> INSERT INTO T VALUES (1, 0, 'TABLE');

已建立 1 行。

SQL> INSERT INTO T VALUES (2, 0, 'INDEX');

已建立 1 行。

SQL> INSERT INTO T SELECT 10000 + ROWNUM, 1, TABLE_NAME FROM DBA_TABLES;

已建立1624行。

SQL> INSERT INTO T SELECT 20000 + ROWNUM, 2, INDEX_NAME FROM DBA_INDEXES;

已建立2401行。

SQL> CREATE INDEX IND_T_FID ON T(FID);

索引已建立。

SQL> CREATE TABLE T1 (ID NUMBER, USERNAME VARCHAR2(30), TID NUMBER);

表已建立。

SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (ID);

表已更改。

SQL> INSERT INTO T1 VALUES (1, 'YANGTK', 1);

已建立 1 行。

SQL> INSERT INTO T1 VALUES (2, 'YANGTK', 10005);

已建立 1 行。

SQL> INSERT INTO T1 VALUES (3, 'YANGTK', 10006);

已建立 1 行。

SQL> INSERT INTO T1 VALUES (4, 'TEST', 1);

已建立 1 行。

SQL> CREATE INDEX IND_T1_NAME ON T1(USERNAME);

索引已建立。

構造一個很簡單的樹型結構的表,並構造一張關聯表,下面執行一個簡單的連線查詢:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ -------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

如果在這個查詢的基礎上加上一個樹型查詢,START WITH FID = 0,看看執行的結果:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ -----------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

現在結果變成了5條,為什麼不加樹型查詢得到3條記錄,而新增了樹型查詢結果變成了5條呢。

觀察一下執行計劃:

SQL> SET AUTOT ON EXP
SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 AND T1.USERNAME = 'YANGTK'
5 START WITH FID = 0
6 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

執行計劃
----------------------------------------------------------
Plan hash value: 1198160839

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T1"."USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

Oracle先進行的連線,然後是CONNECT BY,最後使用filter("T1"."USERNAME"='YANGTK')來進行過濾。

這就是造成資料重複的原因。

實際上這個SQL相當於先執行了一個:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM T, T1
3 WHERE T.ID = T1.TID
4 START WITH FID = 0
5 CONNECT BY PRIOR T.ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
1 0 TABLE TEST
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

已選擇6行。

執行計劃
----------------------------------------------------------
Plan hash value: 2039799266

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 292 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 8 | HASH JOIN | | | | | |
| 9 | CONNECT BY PUMP | | | | | |
| 10 | COUNT | | | | | |
| 11 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 15 | COUNT | | | | | |
| 16 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 17 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 18 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("FID"=0)
2 - filter("FID"=0)
7 - access("T"."ID"="T1"."TID")
8 - access("FID"=NULL)
14 - access("T"."ID"="T1"."TID")
19 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

然後再次基礎上執行了T1.USERNAME = 'YANGTK'

直接的關聯加樹型查詢等價於下面的SQL

SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 START WITH FID = 0
7 CONNECT BY PRIOR T.ID = FID
8 )
9 WHERE USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

執行計劃
----------------------------------------------------------
Plan hash value: 778818883

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 240 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 4 | 240 | 3 (0)| 00:00:01 |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | FILTER | | | | | |
| 4 | COUNT | | | | | |
| 5 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 16 | COUNT | | | | | |
| 17 | NESTED LOOPS | | 4 | 292 | 3 (0)| 00:00:01 |
| 18 | TABLE ACCESS FULL | T1 | 4 | 120 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("USERNAME"='YANGTK')
2 - filter("FID"=0)
3 - filter("FID"=0)
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
15 - access("T"."ID"="T1"."TID")
20 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

而這個結果顯然不是希望得到的。

對於這種即包含關聯又包含樹型查詢的SQL,最好的方法是先做一個,再做另一個,這樣可以確保得到預期的結果:

SQL> SELECT * FROM
2 (
3 SELECT T.ID, FID, NAME, USERNAME
4 FROM T, T1
5 WHERE T.ID = T1.TID
6 AND T1.USERNAME = 'YANGTK'
7 )
8 START WITH FID = 0
9 CONNECT BY PRIOR ID = FID;

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

執行計劃
----------------------------------------------------------
Plan hash value: 3886537187

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 219 | 2 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
|* 2 | FILTER | | | | | |
| 3 | COUNT | | | | | |
| 4 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
|* 9 | HASH JOIN | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | COUNT | | | | | |
| 12 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID| T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID| T | 1 | 43 | 1 (0)| 00:00:01 |
|* 16 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
| 17 | COUNT | | | | | |
| 18 | NESTED LOOPS | | 3 | 219 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 20 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 21 | TABLE ACCESS BY INDEX ROWID | T | 1 | 43 | 1 (0)| 00:00:01 |
|* 22 | INDEX UNIQUE SCAN | PK_T | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("FID"=0)
2 - filter("FID"=0)
6 - access("T1"."USERNAME"='YANGTK')
8 - access("T"."ID"="T1"."TID")
9 - access("FID"=NULL)
14 - access("T1"."USERNAME"='YANGTK')
16 - access("T"."ID"="T1"."TID")
20 - access("T1"."USERNAME"='YANGTK')
22 - access("T"."ID"="T1"."TID")

Note
-----
- dynamic sampling used for this statement

剛才是先做關聯,然後做樹型查詢,也可以反過來,先做樹型查詢:

SQL> SELECT T.ID, FID, NAME, USERNAME
2 FROM
3 (
4 SELECT ID, FID, NAME
5 FROM T
6 START WITH FID = 0
7 CONNECT BY PRIOR ID = FID
8 ) T, T1
9 WHERE T.ID = T1.TID
10 AND T1.USERNAME = 'YANGTK';

ID FID NAME USERNAME
---------- ---------- ------------------------------ ------------------------------
1 0 TABLE YANGTK
10005 1 FILE$ YANGTK
10006 1 UET$ YANGTK

執行計劃
----------------------------------------------------------
Plan hash value: 973607771

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 40 | 2920 | 3 (34)| 00:00:01 |
|* 1 | HASH JOIN | | 40 | 2920 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 3 | 90 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 3 | | 1 (0)| 00:00:01 |
| 4 | VIEW | | 40 | 1720 | 1 (0)| 00:00:01 |
|* 5 | CONNECT BY WITH FILTERING | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID | T | | | | |
|* 7 | INDEX RANGE SCAN | IND_T_FID | 2 | 26 | 1 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | | | | |
| 9 | BUFFER SORT | | | | | |
| 10 | CONNECT BY PUMP | | | | | |
| 11 | TABLE ACCESS BY INDEX ROWID| T | 40 | 1720 | 1 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | IND_T_FID | 16 | | 1 (0)| 00:00:01 |
|* 13 | TABLE ACCESS FULL | T | 40 | 1720 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."ID"="T1"."TID")
3 - access("T1"."USERNAME"='YANGTK')
5 - filter("FID"=0)
7 - access("FID"=0)
12 - access("FID"=NULL)
13 - access("FID"=NULL)

Note
-----
- dynamic sampling used for this statement

兩種方法選擇那種,就需要根據具體情況進行分析了。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/4227/viewspace-69444/,如需轉載,請註明出處,否則將追究法律責任。

相關文章