對IN & EXISTS NOT IN & NOT EXISTS的優化
在平時工作中,IN & EXISTS NOT IN & NOT EXISTS是使用頻率比較高的SQL語句,
所以對它們的優化工作是很有必要的
測試環境:Oracle 9.2.0.1 for Windows2000
1、IN 和 EXISTS
IN和EXISTS的處理流程是不一樣的:
IN的執行流程
select * from T1 where x in ( select y from T2 )
可以理解為:
select * from t1, ( select distinct y from t2 ) t2 where t1.x = t2.y;
EXISTS的執行流程
select * from t1 where exists ( select null from t2 where y = x )
可以理解為:
for x in ( select * from t1 )
loop
if ( exists ( select null from t2 where y = x.x )
then
OUTPUT THE RECORD
end if
end loop
建立測試用例表big(4000 row)和small(400 row)
create table big as select * from dba_objects where rownum <= 10000;
insert into big select * from big;
insert into big select * from big;
commit;
create table small as select * from dba_objects where rownum <= 400;
當內層表為small,外層表為big時,兩種語法的查詢如下:
SQL> select count(1) from big a where a.object_id in
(select b.object_id from sall b);
COUNT(1)
----------
1600
已用時間: 00: 00: 00.56
Execution Plan
-----------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 MERGE JOIN
3 2 SORT (JOIN)
4 3 TABLE ACCESS (FULL) OF 'BIG'
5 2 SORT (JOIN)
6 5 VIEW OF 'VW_NSO_1'
7 6 SORT (UNIQUE)
8 7 TABLE ACCESS (FULL) OF 'SMALL'
Statistics
-----------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
SQL> select count(1) from big a where exists
(select 1 from small b where a.object_id=b.object_id);
COUNT(1)
----------
1600
已用時間: 00: 00: 03.10
Execution Plan
-----------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG'
4 2 TABLE ACCESS (FULL) OF 'SMALL'
Statistics
-----------------------------------------------------
0 recursive calls
0 db block gets
312157 consistent gets
0 physical reads
當內層表為big,外層表為small時,兩種語法的查詢如下:
SQL> select count(1) from small a where a.object_id in
(select b.object_id from big b);
COUNT(1)
----------
400
已用時間: 00: 00: 00.56
Execution Plan
-----------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 MERGE JOIN
3 2 SORT (JOIN)
4 3 TABLE ACCESS (FULL) OF 'SMALL'
5 2 SORT (JOIN)
6 5 VIEW OF 'VW_NSO_1'
7 6 SORT (UNIQUE)
8 7 TABLE ACCESS (FULL) OF 'BIG'
Statistics
-----------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
SQL> select count(1) from small a where exists
(select null from big b where a.bject_id=b.object_id);
COUNT(1)
----------
400
已用時間: 00: 00: 00.25
Execution Plan
-----------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL'
4 2 TABLE ACCESS (FULL) OF 'BIG'
Statistics
-----------------------------------------------------
0 recursive calls
0 db block gets
2562 consistent gets
0 physical reads
在對錶big、small進行分析後,發現CBO下兩種語法的執行計劃是一樣的,都使用hash連線或者hash半連線
SQL> analyze table big compute statistics;
SQL> analyze table small compute statistics;
SQL> select count(1) from big a where a.object_id in
(select b.object_id from small b);
COUNT(1)
----------
1600
已用時間: 00: 00: 00.09
Execution Plan
-------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=58
Card=1 Bytes=8)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (Cost=58 Card=1600 Bytes=12800)
3 2 SORT (UNIQUE)
4 3 TABLE ACCESS (FULL) OF 'SMALL' (Cost=2 Card=400 Bytes=1600)
5 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=53 Card=40000 Bytes=160000)
Statistics
-------------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
SQL> select count(1) from big a where exists
(select 1 from small b where a.object_id=b.object_id);
COUNT(1)
----------
1600
已用時間: 00: 00: 00.09
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=58 Card=1 Bytes=8)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (Cost=58 Card=1600 Bytes=12800)
3 2 SORT (UNIQUE)
4 3 TABLE ACCESS (FULL) OF 'SMALL' (Cost=2 Card=400 Bytes=1600)
5 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=53 Card=40000 Bytes=160000)
Statistics
--------------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
SQL> select count(1) from small a where a.object_id in
(select b.object_id from big b);
COUNT(1)
----------
400
已用時間: 00: 00: 00.09
Execution Plan
------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=56 Card=1 Bytes=8)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (SEMI) (Cost=56 Card=400 Bytes=3200)
3 2 TABLE ACCESS (FULL) OF 'SMALL' (Cost=2 Card=400 Bytes=1600)
4 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=53 Card=40000 Bytes=160000)
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
SQL> select count(1) from small a where exists
(select 1 from big b where a.object_id=b.object_id);
COUNT(1)
----------
400
已用時間: 00: 00: 00.09
Execution Plan
-------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=56 Card=1 Bytes=8)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (SEMI) (Cost=56 Card=400 Bytes=3200)
3 2 TABLE ACCESS (FULL) OF 'SMALL' (Cost=2 Card=400 Bytes=1600)
4 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=53 Card=40000 Bytes=160000)
Statistics
-------------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
刪除表分析,使用提示/*+ use_hash(a,b) */ 或者 /*+ hash_sj */,
兩種語法都可以達到CBO的執行計劃
SQL> analyze table big delete statistics;
SQL> analyze table small delete statistics;
SQL> select /*+ use_hash(a,b) */count(1) from big a where a.object_id in
(select b.object_id from small b);
SQL> select /*+ use_hash(a,b) */count(1) from big a where exists
(select 1 from small b where a.object_id=b.object_id);
SQL> select count(1) from small a where a.object_id in
(select /*+ hash_sj */ b.object_id from big b);
SQL> select count(1) from small a where exists
(select /*+ hash_sj */ 1 from big b where a.object_id=b.object_id);
下表列出了各種情況下的速度情況:
┌───────────┬──────────────────────────┬─────────────────────────┬─────────────┐
│ │ outer big,inner small │ outer small,inner big │ table rows │
├───────────┼──────────┬───────────────┼──────────┬──────────────┼─────────────┤
│ │ IN SQL │ EXISTS SQL │ IN SQL │ EXISTS SQL │ │
├───────────┼──────────┼───────────────┼──────────┼──────────────┼─────────────┤
│un-analyze │ 0.56s │ 3.10s │ 0.56s │ 0.25s │ big=40000 │
├───────────┼──────────┼───────────────┼──────────┼──────────────┤ and │
│ analyzed │ 0.09s │ 0.09s │ 0.09s │ 0.09s │ small=400 │
├───────────┼──────────┼───────────────┼──────────┼──────────────┼─────────────┤
│un-analyze │ 0.72s │ 3.53s │ 0.25s │ 2.97s │ big=5000 │
├───────────┼──────────┼───────────────┼──────────┼──────────────┤ and │
│ analyzed │ 0.09s │ 0.09s │ 0.09s │ 0.09s │ small=4000 │
└───────────┴──────────┴───────────────┴──────────┴──────────────┴─────────────┘
結論:在未對錶進行分析前,若兩個表資料量差異很大,則外層表是大表時使用IN較快,
外層表是小表時使用EXISTS較快;若兩表資料量接近,則使用IN較快;
分析表後無論用IN還是EXISTS都變得更快,由於執行計劃一樣,所以速度一樣;
所以:無論使用IN還是EXISTS,只要使用雜湊連線,即提示/*+ use_hash(a,b) */,
或者在子句中雜湊半連線提示/*+ hash_sj */, 就使其達到最優速度;
附註:半連線的提示有hash_sj、merge_sj、nl_sj
***********************************************************************************************************************
***********************************************************************************************************************
2、NOT IN 和 NOT EXISTS
NOT EXISTS的執行流程
select .....
from rollup R
where not exists ( select 'Found' from title T
where R.source_id = T.Title_ID);
可以理解為:
for x in ( select * from rollup )
loop
if ( not exists ( that query ) ) then
OUTPUT
end if;
end;
注意:NOT EXISTS 與 NOT IN 不能完全互相替換,看具體的需求。如果選擇的列可以為空,則不能被替換。
對於not in 和 not exists的效能區別:
not in 只有當子查詢中,select 關鍵字後的欄位有not null約束或者有這種暗示時用not in,另外如果主查詢中表大,
子查詢中的表小但是記錄多,則應當使用not in,並使用anti hash join.
如果主查詢表中記錄少,子查詢表中記錄多,並有索引,可以使用not exists,另外not in最好也可以用/*+ HASH_AJ */
或者外連線+is null,NOT IN 在基於成本的應用中較好
建立測試用例表big(40000 row)和small(1000 row):
truncate table big;
truncate table small;
insert into big select * from dba_objects where rownum <=20000;
insert into big select * from dba_objects where rownum <=20000;
insert into small select * from dba_objects where rownum <=1000;
commit;
基本句型:
<1> not in
SQL> select count(1) from big a where a.object_id not in (select b.object_id from small b);
COUNT(1)
----------
38000
已用時間: 00: 00: 12.56
Execution Plan
-----------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG'
4 2 TABLE ACCESS (FULL) OF 'SMALL'
Statistics
--------------------------------
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/10742223/viewspace-343292/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- MySQL exists 優化 in 效率MySql優化
- IN&EXISTS與NOT IN&NOT EXISTS 的優化原則的討論優化
- 【MySQL】NOT EXISTS優化的一個案例MySql優化
- exists與in子查詢優化優化
- Kettle Table Exists控制元件優化控制元件優化
- EXISTS、IN、NOT EXISTS、NOT IN的區別(ZT)
- [Oracle] exists 和 not existsOracle
- EXISTS、IN、NOT EXISTS、NOT IN用法區別
- oracle之優化一用group by或exists優化distinctOracle優化
- in/exists和not in/not exists執行效率
- Laravel的unique和exists驗證規則的優化Laravel優化
- 使用exists(Semi-Join)優化distinct語句優化
- oracle中的exists 和not exists 用法詳解Oracle
- SQL中IN,NOT IN,EXISTS,NOT EXISTS的用法和差別SQL
- sql:delete if exists還是drop if exists?SQLdelete
- NOT IN、JOIN、IS NULL、NOT EXISTS效率對比Null
- 一個NOT EXISTS含有OR條件子查詢的優化優化
- 對線面試官:SQL中的IN與NOT IN、EXISTS與NOT EXISTS的區別及效能分析面試SQL
- In和exists使用及效能分析(二):exists的使用
- exists和not exists及in和not in的用法與區別
- oracle中的exists和not exists和in用法詳解Oracle
- oracle中關於in和exists,not in 和 not existsOracle
- Oracle學習系列—資料庫優化—In和Exists的使用Oracle資料庫優化
- 【SQL】existsSQL
- Oralce 使用SQL中的exists 和not exists 用法詳解SQL
- In和exists使用及效能分析(三):in和exists的效能分析
- List中對比Contains, Exists, Any之間的優缺點AI
- Oracle 'or exists/in'結合使用引起的filter執行計劃 的優化OracleFilter優化
- 理解exists count
- oracle exists and not existOracle
- Oracle優化案例-用left join代替反連線 not in not exists(十)Oracle優化
- in,exists和not exists ,not in與null的一些關係記載Null
- oracle中的exists理解Oracle
- SQL中EXISTS的使用SQL
- [精選] SQL中的IN與NOT IN、EXISTS與NOT EXISTS的區別及效能分析SQL
- Mysql優化原則_小表驅動大表IN和EXISTS的合理利用MySql優化
- MySQL之in與existsMySql
- in 和 exists區別