【sql調優之執行計劃】merge sort join

版本：10g 10.2.0.1

Sort merge join排序合併連線的原理是兩個連線的行集按照連線列先分別排序，然後再做連線。

Oracle使用merge join的傾向程度和sort_area_size和db_file_mutliblock_read_count兩個初始引數有關，因為需要排序，所以較大的sort_area_size設定會更有利於使用merge join。而多塊讀的設定，定義了oracle操作一次全表掃描或者索引範圍掃描返回的block數量，如果設定的太高，優化器將傾向於全表掃描，而太低則傾向於使用索引。在沒有索引的情況下，而兩個行集都接近有序，則使用merge join也可能提升效率。

看個例子：

SQL> select /*+ use_merge(a,b)*/

2 *

3 from scott.emp a, scott.dept b

4 where a.deptno = b.deptno

5 ;

已選擇15行。

執行計劃

----------------------------------------------------------

Plan hash value: 844388907

--------------------------------------------------------------------------------

--------

me |

--------------------------------------------------------------------------------

--------

| 0 | SELECT STATEMENT | | 15 | 855 | 6 (17)| 00

:00:01 |

| 1 | MERGE JOIN | | 15 | 855 | 6 (17)| 00

:00:01 |

| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00

:00:01 |

| 3 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00

:00:01 |

|* 4 | SORT JOIN | | 15 | 555 | 4 (25)| 00

:00:01 |

| 5 | TABLE ACCESS FULL | EMP | 15 | 555 | 3 (0)| 00

:00:01 |

--------------------------------------------------------------------------------

--------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - access("A"."DEPTNO"="B"."DEPTNO")

filter("A"."DEPTNO"="B"."DEPTNO")

統計資訊

----------------------------------------------------------

38 recursive calls

5 db block gets

12 consistent gets

2 physical reads

1012 redo size

1687 bytes sent via SQL*Net to client

385 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

1 sorts (memory)

0 sorts (disk)

15 rows processed

上述資訊可以看到dept使用了索引，而emp為全表，由於索引已經排序，故而後面的1 sort(memory)即一次記憶體排序完成了對emp表按連線列deptno的排序。

下面稍稍修改一下上述sql，使得兩個表都使用全表掃描，則可以看到，進行了2次sort join,後面的2 sorts (memory)兩次記憶體排序也說明了這點，

例如：

SQL>

SQL> select /*+ no_index(a) no_index(b) use_merge(a,b)*/

2 *

3 from scott.emp a, scott.dept b

4 where a.deptno = b.deptno

5 ;

已選擇15行。

執行計劃

----------------------------------------------------------

Plan hash value: 1407029907

----------------------------------------------------------------------------

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 15 | 855 | 8 (25)| 00:00:01 |

| 1 | MERGE JOIN | | 15 | 855 | 8 (25)| 00:00:01 |

| 2 | SORT JOIN | | 4 | 80 | 4 (25)| 00:00:01 |

| 3 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 |

|* 4 | SORT JOIN | | 15 | 555 | 4 (25)| 00:00:01 |

| 5 | TABLE ACCESS FULL| EMP | 15 | 555 | 3 (0)| 00:00:01 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - access("A"."DEPTNO"="B"."DEPTNO")

filter("A"."DEPTNO"="B"."DEPTNO")

統計資訊

----------------------------------------------------------

21 recursive calls

5 db block gets

14 consistent gets

5 physical reads

1076 redo size

1687 bytes sent via SQL*Net to client

385 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

15 rows processed

SQL>

如果sort_area_size不足以完成排序操作，則將藉助temp表空間來完成排序，即為磁碟排序，大量的磁碟排序是merge join效能下降的一個主要原因。

看一個例子：

SQL> create table t_policy_merge as

2 select * from t_policy a where rownum <1000;

表已建立。

create table t_policy_merge_big as

select * from t_policy union all

select * from t_policy ;

insert into t_policy_merge_big select* from t_policy_merge_big ;

commit;

SQL> select /*+ use_merge(a,b)*/

2 *

3 from t_policy_merge_big a, t_policy_merge b

4 where a.policy_id = b.policy_id;

已選擇9990行。

執行計劃

----------------------------------------------------------

Plan hash value: 2074009417

--------------------------------------------------------------------------------

------------------

(%CPU)| Time |

--------------------------------------------------------------------------------

------------------

| 0 | SELECT STATEMENT | | 3654 | 65M| | 198

K (1)| 00:39:38 |

| 1 | MERGE JOIN | | 3654 | 65M| | 198

K (1)| 00:39:38 |

| 2 | SORT JOIN | | 1000 | 9149K| 15M| 1971

(1)| 00:00:24 |

| 3 | TABLE ACCESS FULL| T_POLICY_MERGE | 1000 | 9149K| | 16

(7)| 00:00:01 |

|* 4 | SORT JOIN | | 99803 | 891M| 1559M| 196

K (1)| 00:39:15 |

| 5 | TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 | 891M| | 1321

(6)| 00:00:16 |

--------------------------------------------------------------------------------

------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - access("A"."POLICY_ID"="B"."POLICY_ID")

filter("A"."POLICY_ID"="B"."POLICY_ID")

Note

-----

- dynamic sampling used for this statement

統計資訊

----------------------------------------------------------

60 recursive calls

226 db block gets

5776 consistent gets

10271 physical reads

132 redo size

744643 bytes sent via SQL*Net to client

7700 bytes received via SQL*Net from client

667 SQL*Net roundtrips to/from client

1 sorts (memory)

1 sorts (disk)

9990 rows processed

SQL>

如果使用hash join則會將小表T_POLICY_MERGE hash，從而避免了對大表的磁碟排序，從下面的黑體的db_block_gets, physical reads和sorts (disk)可以看到這個差別。

SQL> select

2 *

3 from t_policy_merge_big a, t_policy_merge b

4 where a.policy_id = b.policy_id;

已選擇9990行。

執行計劃

----------------------------------------------------------

Plan hash value: 3615210701

--------------------------------------------------------------------------------

-----------------

%CPU)| Time |

--------------------------------------------------------------------------------

-----------------

| 0 | SELECT STATEMENT | | 3654 | 65M| | 46107

(1)| 00:09:14 |

|* 1 | HASH JOIN | | 3654 | 65M| 9168K| 46107

(1)| 00:09:14 |

| 2 | TABLE ACCESS FULL| T_POLICY_MERGE | 1000 | 9149K| | 16

(7)| 00:00:01 |

| 3 | TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 | 891M| | 1321

(6)| 00:00:16 |

--------------------------------------------------------------------------------

-----------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("A"."POLICY_ID"="B"."POLICY_ID")

Note

-----

- dynamic sampling used for this statement

統計資訊

----------------------------------------------------------

7 recursive calls

0 db block gets

6491 consistent gets

5724 physical reads

0 redo size

2287605 bytes sent via SQL*Net to client

7700 bytes received via SQL*Net from client

667 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

9990 rows processed

SQL>

注意到前面有些redo的情況，特別是scott下表的測試，做個10046

SQL> alter session set events '10046 trace name context forever, level 8';

會話已更改。

檢視了trace內容後發現：

insert into sys.aud$( sessionid,entryid,statement,ntimestamp#, userid,

userhost,terminal,action#,returncode, obj$creator,obj$name,auth$privileges,

auth$grantee, new$owner,new$name,ses$actions,ses$tid,logoff$pread,

logoff$lwrite,logoff$dead,comment$text,spare1,spare2, priv$used,clientid,

sessioncpu,proxy$sid,user$guid, instance#,process#,xid,scn,auditid,sqlbind,

sqltext)

values

(:1,:2,:3,SYS_EXTRACT_UTC(SYSTIMESTAMP), :4,:5,:6,:7,:8, :9,:10,:11,

:12, :13,:14,:15,:16,:17, :18,:19,:20,:21,:22, :23,:24,:25,:26,

:27, :28,:29,:30,:31,:32,:33,:34)

insert into plan_table (statement_id, timestamp, operation, options,

object_node, object_owner, object_name, object_instance, object_type,

search_columns, id, parent_id, position, other,optimizer, cost, cardinality,

bytes, other_tag, partition_start, partition_stop, partition_id,

distribution, cpu_cost, io_cost, temp_space, access_predicates,

filter_predicates, projection, time, qblock_name, object_alias, plan_id,

depth, remarks, other_xml )

values

(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19,:20,:21,

:22,:23,:24,:25,:26,:27,:28,:29,:30,:31,:32,:33,:34,:35,:36)

sys.aud$表以及plan_table有insert操作，是由於scott上開了審計功能。

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/16179598/viewspace-671104/，如需轉載，請註明出處，否則將追究法律責任。

【sql調優之執行計劃】merge sort join

相關文章