【sql調優之執行計劃】merge sort join

yellowlee發表於2010-08-16

版本:10g 10.2.0.1

Sort merge join排序合併連線的原理是兩個連線的行集按照連線列先分別排序,然後再做連線。

Oracle使用merge join的傾向程度和sort_area_sizedb_file_mutliblock_read_count兩個初始引數有關,因為需要排序,所以較大的sort_area_size設定會更有利於使用merge join。而多塊讀的設定,定義了oracle操作一次全表掃描或者索引範圍掃描返回的block數量,如果設定的太高,優化器將傾向於全表掃描,而太低則傾向於使用索引。在沒有索引的情況下,而兩個行集都接近有序,則使用merge join也可能提升效率。

 

看個例子:

SQL> select /*+ use_merge(a,b)*/

  2   *

  3    from scott.emp a, scott.dept b

  4   where a.deptno = b.deptno

  5  ;

 

已選擇15行。

 

 

執行計劃

----------------------------------------------------------

Plan hash value: 844388907

 

--------------------------------------------------------------------------------

 

--------

 

| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Ti

 

me     |

 

--------------------------------------------------------------------------------

 

--------

 

|   0 | SELECT STATEMENT             |         |    15 |   855 |     6  (17)| 00

 

:00:01 |

 

|   1 |  MERGE JOIN                  |         |    15 |   855 |     6  (17)| 00

 

:00:01 |

 

|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    80 |     2   (0)| 00

 

:00:01 |

 

|   3 |    INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00

 

:00:01 |

 

|*  4 |   SORT JOIN                  |         |    15 |   555 |     4  (25)| 00

 

:00:01 |

 

|   5 |    TABLE ACCESS FULL         | EMP     |    15 |   555 |     3   (0)| 00

 

:00:01 |

 

--------------------------------------------------------------------------------

 

--------

 

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   4 - access("A"."DEPTNO"="B"."DEPTNO")

       filter("A"."DEPTNO"="B"."DEPTNO")

 

 

統計資訊

----------------------------------------------------------

         38  recursive calls

          5  db block gets

         12  consistent gets

          2  physical reads

       1012  redo size

       1687  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

         15  rows processed

 

上述資訊可以看到dept使用了索引,而emp為全表,由於索引已經排序,故而後面的1 sort(memory)即一次記憶體排序完成了對emp表按連線列deptno的排序。

下面稍稍修改一下上述sql,使得兩個表都使用全表掃描,則可以看到,進行了2sort join,後面的2  sorts (memory)兩次記憶體排序也說明了這點,

例如:

SQL>

SQL> select /*+ no_index(a) no_index(b) use_merge(a,b)*/

  2   *

  3    from scott.emp a, scott.dept b

  4   where a.deptno = b.deptno

  5  ;

 

已選擇15行。

 

 

執行計劃

----------------------------------------------------------

Plan hash value: 1407029907

 

----------------------------------------------------------------------------

| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |      |    15 |   855 |     8  (25)| 00:00:01 |

|   1 |  MERGE JOIN         |      |    15 |   855 |     8  (25)| 00:00:01 |

|   2 |   SORT JOIN         |      |     4 |    80 |     4  (25)| 00:00:01 |

|   3 |    TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |

|*  4 |   SORT JOIN         |      |    15 |   555 |     4  (25)| 00:00:01 |

|   5 |    TABLE ACCESS FULL| EMP  |    15 |   555 |     3   (0)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   4 - access("A"."DEPTNO"="B"."DEPTNO")

       filter("A"."DEPTNO"="B"."DEPTNO")

 

 

統計資訊

----------------------------------------------------------

         21  recursive calls

          5  db block gets

         14  consistent gets

          5  physical reads

       1076  redo size

       1687  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          2  sorts (memory)

          0  sorts (disk)

         15  rows processed

 

SQL>

 

如果sort_area_size不足以完成排序操作,則將藉助temp表空間來完成排序,即為磁碟排序,大量的磁碟排序是merge join效能下降的一個主要原因。

看一個例子:

SQL> create table t_policy_merge as

  2  select * from t_policy a where rownum <1000;

 

表已建立。

 

create table t_policy_merge_big as

select * from t_policy union all

select * from t_policy union all

select * from t_policy union all

select * from t_policy union all

select * from t_policy ;

 

insert into t_policy_merge_big select* from t_policy_merge_big ;

commit;

 

SQL> select /*+ use_merge(a,b)*/

  2   *

  3    from t_policy_merge_big a, t_policy_merge b

  4   where a.policy_id = b.policy_id;

 

已選擇9990行。

 

 

執行計劃

----------------------------------------------------------

Plan hash value: 2074009417

 

--------------------------------------------------------------------------------

 

------------------

 

| Id  | Operation           | Name               | Rows  | Bytes |TempSpc| Cost

(%CPU)| Time     |

 

--------------------------------------------------------------------------------

 

------------------

 

|   0 | SELECT STATEMENT    |                    |  3654 |    65M|       |   198

 

K  (1)| 00:39:38 |

 

|   1 |  MERGE JOIN         |                    |  3654 |    65M|       |   198

 

K  (1)| 00:39:38 |

 

|   2 |   SORT JOIN         |                    |  1000 |  9149K|    15M|  1971

 

   (1)| 00:00:24 |

 

|   3 |    TABLE ACCESS FULL| T_POLICY_MERGE     |  1000 |  9149K|       |    16

 

   (7)| 00:00:01 |

 

|*  4 |   SORT JOIN         |                    | 99803 |   891M|  1559M|   196

 

K  (1)| 00:39:15 |

 

|   5 |    TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 |   891M|       |  1321

 

   (6)| 00:00:16 |

 

--------------------------------------------------------------------------------

 

------------------

 

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   4 - access("A"."POLICY_ID"="B"."POLICY_ID")

       filter("A"."POLICY_ID"="B"."POLICY_ID")

 

Note

-----

   - dynamic sampling used for this statement

 

 

統計資訊

----------------------------------------------------------

         60  recursive calls

        226  db block gets

       5776  consistent gets

      10271  physical reads

        132  redo size

     744643  bytes sent via SQL*Net to client

       7700  bytes received via SQL*Net from client

        667  SQL*Net roundtrips to/from client

          1  sorts (memory)

          1  sorts (disk)

       9990  rows processed

 

SQL>

如果使用hash join則會將小表T_POLICY_MERGE hash,從而避免了對大表的磁碟排序,從下面的黑體的db_block_gets, physical readssorts (disk)可以看到這個差別。

SQL> select

  2   *

  3    from t_policy_merge_big a, t_policy_merge b

  4   where a.policy_id = b.policy_id;

 

已選擇9990行。

 

 

執行計劃

----------------------------------------------------------

Plan hash value: 3615210701

 

--------------------------------------------------------------------------------

 

-----------------

 

| Id  | Operation          | Name               | Rows  | Bytes |TempSpc| Cost (

 

%CPU)| Time     |

 

--------------------------------------------------------------------------------

 

-----------------

 

|   0 | SELECT STATEMENT   |                    |  3654 |    65M|       | 46107

  (1)| 00:09:14 |

 

|*  1 |  HASH JOIN         |                    |  3654 |    65M|  9168K| 46107

  (1)| 00:09:14 |

 

|   2 |   TABLE ACCESS FULL| T_POLICY_MERGE     |  1000 |  9149K|       |    16

  (7)| 00:00:01 |

 

|   3 |   TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 |   891M|       |  1321

  (6)| 00:00:16 |

 

--------------------------------------------------------------------------------

 

-----------------

 

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   1 - access("A"."POLICY_ID"="B"."POLICY_ID")

 

Note

-----

   - dynamic sampling used for this statement

 

 

統計資訊

----------------------------------------------------------

          7  recursive calls

          0  db block gets

       6491  consistent gets

       5724  physical reads

          0  redo size

    2287605  bytes sent via SQL*Net to client

       7700  bytes received via SQL*Net from client

        667  SQL*Net roundtrips to/from client

          2  sorts (memory)

          0  sorts (disk)

       9990  rows processed

 

SQL>

 

注意到前面有些redo的情況,特別是scott下表的測試,做個10046

SQL> alter session set events '10046 trace name context forever, level 8';

 

會話已更改。

 

檢視了trace內容後發現:

insert into sys.aud$( sessionid,entryid,statement,ntimestamp#, userid,

  userhost,terminal,action#,returncode, obj$creator,obj$name,auth$privileges,

  auth$grantee, new$owner,new$name,ses$actions,ses$tid,logoff$pread,

  logoff$lwrite,logoff$dead,comment$text,spare1,spare2,  priv$used,clientid,

  sessioncpu,proxy$sid,user$guid, instance#,process#,xid,scn,auditid,sqlbind,

  sqltext)

values

(:1,:2,:3,SYS_EXTRACT_UTC(SYSTIMESTAMP),     :4,:5,:6,:7,:8,     :9,:10,:11,

  :12,     :13,:14,:15,:16,:17,     :18,:19,:20,:21,:22,     :23,:24,:25,:26,

  :27,     :28,:29,:30,:31,:32,:33,:34)

insert into plan_table (statement_id, timestamp, operation, options,

  object_node, object_owner, object_name, object_instance, object_type,

  search_columns, id, parent_id, position, other,optimizer, cost, cardinality,

   bytes, other_tag, partition_start, partition_stop, partition_id,

  distribution, cpu_cost, io_cost, temp_space, access_predicates,

  filter_predicates, projection, time, qblock_name, object_alias, plan_id,

  depth, remarks, other_xml )

values

(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19,:20,:21,

  :22,:23,:24,:25,:26,:27,:28,:29,:30,:31,:32,:33,:34,:35,:36)

 

sys.aud$表以及plan_tableinsert操作,是由於scott上開了審計功能。

 

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/16179598/viewspace-671104/,如需轉載,請註明出處,否則將追究法律責任。

相關文章