use_nl,use_hash,use_merge的3種連線方式驅動與被驅動關係

hd_system發表於2016-11-09

雜湊連線(HASH JOIN)是一種兩個表在做表連線時主要依靠雜湊運算來得到連線結果集的表連線方法。

對於排序合併連線,如果兩個表在施加了目標SQL中指定的謂詞條件後得到的結果集很大而且需要排序,則排序合併連線的執行效率一定不高;而對於巢狀迴圈連線,如果驅動表所對應的驅動結果集的記錄數很大,即便在被驅動表的連線列上存在索引,此時使用巢狀迴圈連線的執行效率也會同樣不高。為了解決這個問題,於是引進了雜湊連線。在ORACLE 10g及其以後的版本中,最佳化器 (實際上是CBO,因為雜湊連線僅適用於CBO)在解析目標SQL的時候是否考慮雜湊連線受限於隱含引數_HASH_JOIN_ENABLED,預設值是TRUE.

對於雜湊連線的優缺點及適用場景如下:

a,雜湊連線不一定會排序,或者說大多數情況下都不需要排序

b,雜湊連線的驅動表所對應的連線列的選擇性儘可能好。

c,雜湊只能用於CBO,而且只能用於等值連線的條件。(即使是雜湊反連線,ORACLE實際上也是將其換成等值連線)。

c,雜湊連線很適用小表和大表之間做連線且連線結果集的記錄數較多的情形,特別是小表的選擇性非常好的情況下,這個時候雜湊連線的執行時間就可以近似看做和全表掃描個個大表的費用時間相當。

e,當兩個雜湊連線的時候,如果在施加了目標SQL中指定的謂詞條件後得到的資料量較小的那個結果集所對應的HASH TABLE能夠完全被容納在記憶體中(PGA的工作區),此時的雜湊連線的執行效率非常高。


oracle表之間的連線之雜湊連線(Hash Join),其特點如下:

1,驅動表和被驅動表都是最多隻被訪問一次。

2,雜湊連線的表有驅動順序。

3,雜湊表連線的表無需要排序,但是他在做連線之前做雜湊運算的時候,會用到HASH_AREA_SIZE來建立雜湊表。

4,雜湊連線不適用於的連線條件是:不等於<>,大於>,小於<,小於等於<=,大於等於>=,like。

5,雜湊連線索引列在表連線中無特殊要求,與單表情況無異。

************************************************************************************************
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for : Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

點選(此處)摺疊或開啟

  1. --use_nl,use_hash,use_merge的3種連線方式驅動與被驅動關係
  2.       
  3.     一.HASH連線方式
  4.       
  5.     --建一個小表和一個大一些的表
  6.     SQL> create table t1 as select * from dba_objects where rownum<11;
  7.       
  8.     Table created.
  9.       
  10.     SQL> create table t2 as select * from dba_objects;
  11.       
  12.     Table created.
  13.       
  14.     SQL> select count(1) from t1;
  15.       
  16.       COUNT(1)
  17.     ----------
  18.             10
  19.       
  20.     SQL> select count(1) from t2;
  21.       
  22.       COUNT(1)
  23.     ----------
  24.          75211
  25.           
  26.     --試驗HASH,可以看出hash是存在驅動與被驅動關係,驅動表適合於兩表間較小的一個,耗用更少的排序開銷;
  27.     SQL> select /*+ leading(t2) use_hash(t2,t1) */t1.status,t2.status
  28.       2 from t1,t2
  29.       3 where t1.object_id=t2.object_id;
  30.       
  31.     STATUS STATUS
  32.     ------- -------
  33.     VALID VALID
  34.     VALID VALID
  35.     VALID VALID
  36.     VALID VALID
  37.     VALID VALID
  38.     VALID VALID
  39.     VALID VALID
  40.     VALID VALID
  41.     VALID VALID
  42.     VALID VALID
  43.       
  44.     10 rows selected.
  45.       
  46.     --看到排序區中排序的開銷為3439K;
  47.     SQL> @allstat
  48.       
  49.     PLAN_TABLE_OUTPUT
  50.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  51.     SQL_ID 315bd4pgdyt86, child number 1
  52.     -------------------------------------
  53.     select /*+ leading(t2) use_hash(t2,t1) */t1.status,t2.status from t1,t2
  54.     where t1.object_id=t2.object_id
  55.       
  56.     Plan hash value: 2959412835
  57.       
  58.     -------------------------------------------------------------------------------------------------------------------------
  59.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
  60.     -------------------------------------------------------------------------------------------------------------------------
  61.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.08 | 1078 | 1071 | | | |
  62.     |* 1 | HASH JOIN | | 1 | 10 | 10 |00:00:00.08 | 1078 | 1071 | 3439K| 1573K| 5617K (0)|
  63.     | 2 | TABLE ACCESS FULL| T2 | 1 | 79628 | 75211 |00:00:00.02 | 1074 | 1071 | | | |
  64.     | 3 | TABLE ACCESS FULL| T1 | 1 | 10 | 10 |00:00:00.01 | 4 | 0 | | | |
  65.     -------------------------------------------------------------------------------------------------------------------------
  66.       
  67.     Predicate Information (identified by operation id):
  68.     ---------------------------------------------------
  69.       
  70.        1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  71.       
  72.     Note
  73.     -----
  74.        - dynamic sampling used for this statement (level=2)
  75.       
  76.       
  77.     25 rows selected.
  78.       
  79.     --而T1表為驅動表時,排序的開銷為1206K比上面的3439K小得多;
  80.     SQL> select /*+ leading(t1) use_hash(t2,t1) */t1.status,t2.status
  81.       2 from t1,t2
  82.       3 where t1.object_id=t2.object_id;
  83.       
  84.     STATUS STATUS
  85.     ------- -------
  86.     VALID VALID
  87.     VALID VALID
  88.     VALID VALID
  89.     VALID VALID
  90.     VALID VALID
  91.     VALID VALID
  92.     VALID VALID
  93.     VALID VALID
  94.     VALID VALID
  95.     VALID VALID
  96.       
  97.     10 rows selected.
  98.       
  99.     SQL> @allstat
  100.       
  101.     PLAN_TABLE_OUTPUT
  102.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  103.     SQL_ID 3zc11bg9gdq31, child number 1
  104.     -------------------------------------
  105.     select /*+ leading(t1) use_hash(t2,t1) */t1.status,t2.status from t1,t2
  106.     where t1.object_id=t2.object_id
  107.       
  108.     Plan hash value: 1838229974
  109.       
  110.     -------------------------------------------------------------------------------------------------------------------------
  111.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
  112.     -------------------------------------------------------------------------------------------------------------------------
  113.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.08 | 1078 | 1071 | | | |
  114.     |* 1 | HASH JOIN | | 1 | 10 | 10 |00:00:00.08 | 1078 | 1071 | 1206K| 1206K| 1126K (0)|
  115.     | 2 | TABLE ACCESS FULL| T1 | 1 | 10 | 10 |00:00:00.01 | 3 | 0 | | | |
  116.     | 3 | TABLE ACCESS FULL| T2 | 1 | 79628 | 75211 |00:00:00.03 | 1075 | 1071 | | | |
  117.     -------------------------------------------------------------------------------------------------------------------------
  118.       
  119.     Predicate Information (identified by operation id):
  120.     ---------------------------------------------------
  121.       
  122.        1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  123.       
  124.     Note
  125.     -----
  126.        - dynamic sampling used for this statement (level=2)
  127.       
  128.       
  129.     25 rows selected.
  130.       
  131.     --而ORACLE在預估COST時第一類的COST也是大於第2類,也就是預設O是會採用第2種方式;
  132.     --第1類的COST為419,而第2類的COST預估為305;
  133.     SQL> set auto traceonly
  134.     SP2-0158: unknown SET autocommit option "traceonly"
  135.     Usage: SET AUTO[COMMIT] { OFF | ON | IMM[EDIATE] | n }
  136.     SQL> set autot traceonly exp
  137.     SQL>
  138.     SQL> select /*+ leading(t2) use_hash(t2,t1) */t1.status,t2.status
  139.       2 from t1,t2
  140.       3 where t1.object_id=t2.object_id;
  141.       
  142.     Execution Plan
  143.     ----------------------------------------------------------
  144.     Plan hash value: 2959412835
  145.       
  146.     -----------------------------------------------------------------------------------
  147.     | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
  148.     -----------------------------------------------------------------------------------
  149.     | 0 | SELECT STATEMENT | | 10 | 360 | | 419 (1)| 00:00:06 |
  150.     |* 1 | HASH JOIN | | 10 | 360 | 2336K| 419 (1)| 00:00:06 |
  151.     | 2 | TABLE ACCESS FULL| T2 | 79628 | 1399K| | 301 (1)| 00:00:04 |
  152.     | 3 | TABLE ACCESS FULL| T1 | 10 | 180 | | 3 (0)| 00:00:01 |
  153.     -----------------------------------------------------------------------------------
  154.       
  155.     Predicate Information (identified by operation id):
  156.     ---------------------------------------------------
  157.       
  158.        1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  159.       
  160.     Note
  161.     -----
  162.        - dynamic sampling used for this statement (level=2)
  163.       
  164.     SQL> select /*+ leading(t1) use_hash(t2,t1) */t1.status,t2.status
  165.       2 from t1,t2
  166.       3 where t1.object_id=t2.object_id;
  167.       
  168.     Execution Plan
  169.     ----------------------------------------------------------
  170.     Plan hash value: 1838229974
  171.       
  172.     ---------------------------------------------------------------------------
  173.     | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  174.     ---------------------------------------------------------------------------
  175.     | 0 | SELECT STATEMENT | | 10 | 360 | 305 (1)| 00:00:04 |
  176.     |* 1 | HASH JOIN | | 10 | 360 | 305 (1)| 00:00:04 |
  177.     | 2 | TABLE ACCESS FULL| T1 | 10 | 180 | 3 (0)| 00:00:01 |
  178.     | 3 | TABLE ACCESS FULL| T2 | 79628 | 1399K| 301 (1)| 00:00:04 |
  179.     ---------------------------------------------------------------------------
  180.       
  181.     Predicate Information (identified by operation id):
  182.     ---------------------------------------------------
  183.       
  184.        1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  185.       
  186.     Note
  187.     -----
  188.        - dynamic sampling used for this statement (level=2)
  189.          
  190.     --HASH連線方式預設當然是採用後一種COST較小那類;
  191.     SQL> select t1.status,t2.status
  192.       2 from t1,t2
  193.       3 where t1.object_id=t2.object_id;
  194.       
  195.     Execution Plan
  196.     ----------------------------------------------------------
  197.     Plan hash value: 1838229974
  198.       
  199.     ---------------------------------------------------------------------------
  200.     | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  201.     ---------------------------------------------------------------------------
  202.     | 0 | SELECT STATEMENT | | 10 | 360 | 305 (1)| 00:00:04 |
  203.     |* 1 | HASH JOIN | | 10 | 360 | 305 (1)| 00:00:04 |
  204.     | 2 | TABLE ACCESS FULL| T1 | 10 | 180 | 3 (0)| 00:00:01 |
  205.     | 3 | TABLE ACCESS FULL| T2 | 79628 | 1399K| 301 (1)| 00:00:04 |
  206.     ---------------------------------------------------------------------------
  207.       
  208.     Predicate Information (identified by operation id):
  209.     ---------------------------------------------------
  210.       
  211.        1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  212.       
  213.     Note
  214.     -----
  215.        - dynamic sampling used for this statement (level=2)
  216.       
  217.          
  218.     二.排序合併連線方式
  219.     --可以看到無論是哪種表排在前面先執行,其開銷都是一樣的;也就是說明merge方式是不存在驅動與被驅動的區別關係 ;
  220.     PLAN_TABLE_OUTPUT
  221.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  222.     SQL_ID 2yw072zjb9h5b, child number 1
  223.     -------------------------------------
  224.     select /*+ leading(t1) use_merge(t2,t1) */t1.status,t2.status from
  225.     t1,t2 where t1.object_id=t2.object_id
  226.       
  227.     Plan hash value: 412793182
  228.       
  229.     --------------------------------------------------------------------------------------------------------------------------
  230.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
  231.     --------------------------------------------------------------------------------------------------------------------------
  232.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.06 | 1077 | 1071 | | | |
  233.     | 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.06 | 1077 | 1071 | | | |
  234.     | 2 | SORT JOIN | | 1 | 10 | 10 |00:00:00.01 | 3 | 0 | 2048 | 2048 | 2048 (0)|
  235.     | 3 | TABLE ACCESS FULL| T1 | 1 | 10 | 10 |00:00:00.01 | 3 | 0 | | | |
  236.     |* 4 | SORT JOIN | | 10 | 79628 | 10 |00:00:00.06 | 1074 | 1071 | 2675K| 740K| 2377K (0)|
  237.     | 5 | TABLE ACCESS FULL| T2 | 1 | 79628 | 75211 |00:00:00.02 | 1074 | 1071 | | | |
  238.     --------------------------------------------------------------------------------------------------------------------------
  239.       
  240.     Predicate Information (identified by operation id):
  241.     ---------------------------------------------------
  242.       
  243.        4 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  244.            filter("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  245.       
  246.     Note
  247.     -----
  248.        - dynamic sampling used for this statement (level=2)
  249.       
  250.       
  251.     28 rows selected.
  252.       
  253.     PLAN_TABLE_OUTPUT
  254.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  255.     SQL_ID csjsgn0hh4dr6, child number 1
  256.     -------------------------------------
  257.     select /*+ leading(t2) use_merge(t2,t1) */t1.status,t2.status from
  258.     t1,t2 where t1.object_id=t2.object_id
  259.       
  260.     Plan hash value: 1792967693
  261.       
  262.     --------------------------------------------------------------------------------------------------------------------------
  263.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
  264.     --------------------------------------------------------------------------------------------------------------------------
  265.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.09 | 1077 | 1071 | | | |
  266.     | 1 | MERGE JOIN | | 1 | 10 | 10 |00:00:00.09 | 1077 | 1071 | | | |
  267.     | 2 | SORT JOIN | | 1 | 79628 | 54 |00:00:00.09 | 1074 | 1071 | 2675K| 740K| 2377K (0)|
  268.     | 3 | TABLE ACCESS FULL| T2 | 1 | 79628 | 75211 |00:00:00.02 | 1074 | 1071 | | | |
  269.     |* 4 | SORT JOIN | | 54 | 10 | 10 |00:00:00.01 | 3 | 0 | 2048 | 2048 | 2048 (0)|
  270.     | 5 | TABLE ACCESS FULL| T1 | 1 | 10 | 10 |00:00:00.01 | 3 | 0 | | | |
  271.     --------------------------------------------------------------------------------------------------------------------------
  272.       
  273.     Predicate Information (identified by operation id):
  274.     ---------------------------------------------------
  275.       
  276.        4 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  277.            filter("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  278.       
  279.     Note
  280.     -----
  281.        - dynamic sampling used for this statement (level=2)
  282.       
  283.     三.巢狀迴圈連線方式是影響最為明顯的,buffers數相差了10倍之多;
  284.       
  285.     PLAN_TABLE_OUTPUT
  286.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  287.     SQL_ID 35g7vtpc63s04, child number 0
  288.     -------------------------------------
  289.     select /*+ leading(t1) use_nl(t2,t1) */t1.status,t2.status from t1,t2
  290.     where t1.object_id=t2.object_id
  291.       
  292.     Plan hash value: 1967407726
  293.       
  294.     ----------------------------------------------------------------------------------------------
  295.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
  296.     ----------------------------------------------------------------------------------------------
  297.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.11 | 10745 | 10710 |
  298.     | 1 | NESTED LOOPS | | 1 | 10 | 10 |00:00:00.11 | 10745 | 10710 |
  299.     | 2 | TABLE ACCESS FULL| T1 | 1 | 10 | 10 |00:00:00.01 | 4 | 0 |
  300.     |* 3 | TABLE ACCESS FULL| T2 | 10 | 1 | 10 |00:00:00.11 | 10741 | 10710 |
  301.     ----------------------------------------------------------------------------------------------
  302.       
  303.     Predicate Information (identified by operation id):
  304.     ---------------------------------------------------
  305.       
  306.        3 - filter("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  307.       
  308.     Note
  309.     -----
  310.        - dynamic sampling used for this statement (level=2)
  311.          
  312.     PLAN_TABLE_OUTPUT
  313.     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  314.     SQL_ID 1a5dw45kqph9p, child number 0
  315.     -------------------------------------
  316.     select /*+ leading(t2) use_nl(t2,t1) */t1.status,t2.status from t1,t2
  317.     where t1.object_id=t2.object_id
  318.       
  319.     Plan hash value: 4016936828
  320.       
  321.     ----------------------------------------------------------------------------------------------
  322.     | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
  323.     ----------------------------------------------------------------------------------------------
  324.     | 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.93 | 226K| 1071 |
  325.     | 1 | NESTED LOOPS | | 1 | 10 | 10 |00:00:00.93 | 226K| 1071 |
  326.     | 2 | TABLE ACCESS FULL| T2 | 1 | 79628 | 75211 |00:00:00.06 | 1075 | 1071 |
  327.     |* 3 | TABLE ACCESS FULL| T1 | 75211 | 1 | 10 |00:00:00.74 | 225K| 0 |
  328.     ----------------------------------------------------------------------------------------------
  329.       
  330.     Predicate Information (identified by operation id):
  331.     ---------------------------------------------------
  332.       
  333.        3 - filter("T1"."OBJECT_ID"="T2"."OBJECT_ID")
  334.       
  335.     Note
  336.     -----
  337.        - dynamic sampling used for this statement (level=2)
  338.       
  339.       
  340.     25 rows selected.
  341.       
  342.               
  343.     小結:從以上可看出,影響大到小的順序為nl->hash->merge(不影響);
  344.     應用:日常工作環境中經常可看到,由於nl方式不當而應使用hash的案例,畢竟NL對資料量的大小影響是最為敏感;
  345.     而ORACLE選擇NL方式並沒有錯,錯就錯在預估值那裡,所以當用explain plan for....,autot等方式看並不能看出問題;
  346.     當透過running time的統計資訊收集然後select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
  347.     就會發現真正的原因是預估值與實際值的問題相差過大導致的;
  348.     在這種情況下執行計劃的繫結是比較實用和有效的;


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29209863/viewspace-2128167/,如需轉載,請註明出處,否則將追究法律責任。

相關文章