nested loops 和hash join的一點測試

xfhuangfu發表於2015-07-04
第一種情況:
表上沒有索引 ,cbo自動選擇 hash join 
SQL> select count(*) from t;
  COUNT(*)
----------
     50371
SQL> select count(*) from t1;
  COUNT(*)
----------
        50
 

SQL> select * from t,t1 where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 1444793974
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 | 17700 |   147   (5)| 00:00:02 |
|*  1 |  HASH JOIN         |      |    50 | 17700 |   147   (5)| 00:00:02 |
|   2 |   TABLE ACCESS FULL| T1   |    50 |  8850 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T    | 50473 |  8724K|   143   (3)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T"."OBJECT_ID"="T1"."OBJECT_ID")
Note
-----
   - dynamic sampling used for this statement

統計資訊
----------------------------------------------------------
        532  recursive calls
          0  db block gets
        827  consistent gets
          0  physical reads
          0  redo size
       6213  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         50  rows processed
SQL>

現在強制 cbo 走 nested loops
如果走nested loop  效率比較差,不論 use_nl 後面表的順序與from字句後面的順序是否相同,執行計劃都一樣
SQL> select /*+ use_nl(t t1) */ * from t1,t where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 2196473728
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 | 17700 |  7077   (4)| 00:01:25 |
|   1 |  NESTED LOOPS      |      |    50 | 17700 |  7077   (4)| 00:01:25 |
|   2 |   TABLE ACCESS FULL| T1   |    50 |  8850 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T    |     1 |   177 |   141   (3)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T"."OBJECT_ID"="T1"."OBJECT_ID")
Note
-----
   - dynamic sampling used for this statement

統計資訊
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      34861  consistent gets
          0  physical reads
          0  redo size
       6213  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed
        
        
  

SQL> select /*+ use_nl(t1 t) */ * from t1,t where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 2196473728
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 | 17700 |  7077   (4)| 00:01:25 |
|   1 |  NESTED LOOPS      |      |    50 | 17700 |  7077   (4)| 00:01:25 |
|   2 |   TABLE ACCESS FULL| T1   |    50 |  8850 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T    |     1 |   177 |   141   (3)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T"."OBJECT_ID"="T1"."OBJECT_ID")
Note
-----
   - dynamic sampling used for this statement

統計資訊
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      34861  consistent gets
          0  physical reads
          0  redo size
       6213  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed
 
 

第二種情況

兩個表都有索引,cbo自動選擇走 nested loops

首先建立了兩個表 t、t1
create table t as select * from dba_objects;
create table t1 as select * from dba_objects where rownum<51;
SQL> select count(*) from t;
  COUNT(*)
----------
     50371
SQL> select count(*) from t1;
  COUNT(*)
----------
        50

然後在表 t 、t1上建立索引
      
SQL> alter table t modify object_id not null;
表已更改。
SQL> alter table t1 modify object_id not null;
表已更改。
SQL> create index idx_t on t(object_id);
索引已建立。
SQL> create index idx_t1 on t1(object_id);
索引已建立。
SQL>
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true);
PL/SQL 過程已成功完成。
SQL>
SQL> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true);
PL/SQL 過程已成功完成。
SQL> select * from t,t1 where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 2462202995
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time|
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    50 |  8200 |   102   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    93 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |       |    50 |  8200 |   102   (0)| 00:00:02 |
|   3 |    TABLE ACCESS FULL        | T1    |    50 |  3550 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN         | IDX_T |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T"."OBJECT_ID"="T1"."OBJECT_ID")

統計資訊
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         71  consistent gets
          0  physical reads
          0  redo size
       6213  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
         
         
         
 
因為 nested  loops為相關聯合型查詢,首先執行3 ,對 小表 t1 只進行一次全表掃描,然後返回一個結果集,該結果集包含50條記錄。
然後從結果集中每次拿出一條記錄在 t 表進行掃描,因為 t表上有索引 IDX_T , 所以就應用 條件 "T"."OBJECT_ID"="T1"."OBJECT_ID"
進行 索引範圍掃描。索引範圍掃描 迴圈執行50次,因此個人認為 4 中的 rows列應為 50 。 
從 t 表 索引  IDX_T  中 掃描得到 50個  ,然後 1 透過 鍵值 對應的 rowid 訪問 t 表 ,一共要訪問 50 條資料  ,因此 個人認為 1 中的 rows 列應為 50 。
這種方法看到的計劃只能理解為驅動表返回多少行,就迴圈多少次,所以他裡面很多1,每次只能找到1行

可以透過下面的語句獲取我想得到的結果

SQL> select /*+ gather_plan_statistics */ * from t,t1 where t.object_id=t1.object_id;

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  25j7fws7h3zhd, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ * from t,t1 where t.object_id=t1.object_id
Plan hash value: 2462202995
----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Starts | E-Rows | A-Rows  |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID  | T     |      1 |      1 |     50  |00:00:00.01 |      71 |
|   2 |   NESTED LOOPS                |       |      1 |     50 |    101  |00:00:00.01 |      66 |
|   3 |    TABLE ACCESS FULL          | T1    |      1 |     50 |     50  |00:00:00.01 |       7 |
|*  4 |    INDEX RANGE SCAN           | IDX_T |     50 |      1 |     50  |00:00:00.01 |      59 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T"."OBJECT_ID"="T1"."OBJECT_ID")
已選擇20行。
 
starts 表示 操作執行的次數

也可以透過 10046 分析
Rows     Row Source Operation
-------  ---------------------------------------------------
     50  TABLE ACCESS BY INDEX ROWID T (cr=71 pr=0 pw=0 time=121 us)
    101   NESTED LOOPS  (cr=66 pr=0 pw=0 time=8806 us)
     50    TABLE ACCESS FULL T1 (cr=7 pr=0 pw=0 time=203 us)
     50    INDEX RANGE SCAN IDX_T (cr=59 pr=0 pw=0 time=259 us)(object id 67167)
    
也可以驗證執行了 50次 ,而 NESTED LOOPS 行中的 101 是前面兩步的累計值, 迴圈的次數取決於驅動表 t1
 
 

    
現在強制 cbo 走 hash join
leading 提示要求先訪問t1表,也就是小表 t1 作為構造輸入表
use_hash(t)  表示使用 hash join ,t表作為探測輸入表

SQL> select /*+ leading(t1) use_hash(t) */ * from t1,t where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 1444793974
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 |  8200 |   147   (5)| 00:00:02 |
|*  1 |  HASH JOIN                    |      |    50 |  8200 |   147   (5)| 00:00:02 |
|   2 |   TABLE ACCESS FULL | T1 |    50 |  3550 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FUL L| T    | 50371 |  4574K|   143   (3)| 00:00:02 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T"."OBJECT_ID"="T1"."OBJECT_ID")

統計資訊
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        704  consistent gets
          0  physical reads
          0  redo size
       6213  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         
         
         
 
 
加入index提示,強制走 t、t1表的索引
     
 
SQL> select /*+ leading(t1) use_hash(t) index(t1) index(t) */ * from t1,t where t.object_id=t1.object_id;
已選擇50行。

執行計劃
----------------------------------------------------------
Plan hash value: 844093764
---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    50 |  8200 |   887   (1)| 00:00:11 |
|*  1 |  HASH JOIN                   |        |    50 |  8200 |   887   (1)| 00:00:11 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |    50 |  3550 |     2   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN           | IDX_T1 |    50 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T      | 50371 |  4574K|   883   (1)| 00:00:11 |
|   5 |    INDEX FULL SCAN           | IDX_T  | 50371 |       |   114   (2)| 00:00:02 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T"."OBJECT_ID"="T1"."OBJECT_ID")

統計資訊
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        887  consistent gets
          0  physical reads
          0  redo size
       4745  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed
SQL>
 
總結:
nested loops 適用場景:
1:有一個表比較小
2:被關聯欄位上有索引
3:索引鍵值重複率不應該很高
 
 

hash join 使用場景:
1:一個大表、一個小表
2:關聯欄位上沒有索引(欄位hash完以後,欄位的搜尋不再受索引影響,只收到hash列表效率的影響)
3:返回的結果集比較大
 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/28373936/viewspace-1722284/,如需轉載,請註明出處,否則將追究法律責任。

相關文章