NEST LOOP改為HASH JOIN 效能提高6倍

kewin發表於2011-08-31
NEST LOOP改為HASH JOIN 效能提高6倍
Kevin Zou
2011-8-31

在看系統的AWR report時,發現有個SQL語句出現頻率很高,而且都是在消耗資源的top 5內。決定把這個SQL拉出來看看。

 523,370           1     523,370.0    2.7   253.31   2576.45 bcy886xsqp4u1
Module: sqr@phcmdb (TNS V1-V3)
SELECT b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATU
S, b.TL_PYBL_REASON_CD from ps_job a, ps_tl_payable_time b where a.emplid = b.e
mplid and a.empl_rcd = b.empl_rcd and a.effdt = (select max(a_ed.effdt) from ps_
job a_ed where a.emplid = a_ed.emplid and a.empl_rcd = a_ed.empl_rcd and a_ed.ef

通過V$SQL找到完整的SQL語句:
SELECT  b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 
b.TL_PYBL_REASON_CD  from  ps_job a,  ps_tl_payable_time b where a.emplid = b.emplid 
and a.empl_rcd = b.empl_rcd 
and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 
and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 
AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 
AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'
 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  
 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC

 ps_job表有130W的資料;
ps_tl_payable_time 有430W的資料;

檢視其執行計劃:
explain plan for 
SELECT  b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 
b.TL_PYBL_REASON_CD  from  ps_job a,  ps_tl_payable_time b where a.emplid = b.emplid 
and a.empl_rcd = b.empl_rcd 
and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 
and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 
AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 
AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'
 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  
 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC

execute plan:
SYS@hr9prd>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
Plan hash value: 1194964640

-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                    |     1 |    73 |  1369K  (1)| 03:33:53 |
|   1 |  SORT ORDER BY                    |                     |     1 |    73 |  1369K  (1)| 03:33:53 |
|*  2 |   FILTER                                |                    |       |       |            |          |
|   3 |    NESTED LOOPS                   |                    |   108 |  7884 |   157K  (1)| 00:24:34 |
|*  4 |     TABLE ACCESS FULL          | PS_JOB             | 40296 |  1180K| 17164   (1)| 00:02:41 |
|*  5 |     TABLE ACCESS BY INDEX ROWID| PS_TL_PAYABLE_TIME |     1 |    43 |     4   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN          | PS_TL_PAYABLE_TIME |     1 |       |     3   (0)| 00:00:01 |
|   7 |       SORT AGGREGATE           |                    |     1 |    19 |            |          |
|*  8 |        INDEX RANGE SCAN        | PSAJOB             |     1 |    19 |     3   (0)| 00:00:01 |
|   9 |    SORT AGGREGATE              |                    |     1 |    22 |            |          |
|* 10 |     INDEX RANGE SCAN           | PSAJOB             |     1 |    22 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."EFFSEQ"= (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFSEQ"))) FROM
              "SYSADM"."PS_JOB" "A_ES" WHERE SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B1) AND
              "A_ES"."EMPL_RCD"=:B2 AND "A_ES"."EMPLID"=:B3 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B4))
   4 - filter("A"."PAYGROUP"='2DS' AND "A"."COMPANY"='SPT')
   5 - filter("B"."TRC"='2OT10' OR "B"."TRC"='2OT15' OR "B"."TRC"='2OT20' OR
              "B"."TRC"='2OT30')
   6 - access("A"."EMPLID"="B"."EMPLID" AND "A"."EMPL_RCD"="B"."EMPL_RCD" AND
              SYS_OP_DESCEND("DUR")>=HEXTORAW('8790F7F0FEF8FEFAFF')  AND
              SYS_OP_DESCEND("DUR")<=HEXTORAW('8790FEF8FEF8FEFAFF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("DUR"))>=TO_DATE(' 2011-01-01 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("DUR"))<=TO_DATE(' 2011-08-15
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "A"."EFFDT"= (SELECT
              MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))) FROM "SYSADM"."PS_JOB" "A_ED" WHERE
              SYS_OP_DESCEND("EFFDT") IS NOT NULL AND SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B1) AND
              "A_ED"."EMPL_RCD"=:B2 AND "A_ED"."EMPLID"=:B3 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B4))
   8 - access("A_ED"."EMPLID"=:B1 AND "A_ED"."EMPL_RCD"=:B2 AND
              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B3) AND SYS_OP_DESCEND("EFFDT") IS NOT NULL)
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B1)
  10 - access("A_ES"."EMPLID"=:B1 AND "A_ES"."EMPL_RCD"=:B2 AND
              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B3))
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B1)

44 rows selected.


看到兩個大表在做NEST LOOP.這明顯是不合理的。

檢視其消耗的資源:
SYS@hr9prd>select  RUNTIME_MEM , FETCHES ,  EXECUTIONS , DISK_READS , BUFFER_GETS  ,USER_IO_WAIT_TIME ,ROWS_PROCESSED 
from v$sqlarea
where sql_id ='bcy886xsqp4u1'   2    3  ;

RUNTIME_MEM       FETCHES EXECUTIONS DISK_READS   BUFFER_GETS    USER_IO_WAIT_TIME ROWS_PROCESSED
----------- ----------   ---------- - ---------           -----------       -----------------    --------------
      26448           132435          1                  599315          20765595            2611266235        1324338

USER_IO_WAIT_TIME為2611266235microseconds = 2611 seconds. 而在AWR REPORT中SQL elapsed time為 2576.45 (這裡USER_IO_WAIT_TIME >elapsed_time ,可能是兩者的統計有出入導致,在這裡不是重點)。說明SQL的等待時間全部都是發在IO 等待上。
加入HINT,改寫SQL
SELECT /*+ ordered */ b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 
b.TL_PYBL_REASON_CD  from  ps_tl_payable_time b ,ps_job a  where a.emplid = b.emplid 
and a.empl_rcd = b.empl_rcd 
and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 
and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 
AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 
AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'
 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  
 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC 

Elapsed: 00:06:14.59

Execution Plan
----------------------------------------------------------
Plan hash value: 2212746762


| Id  | Operation             | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT      |                    |     1 |    73 |       |  1470K  (1)| 03:49:40 |
|   1 |  SORT ORDER BY        |                    |     1 |    73 |    34M|  1470K  (1)| 03:49:40 |
|*  2 |   FILTER              |                    |       |       |       |        |          |
|*  3 |    HASH JOIN          |                    |   393K|    27M|    94M|   291K  (1)| 00:45:35 |
|*  4 |     TABLE ACCESS FULL | PS_TL_PAYABLE_TIME |  1798K|    73M|       |   272K  (1)| 00:42:31 |
|*  5 |     TABLE ACCESS FULL | PS_JOB             | 40296 |  1180K|       | 17164   (1)| 00:02:41 |
|   6 |    SORT AGGREGATE     |                    |     1 |    19 |       |        |          |
|*  7 |     INDEX RANGE SCAN  | PSAJOB             |     1 |    19 |       | 3   (0)| 00:00:01 |
|   8 |      SORT AGGREGATE   |                    |     1 |    22 |       |        |          |
|*  9 |       INDEX RANGE SCAN| PSAJOB             |     1 |    22 |       | 3   (0)| 00:00:01 |



Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."EFFDT"= (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))) FROM

              "SYSADM"."PS_JOB" "A_ED" WHERE SYS_OP_DESCEND("EFFDT") IS NOT NULL AND

              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B1) AND "A_ED"."EMPL_RCD"=:B2 AND

              "A_ED"."EMPLID"=:B3 AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B4) AND "A"."EFFSEQ"=

              (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFSEQ"))) FROM "SYSADM"."PS_JOB" "A_ES" WHERE

              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B5) AND "A_ES"."EMPL_RCD"=:B6 AND

              "A_ES"."EMPLID"=:B7 AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B8))

   3 - access("A"."EMPLID"="B"."EMPLID" AND "A"."EMPL_RCD"="B"."EMPL_RCD")
   4 - filter("B"."DUR">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND

              ("B"."TRC"='2OT10' OR "B"."TRC"='2OT15' OR "B"."TRC"='2OT20' OR "B"."TRC"='2OT30') AND

              "B"."DUR"<=TO_DATE(' 2011-08-15 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

   5 - filter("A"."PAYGROUP"='2DS' AND "A"."COMPANY"='SPT')
   7 - access("A_ED"."EMPLID"=:B1 AND "A_ED"."EMPL_RCD"=:B2 AND
              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B3) AND SYS_OP_DESCEND("EFFDT") IS NOT NULL)

       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B1)
   9 - access("A_ES"."EMPLID"=:B1 AND "A_ES"."EMPL_RCD"=:B2 AND
              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B3))
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B1)


Statistics
----------------------------------------------------------
        178  recursive calls
          3  db block gets
    7622155  consistent gets
    1572401  physical reads
          0  redo size
   36340125  bytes sent via SQL*Net to client
     971589  bytes received via SQL*Net from client
      88284  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
    1324243  rows processed


整個SQL用時6分15秒就執行完畢。
對比RUNTIME STATS:
                          前          後     比較
physical reads      599315      1572401      增加了973086
consistent gets     20765595    7622155     減少了13143440
Elaped   Time       2576.45      375        僅為之前的15%

如果看到兩個大表直接做聯合,如果優化器選擇了NEST LOOP 作為ACCESS PATH,執行效率很差,要考慮改為HASH JOIN 以提高效能。

-THE END-

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/40239/viewspace-706387/,如需轉載,請註明出處,否則將追究法律責任。

相關文章