生產系統調優之_敢於質疑

jeanron100發表於2014-06-01

接著昨天的那個問題來說。有個sql語句在做了統計資訊收集之後,速度有了一定的提升,從5秒的響應降低到了2秒。但是和預期還是有一定
的差距,按照80條查詢請求在短時間內響應。2秒*80000次/60/60=44.4小時,本來感覺可以接受的一下子就成了大問題。

Elapsed Time (s)

Executions

Elapsed Time per Exec (s)

%Total

%CPU

%IO

SQL Id

SQL Module

SQL Text

31st May

305.44

145

2.11

25.56

99.69

0.26

2fjzq67jbztwv

gext1GenericEx@ccbdbpr1 (TNS V1-V3)

/* */ SELECT DISTINCT 'K', AR....


和那個同事又確認了下,他說在其他專案也用的這個sql語句資料量還要大的多,就是沒有問題。
有了昨天的一些資料,我自己也基本心裡有數了,我表示懷疑,在此基礎上能做的就是仔細看看這個sql語句了,看看到底實現細節有沒有問題。
可以看到加入了一些hint,嵌入了子查詢。
SELECT DISTINCT 'K',
                AR.RESOURCE_VALUE,
                AR.RESOURCE_TYPE,
                GREATEST(TO_CHAR(AR.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS'),
                         TO_CHAR(SB.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS')),
                LEAST(NVL(TO_CHAR(AR.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000'),
                      NVL(TO_CHAR(SB.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000')),
                AR.AGR_NO,
                SB.MEDIUM_CUS_ID,
                SB.SUB_STATUS,
                SB.BUSINESS_ENTITY_ID,
                SB.LANGUAGE,
                SB.ROUTING_POLICY_ID,
                SB.L9_PORT_IND,
                SB.L9_SPLIT_PERIOD
  FROM HUGE_RESOURCE AR,  --這個表有大約2000多萬條資料,做了分割槽
       MEDIUM_SUB SB,   --這個表有50萬左右的資料
       (select /*+ RESULT_CACHE */
        DISTINCT PARAM_NAME as PARAM_NAME 
          from SMALL_PARAM    --這個表比較小,有不到2萬條記錄,如果加過濾條件,能過濾掉一半多的資料,因為那個欄位不在索引欄位裡,所以加了result_cache
         where GUIDING_IND = 'Y') OP,
       MEDIUM_CUS CS
WHERE AR.AGR_NO = 1056851
   AND AR.AGREEMENT_KEY = MOD(1056851, 100)
   AND (AR.RESOURCE_STATE != 'F' OR AR.RESOURCE_STATE IS NULL)
   AND AR.RANGE_IND = 'N'
   AND SB.MEDIUM_SUB_NO = AR.AGR_NO
   AND EXISTS
(select /*+ INDEX(OP SMALL_PARAM_1IX) */
         1
          from SMALL_PARAM OP
         where OP.PARAM_NAME in (AR.RESOURCE_PRM_CD, AR.BASE_PARAM_NAME) )
   AND SB.MEDIUM_CUS_ID = CS.MEDIUM_CUS_ID
   AND (SB.EXPIRATION_DATE IS NULL OR SB.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EXPIRATION_DATE IS NULL OR AR.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EFFECTIVE_DATE < SB.EXPIRATION_DATE OR
       SB.EXPIRATION_DATE IS NULL)
   AND (AR.EXPIRATION_DATE IS NULL OR
       AR.EXPIRATION_DATE > SB.EFFECTIVE_DATE)
   AND SB.SUB_STATUS != 'T'
UNION ALL
SELECT DISTINCT 'K',
                AR.RESOURCE_VALUE,
                AR.RESOURCE_TYPE,
                GREATEST(TO_CHAR(AR.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS'),
                         TO_CHAR(SH.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS')),
                LEAST(NVL(TO_CHAR(AR.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000'),
                      NVL(TO_CHAR(SH.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000')),
                AR.AGR_NO,
                SH.MEDIUM_CUS_ID,
                SH.SUB_STATUS,
                SH.BUSINESS_ENTITY_ID,
                SH.LANGUAGE,
                SH.ROUTING_POLICY_ID,
                SH.L9_PORT_IND,
                SH.L9_SPLIT_PERIOD
  FROM HUGE_RESOURCE AR,
       MEDIUM_SUB_HISTORY SH,
       (select /*+ RESULT_CACHE */
        DISTINCT PARAM_NAME as PARAM_NAME
          from SMALL_PARAM
         where GUIDING_IND = 'Y') OP,
       MEDIUM_CUS CS
WHERE AR.AGR_NO = 1056851
   AND AR.AGREEMENT_KEY = MOD(1056851, 100)
   AND (AR.RESOURCE_STATE != 'F' OR AR.RESOURCE_STATE IS NULL)
   AND AR.RANGE_IND = 'N'
   AND SH.MEDIUM_SUB_NO = AR.AGR_NO
   AND EXISTS
(select /*+ INDEX(OP SMALL_PARAM_1IX) */
         1
          from SMALL_PARAM OP
         where OP.PARAM_NAME in (AR.RESOURCE_PRM_CD, AR.BASE_PARAM_NAME))
   AND SH.MEDIUM_CUS_ID = CS.MEDIUM_CUS_ID
   AND (SH.EXPIRATION_DATE IS NULL OR SH.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EXPIRATION_DATE IS NULL OR AR.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EFFECTIVE_DATE < SH.EXPIRATION_DATE OR
       SH.EXPIRATION_DATE IS NULL)
   AND (AR.EXPIRATION_DATE IS NULL OR
       AR.EXPIRATION_DATE > SH.EFFECTIVE_DATE)
   AND SH.SUB_STATUS NOT IN ('C', 'T')



試著在生產上抓了一個執行計劃,又看了一下統計資訊。

Statistics
----------------------------------------------------------
        113  recursive calls
          8  db block gets
      31581  consistent gets
          0  physical reads
          0  redo size
       1997  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         20  sorts (memory)
          0  sorts (disk)
          6  rows processed
 Elapsed: 00:02:44.14



---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                            |    13 |  2905 |   519  (59)| 00:00:07 |       |       |


在備份庫上檢視,結果同樣的查詢一下子沒有了反應。執行了將近3分鐘還是沒有反應。備份庫的資源要比生產差一些。好了感覺有問題了。開始看看這個sql語句的第一部分。
SELECT DISTINCT 'K',
                AR.RESOURCE_VALUE,
                AR.RESOURCE_TYPE,
                GREATEST(TO_CHAR(AR.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS'),
                         TO_CHAR(SB.EFFECTIVE_DATE, 'YYYYMMDDHH24MISS')),
                LEAST(NVL(TO_CHAR(AR.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000'),
                      NVL(TO_CHAR(SB.EXPIRATION_DATE, 'YYYYMMDDHH24MISS'),
                          '47001231000000')),
                AR.AGR_NO,
                SB.MEDIUM_CUS_ID,
                SB.SUB_STATUS,
                SB.BUSINESS_ENTITY_ID,
                SB.LANGUAGE,
                SB.ROUTING_POLICY_ID,
                SB.L9_PORT_IND,
                SB.L9_SPLIT_PERIOD
  FROM HUGE_RESOURCE AR,  --這個表有大約2000多萬條資料,做了分割槽
       MEDIUM_SUB SB,   --這個表有50萬左右的資料
       (select /*+ RESULT_CACHE */
        DISTINCT PARAM_NAME as PARAM_NAME 
          from SMALL_PARAM    --這個表比較小,有不到2萬條記錄,如果加過濾條件,能過濾掉一半多的資料,因為那個欄位不在索引欄位裡,所以加了result_cache
         where GUIDING_IND = 'Y') OP,
       MEDIUM_CUS CS
WHERE AR.AGR_NO = 1056851
   AND AR.AGREEMENT_KEY = MOD(1056851, 100)
   AND (AR.RESOURCE_STATE != 'F' OR AR.RESOURCE_STATE IS NULL)
   AND AR.RANGE_IND = 'N'
   AND SB.MEDIUM_SUB_NO = AR.AGR_NO
   AND EXISTS
(select /*+ INDEX(OP SMALL_PARAM_1IX) */
         1
          from SMALL_PARAM OP
         where OP.PARAM_NAME in (AR.RESOURCE_PRM_CD, AR.BASE_PARAM_NAME) )
   AND SB.MEDIUM_CUS_ID = CS.MEDIUM_CUS_ID
   AND (SB.EXPIRATION_DATE IS NULL OR SB.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EXPIRATION_DATE IS NULL OR AR.EXPIRATION_DATE >= to_date('19000101000000','YYYYMMDDHH24MISS'))
   AND (AR.EFFECTIVE_DATE < SB.EXPIRATION_DATE OR
       SB.EXPIRATION_DATE IS NULL)
   AND (AR.EXPIRATION_DATE IS NULL OR
       AR.EXPIRATION_DATE > SB.EFFECTIVE_DATE)
   AND SB.SUB_STATUS != 'T'
因為結果集的輸出中沒有op這個表的列,而且在where子句中存在exists語句,在exists裡面也沒有做關聯,那個同事堅持說想在做關聯的時候把op的資料先做了result cache,在子查詢中就能做關聯了,避免重複的表掃描。聽起來好像有道理,我覺得語句有問題,儘管說是產品部分提供的方案。op在from 後,但是和後面的流程都沒有關聯,但也沒有做笛卡爾積。在他的強烈反對中我把以下的部分從from中刪除。
  (select /*+ RESULT_CACHE */
        DISTINCT PARAM_NAME as PARAM_NAME 
          from SMALL_PARAM    --這個表比較小,有不到2萬條記錄,如果加過濾條件,能過濾掉一半多的資料,因為那個欄位不在索引欄位裡,所以加了result_cache
         where GUIDING_IND = 'Y') OP,
然後在備份庫上重新跑一次,沒想到一下子就有了反應。存在一定的物理讀,第二次執行就沒有了。邏輯度有了近6倍的提升,執行時間在0.02-0.07之間。
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5854  consistent gets
        605  physical reads
         72  redo size
       1900  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed
按照這個速度,0.07秒*80000次/60/60=1.55小時,剩下的事情就是和他們確認一些具體的細節了。下午晚些時候產品部分也確認這確實是個問題。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/23718752/viewspace-1173785/,如需轉載,請註明出處,否則將追究法律責任。

相關文章