都是標量子查詢惹的禍

531968912發表於2016-08-08


都是標量子查詢惹的禍


系統又報了一個跑的慢的sql語句,看圖就知道這個很恐怖的,已經跑了1天了,還需要跑6個月的時間,

 

把sql語句拿出來瞅瞅:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未結清餘額,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 餘額,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 額度區間,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end 小企業主標識

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end;

 

記憶體中的執行計劃:SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('a50xqp45uy256',0,'advanced'));

簡單分析下,查詢的只有一個表,這就很奇怪了,不管這個表多麼大,也不可能跑這麼久的吧,但是有一個標量子查詢,根據經驗,如果標量子查詢的表資料量很多,或者關聯的列沒有索引的話,標量子查詢的效能將會是非常差的,我們由執行計劃也可以看出標量子查詢的表RISKPUBSC.MICRO_BUSI_DATABASE的關聯列沒有索引,至此,我們猜測可能是由於這個標量子查詢引起sql語句效能低下,猜測歸猜測,那我們實驗一下呢?

 

先去掉標量子查詢,然後執行一下,如下:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未結清餘額,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 餘額,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 額度區間/*,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end 小企業主標識*/

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end/*,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end*/

 

執行一下: ,發現這個sql很快的,大約274秒,也就是5分鐘左右,好吧,看來真是標量子查詢惹的禍,,,,,,那我們首先建立索引看看效能如何呢?

create index ind_MICRO_BUSI_DATABASE_acct on RISKPUBSC.MICRO_BUSI_DATABASE(acct) NOLOGGING parallel 8;

alter index ind_MICRO_BUSI_DATABASE_acct NOPARALLEL;

 

然後再重新執行一下之前的語句:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未結清餘額,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 餘額,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 額度區間,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end 小企業主標識

from riskpubstrategy.lwt_ambs_cc_201406 A

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN EXISTS (SELECT 1 FROM RISKPUBSC.MICRO_BUSI_DATABASE B WHERE STATUS_CURR IS NOT NULL AND A.ACCT= B.ACCT) THEN '小企業主'

else '非小企業主' end;

加了索引後其執行計劃:

 

發現效能比較好,,大約就8分鐘搞定,,,,,,至此其實這個sql加個索引就算優化好了,但是根據經驗,如果標量子查詢的表的資料量超大(一般認為超過100W,不是絕對的)的時候,而且關聯的列又不是主鍵的時候,標量子查詢的效能其實不是怎麼好,,,,,,好吧,那我們來看看他們的資料量如何呢?????

 

首先看看資料量,

SELECT v.TABLE_NAME,

v.BLOCKS,

v.table_size2,

v.NUM_ROWS,

v.LAST_ANALYZED

FROM vw_table_lhr v

WHERE v.TABLE_NAME IN ('LWT_AMBS_CC_201406',

'MICRO_BUSI_DATABASE');

發現標量子查詢的表是有點大,300W的資料量,如果是標量子查詢的話,大約要對標量子查詢的表(MICRO_BUSI_DATABASE)掃描1988W次,相當於NL連線了,所以這個效能不怎麼好,那麼就修改為左外連線唄,修改後的sql:

--201406 XX

select SUM(UNPOSTING_AMT_CYC) 分期未結清餘額,

SUM(greatest(nvl(BAL_TOTAL,0),0)) 餘額,COUNT(1),DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end 額度區間,

CASE WHEN b.ACCT is not null THEN '小企業主'

else '非小企業主' end 小企業主標識

from riskpubstrategy.lwt_ambs_cc_201406 A

LEFT outer join RISKPUBSC.MICRO_BUSI_DATABASE B

on (a.ACCT=b.ACCT)

group by DELQ_LEVEL_CI,city,FQ_TOTAL,

CASE WHEN greatest(nvl(BAL_TOTAL,0),0) < 500 THEN '1.[0,500)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 500

AND greatest(nvl(BAL_TOTAL,0),0) < 5000 THEN '2.[500,5000)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 5000

AND greatest(nvl(BAL_TOTAL,0),0) < 20000 THEN '3.[5000,2W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 20000

AND greatest(nvl(BAL_TOTAL,0),0) < 50000 THEN '4[2W,5W)'

WHEN greatest(nvl(BAL_TOTAL,0),0) >= 50000 THEN '5[5W,+)'

else '其他' end,

CASE WHEN b.ACCT is not null THEN '小企業主'

else '非小企業主' end

 

執行計劃:

 

 

時間呢?????尼瑪,,,,,,,這麼快,,,,,由6個月轉換為44秒,,,,,,又是由自行車的速度到飛船的速度呀,,,,,,飛躍吧,,,,,,,,

這裡細心的人會發現,我去掉了b.STATUS_CURR is not null 這個條件,因為我查詢過這個列本身就沒有為空的資料,,,,,這個又是個小小的優化喲,,,,,

 

 

最後簡單總結下:

標量子查詢的使用是有條件的:

①標量子查詢的涉及的表的關聯列必須要有索引或者關聯列是主鍵列

②標量子查詢的涉及的表的資料量不能太大,資料量太大即使關聯列有索引依然很慢的

③對同一個表不能有多個標量子查詢,這樣效能也不會很高

 

第三個注意事項是神馬意思???????簡單舉個例子:

Select (select a from taba t1 where t1.id=t.id) a,

(select b from taba t1 where t1.id=t.id) b,

(select c from taba t1 where t1.id=t.id) c

……….

From tabb ;

 

這種情況大家還是修改為左外連線比較快。。。。。。如果taba稍微大點的話就會導致到大表的多次掃描,這個嚴重影響效能。。。。。。。。。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26736162/viewspace-1222431/,如需轉載,請註明出處,否則將追究法律責任。

相關文章