遇到個子查詢巢狀 UNION ALL 的SQL語句很慢,謂詞過濾條件不能內推進去,需要最佳化這段 UNION ALL這塊的內容。
UNION ALL 慢SQL:
SELECT * FROM ((SELECT jfw03.id AS newid, jfw03.emc_person_id AS bizPersonOid, jfw03.jfw03001 AS jfw03001, jfw03.jfw03002 AS jfw03002, jfw03.jfw03003 AS jfw03003, jfw03.jfw03004 AS jfw03004, jfw03.jfw03005 AS jfw03005, jfw03.jfw03006 AS jfw03006, jfw03.jfw03007 AS jfw03007, jfw03.remark AS remark, old03.id AS oldId, old03.jfw03001 AS jfw03001pre, old03.jfw03002 AS jfw03002pre, old03.jfw03003 AS jfw03003pre, old03.jfw03004 AS jfw03004pre, old03.jfw03005 AS jfw03005pre, old03.jfw03006 AS jfw03006pre, old03.jfw03007 AS jfw03007pre, old03.remark AS remarkpre FROM emc jfw03 LEFT JOIN emc old03 ON jfw03.emc_person_id = old03.emc_person_id WHERE jfw03.jfw03001 = old03.jfw03001 AND jfw03.handle_mark = '2' AND old03.handle_mark IN ('1', '3')) UNION ALL (SELECT jfw03.id AS newid, jfw03.emc_person_id AS bizPersonOid, jfw03.jfw03001, jfw03.jfw03002, jfw03.jfw03003, jfw03.jfw03004, jfw03.jfw03005, jfw03.jfw03006, jfw03.jfw03007, jfw03.remark, NULL AS oldId, NULL AS jfw03001pre, NULL AS jfw03002pre, NULL AS jfw03003pre, NULL AS jfw03004pre, NULL AS jfw03005pre, NULL AS jfw03006pre, NULL AS jfw03007pre, NULL AS remarkpre FROM emc jfw03 WHERE jfw03.handle_mark = '2' AND NOT EXISTS (SELECT 1 FROM emc old03 WHERE jfw03.emc_person_id = old03.emc_person_id AND jfw03.jfw03001 = old03.jfw03001 AND old03.handle_mark IN ('1', '3'))) UNION ALL (SELECT NULL AS newid, old03.emc_person_id AS bizPersonOid, NULL AS jfw03001, NULL AS jfw03002, NULL AS jfw03003, NULL AS jfw03004, NULL AS jfw03005, NULL AS jfw03006, NULL AS jfw03007, NULL AS remark, old03.id AS oldId, old03.jfw03001 AS jfw03001pre, old03.jfw03002 AS jfw03002pre, old03.jfw03003 AS jfw03003pre, old03.jfw03004 AS jfw03004pre, old03.jfw03005 AS jfw03005pre, old03.jfw03006 AS jfw03006pre, old03.jfw03007 AS jfw03007pre, old03.remark AS remarkpre FROM emc old03 WHERE old03.handle_mark IN ('1', '3') AND NOT EXISTS (SELECT 1 FROM emc jfw03 WHERE jfw03.emc_person_id = old03.emc_person_id AND jfw03.jfw03001 = old03.jfw03001 AND jfw03.handle_mark = '2'))) jfw03;
執行計劃:
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Append (cost=0.69..7253737.08 rows=49711 width=274) (actual time=0.252..60906.535 rows=3291407 loops=1) -> Nested Loop (cost=0.69..2753734.17 rows=49709 width=274) (actual time=0.251..16535.465 rows=49558 loops=1) -> Seq Scan on emc jfw03 (cost=0.00..347258.58 rows=1132320 width=139) (actual time=0.010..5103.042 rows=1114280 loops=1) Filter: (handle_mark = '2'::bpchar) Rows Removed by Filter: 8885720 -> Index Scan using idx_1_2_3_emc on emc old03 (cost=0.69..2.12 rows=1 width=139) (actual time=0.010..0.010 rows=0 loops=1114280) Index Cond: ((jfw03001 = jfw03.jfw03001) AND (emc_person_id = jfw03.emc_person_id)) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 1 -> Nested Loop Anti Join (cost=0.69..1864694.18 rows=1 width=399) (actual time=0.032..16606.236 rows=1065808 loops=1) -> Seq Scan on emc jfw03_1 (cost=0.00..347258.58 rows=1132320 width=139) (actual time=0.013..5171.111 rows=1114280 loops=1) Filter: (handle_mark = '2'::bpchar) Rows Removed by Filter: 8885720 -> Index Only Scan using idx_1_2_3_emc on emc old03_1 (cost=0.69..1.33 rows=1 width=105) (actual time=0.009..0.009 rows=0 loops=1114280) Index Cond: ((jfw03001 = jfw03_1.jfw03001) AND (emc_person_id = jfw03_1.emc_person_id)) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 1 Heap Fetches: 0 -> Nested Loop Anti Join (cost=0.69..2634563.07 rows=1 width=399) (actual time=0.033..27638.050 rows=2176041 loops=1) -> Seq Scan on emc old03_2 (cost=0.00..347258.58 rows=2205308 width=139) (actual time=0.014..5510.884 rows=2225077 loops=1) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 7774923 -> Index Only Scan using idx_1_2_3_emc on emc jfw03_2 (cost=0.69..1.03 rows=1 width=105) (actual time=0.009..0.009 rows=0 loops=2225077) Index Cond: ((jfw03001 = old03_2.jfw03001) AND (emc_person_id = old03_2.emc_person_id) AND (handle_mark = '2'::bpchar)) Heap Fetches: 0 Planning Time: 0.318 ms Execution Time: 60987.916 ms (27 行記錄)
加個索引看看速度
CREATE INDEX idx_emc_1 ON emc ( handle_mark );
加索引後執行計劃
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Append (cost=21059.67..7011218.65 rows=49969 width=278) (actual time=115.683..46578.675 rows=3291407 loops=1) -> Nested Loop (cost=21059.67..2655195.15 rows=49967 width=278) (actual time=115.682..11939.294 rows=49558 loops=1) -> Bitmap Heap Scan on emc jfw03 (cost=21058.98..257323.10 rows=1120329 width=141) (actual time=115.458..417.906 rows=1114280 loops=1) Recheck Cond: (handle_mark = '2'::bpchar) Heap Blocks: exact=221151 -> Bitmap Index Scan on idx_emc_1 (cost=0.00..20778.90 rows=1120329 width=0) (actual time=92.887..92.887 rows=1114280 loops=1) Index Cond: (handle_mark = '2'::bpchar) -> Index Scan using idx_1_2_3_emc on emc old03 (cost=0.69..2.13 rows=1 width=141) (actual time=0.010..0.010 rows=0 loops=1114280) Index Cond: ((jfw03001 = jfw03.jfw03001) AND (emc_person_id = jfw03.emc_person_id)) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 1 -> Nested Loop Anti Join (cost=21059.67..1766155.15 rows=1 width=401) (actual time=127.721..11690.213 rows=1065808 loops=1) -> Bitmap Heap Scan on emc jfw03_1 (cost=21058.98..257323.10 rows=1120329 width=141) (actual time=127.640..423.128 rows=1114280 loops=1) Recheck Cond: (handle_mark = '2'::bpchar) Heap Blocks: exact=221151 -> Bitmap Index Scan on idx_emc_1 (cost=0.00..20778.90 rows=1120329 width=0) (actual time=105.117..105.117 rows=1114280 loops=1) Index Cond: (handle_mark = '2'::bpchar) -> Index Only Scan using idx_1_2_3_emc on emc old03_1 (cost=0.69..1.34 rows=1 width=105) (actual time=0.009..0.009 rows=0 loops=1114280) Index Cond: ((jfw03001 = jfw03_1.jfw03001) AND (emc_person_id = jfw03_1.emc_person_id)) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 1 Heap Fetches: 0 -> Nested Loop Anti Join (cost=39323.15..2589118.81 rows=1 width=401) (actual time=201.187..22828.908 rows=2176041 loops=1) -> Bitmap Heap Scan on emc old03_2 (cost=39322.47..289365.69 rows=2222658 width=141) (actual time=201.135..603.170 rows=2225077 loops=1) Recheck Cond: (handle_mark = ANY ('{1,3}'::bpchar[])) Heap Blocks: exact=222258 -> Bitmap Index Scan on idx_emc_1 (cost=0.00..38766.81 rows=2222658 width=0) (actual time=179.770..179.770 rows=2225077 loops=1) Index Cond: (handle_mark = ANY ('{1,3}'::bpchar[])) -> Index Only Scan using idx_1_2_3_emc on emc jfw03_2 (cost=0.69..1.02 rows=1 width=105) (actual time=0.009..0.009 rows=0 loops=2225077) Index Cond: ((jfw03001 = old03_2.jfw03001) AND (emc_person_id = old03_2.emc_person_id) AND (handle_mark = '2'::bpchar)) Heap Fetches: 0 Planning Time: 0.373 ms Execution Time: 46666.296 ms (33 行記錄)
快了14秒,沒達到目的效果,無解,只能改寫SQL,可以改 FULL JOIN。
等價改寫:
SELECT * FROM ( with x as (SELECT * FROM emc where handle_mark IN ('1', '3','2')) SELECT jfw03.id AS newid, jfw03.emc_person_id AS bizPersonOid, jfw03.jfw03001 AS jfw03001, jfw03.jfw03002 AS jfw03002, jfw03.jfw03003 AS jfw03003, jfw03.jfw03004 AS jfw03004, jfw03.jfw03005 AS jfw03005, jfw03.jfw03006 AS jfw03006, jfw03.jfw03007 AS jfw03007, jfw03.remark AS remark, old03.id AS oldId, old03.jfw03001 AS jfw03001pre, old03.jfw03002 AS jfw03002pre, old03.jfw03003 AS jfw03003pre, old03.jfw03004 AS jfw03004pre, old03.jfw03005 AS jfw03005pre, old03.jfw03006 AS jfw03006pre, old03.jfw03007 AS jfw03007pre, old03.remark AS remarkpre FROM (SELECT * FROM x WHERE handle_mark IN ('2')) jfw03 full join (SELECT * FROM x WHERE handle_mark IN ('1', '3')) old03 ON jfw03.emc_person_id = old03.emc_person_id AND jfw03.jfw03001 = old03.jfw03001);
等價改寫計劃:
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Merge Full Join (cost=10000479890.42..10000480406.20 rows=33430 width=1844) (actual time=9433.530..10954.254 rows=3291407 loops=1) Merge Cond: ((x.emc_person_id = x_1.emc_person_id) AND (x.jfw03001 = x_1.jfw03001)) CTE x -> Bitmap Heap Scan on emc (cost=57545.45..325771.51 rows=3342986 width=145) (actual time=268.766..721.294 rows=3339357 loops=1) Recheck Cond: (handle_mark = ANY ('{1,3,2}'::bpchar[])) Heap Blocks: exact=222260 -> Bitmap Index Scan on idx_emc_1 (cost=0.00..56709.71 rows=3342986 width=0) (actual time=246.952..246.953 rows=3339357 loops=1) Index Cond: (handle_mark = ANY ('{1,3,2}'::bpchar[])) -> Sort (cost=76389.65..76431.43 rows=16715 width=924) (actual time=3818.370..4021.726 rows=1114280 loops=1) Sort Key: x.emc_person_id, x.jfw03001 Sort Method: quicksort Memory: 345133kB -> CTE Scan on x (cost=0.00..75217.18 rows=16715 width=924) (actual time=268.780..1391.594 rows=1114280 loops=1) Filter: (handle_mark = '2'::bpchar) Rows Removed by Filter: 2225077 -> Sort (cost=77729.26..77812.83 rows=33430 width=924) (actual time=5615.144..5949.562 rows=2225599 loops=1) Sort Key: x_1.emc_person_id, x_1.jfw03001 Sort Method: quicksort Memory: 676056kB -> CTE Scan on x x_1 (cost=0.00..75217.18 rows=33430 width=924) (actual time=0.007..360.353 rows=2225077 loops=1) Filter: (handle_mark = ANY ('{1,3}'::bpchar[])) Rows Removed by Filter: 1114280 Planning Time: 0.138 ms Execution Time: 11101.829 ms (22 行記錄)
改寫後11s就能跑出結果,符合效能最佳化目標。
這條改寫SQL留個坑,不完全等價,完全等價的沒放出來,有興趣的同學可以思考下,自己做做實驗。