[20240409]為什麼一條sql語句在例項2執行要慢的分析.txt

lfree發表於2024-04-09
[20240409]為什麼一條sql語句在例項2執行要慢的分析.txt

--//生產系統遇到一個奇怪現象,一條sql語句在例項2要比例項1慢很多,展開分析看看.

1.環境:
SYS@127.0.0.1:9014/ywdb> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

2.問題展開:

SYS@127.0.0.1:9014/ywdb> @ sqlhh af4692qv1f56q 10/24
time unit : millisecond
BEGIN_INTERVAL_TIME INST_ID SQL_ID PLAN_HASH_VALUE EXECUTIONS ELA_MS_PER_EXEC CPU_MS_PER_EXEC ROWS_PER_EXEC LIOS_PER_EXEC BLKRD_PER_EXEC IOW_MS_PER_EXEC AVG_IOW_MS CLW_MS_PER_EXEC APW_MS_PER_EXEC CCW_MS_PER_EXEC
------------------- ---------- ------------- --------------- ---------- --------------- --------------- ------------- ------------- -------------- --------------- ----------- --------------- --------------- ---------------
2024-04-09 00:00:19 1 af4692qv1f56q 1518780269 1 4 4 1.0 209 0 0 0.0 0 0 0
2024-04-09 00:00:19 2 af4692qv1f56q 1518780269 1 1968 1698 1.0 265 0 0 0.0 0 0 0
2024-04-09 01:00:24 1 af4692qv1f56q 1518780269 4 4 3 1.0 209 0 0 0.0 0 0 0
2024-04-09 01:00:24 2 af4692qv1f56q 1518780269 1 2097 2010 1.0 265 0 0 0.0 0 0 0
2024-04-09 02:00:06 1 af4692qv1f56q 1518780269 3 5 5 1.0 209 0 0 0.0 0 0 0
2024-04-09 02:00:06 2 af4692qv1f56q 1518780269 1 2095 1722 1.0 265 0 0 0.0 0 0 0
2024-04-09 03:00:18 1 af4692qv1f56q 1518780269 3 4 4 1.0 209 0 0 0.0 0 0 0
2024-04-09 03:00:18 2 af4692qv1f56q 1518780269 1 2022 1846 1.0 265 0 0 0.0 0 0 0
2024-04-09 04:00:05 1 af4692qv1f56q 1518780269 2 5 5 1.0 209 0 0 0.0 0 0 0
2024-04-09 04:00:05 2 af4692qv1f56q 1518780269 2 1007 998 1.0 237 0 0 0.0 0 0 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2024-04-09 05:00:31 2 af4692qv1f56q 1518780269 5 4 3 1.0 209 0 0 0.0 0 0 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2024-04-09 05:00:31 1 af4692qv1f56q 1518780269 1 4 3 1.0 209 0 0 0.0 0 0 0
2024-04-09 06:00:33 2 af4692qv1f56q 1518780269 3 4 3 1.0 209 0 0 0.0 0 0 0
2024-04-09 06:00:34 1 af4692qv1f56q 1518780269 2 3 3 1.0 209 0 0 0.0 0 0 0
2024-04-09 07:00:53 1 af4692qv1f56q 1518780269 2 4 4 1.0 209 0 0 0.0 0 0 0
2024-04-09 07:00:53 2 af4692qv1f56q 1518780269 3 4 3 1.0 209 0 0 0.0 0 0 0
2024-04-09 08:00:07 2 af4692qv1f56q 1518780269 3 5 4 1.0 209 0 0 0.0 0 0 0
2024-04-09 08:00:07 1 af4692qv1f56q 1518780269 2 4 4 1.0 209 0 0 0.0 0 0 0
18 rows selected.

--//可以發現一個特點,執行沒有變化PLAN_HASH_VALUE=1518780269,例項2執行要有一點慢,也有執行快的時候,如果你仔細看就可以發現
--//這條sql語句執行次數很少.無論是例項1還是例項2.按照道理這類sql語句應該不會記錄在dba_hist_sqlstat裡面,可以推斷,因為例項
--//2共享記憶體緊張,該語句經常被踢出共享池,導致執行出現硬分析導致的情況,而且每次硬分析時間還"很長".

SYS@127.0.0.1:9014/ywdb> @ ashtop inst_id,module sql_id='af4692qv1f56q' &day ''
Total Distinct Distinct Distinct
Seconds AAS %This INST_ID MODULE FIRST_SEEN LAST_SEEN Execs Seen Tstamps Execs Seen1
--------- ------- ------- ---------- ------------------------------ ------------------- ------------------- ---------- -------- -----------
150 .0 100% | 2 httpd.exe 2024-04-08 09:15:24 2024-04-09 04:25:32 1 150 150
--//僅僅例項2出現效能問題.
--//先看看是什麼語句:

SYS@127.0.0.1:9014/ywdb> @ sqlid_fms af4692qv1f56q
C200
---------------------------------------------------------------------------------------------------------------------------------
SELECT D.constraint_type as CONSTRAINT_TYPE, C.COLUMN_NAME, C.position, D.r_constraint_name,
E.table_name as table_ref, f.column_name as column_ref,
C.table_name
FROM ALL_CONS_COLUMNS C
inner join ALL_constraints D on D.OWNER = C.OWNER and D.constraint_name = C.constraint_name
left join ALL_constraints E on E.OWNER = D.r_OWNER and E.constraint_name = D.r_constraint_name
left join ALL_cons_columns F on F.OWNER = E.OWNER and F.constraint_name = E.constraint_name and F.position = c.position
WHERE C.OWNER = 'PORTAL_HIS'
and C.table_name = 'DATA_EXCHANGE_FP'
and D.constraint_type <> 'P'
order by d.constraint_name, c.position
--//多數情況這語句是遞迴呼叫執行的,這類語句裡面的檢視定義非常複雜,如果檢視執行計劃非常複雜.第1次分析時間長,不可避免的.

time unit : microsecond

SQL_ID CHILD_NUMBER INST_ID FIRST_LOAD_TIME BUFFER_GETS EXECUTIONS ELAPSED_TIME AVG_ELA_TIME PLAN_HASH_VALUE LIOS_PER_EXEC FORCE_MATCHING_SIGNATURE
------------- ------------ ---------- -------------------------------------- ----------- ---------- ------------ ------------ --------------- ------------- ------------------------
af4692qv1f56q 0 2 2024-03-29/10:25:38 3400 16 2071474 129467.13 1518780269 213 3702873826194195448
af4692qv1f56q 0 1 2024-03-29/10:04:51 131289 628 4634160 7379.24 1518780269 209 3702873826194195448

--//可以發現例項2記錄的執行次數少(16次),可以推斷該語句一直呆在例項1共享池中,而例項2經常被刷出共享池,導致例項2的執行出現硬分析.
--//如果仔細看sql語句的執行計劃就知道,這條sql語句的執行計劃非常複雜,可以參考連結:
--//https://blog.itpub.net/267265/viewspace-3008682/ => 20240311]sql_id=c8s65f1cuhcb1的最佳化.txt ,與這個例子有一些類似.
--//如何驗證我的推斷呢?

SYS@127.0.0.1:9014/ywdb> @ ashtop inst_id,event sql_id='af4692qv1f56q' &day ''
Total Distinct Distinct Distinct
Seconds AAS %This INST_ID EVENT FIRST_SEEN LAST_SEEN Execs Seen Tstamps Execs Seen1
--------- ------- ------- ---------- ------------------------------------------ ------------------- ------------------- ---------- -------- -----------
148 .0 100% | 2 2024-04-08 09:25:30 2024-04-09 04:25:32 1 148 148

--//event記錄為空,基本消耗CPU上,如果知道消耗集中在分析上呢?
--//最新的ashtop指令碼加入了time_model_name,可以瞭解一些細節.

SYS@127.0.0.1:9014/ywdb> @ ashtop inst_id,time_model_name,time_model sql_id='af4692qv1f56q' &day ''
Total Distinct Distinct Distinct
Seconds AAS %This INST_ID TIME_MODEL_NAME TIME_MODEL FIRST_SEEN LAST_SEEN Execs Seen Tstamps Execs Seen1
--------- ------- ------- ---------- -------------------------------------------------- ---------- ------------------- ------------------- ---------- -------- -----------
148 .0 100% | 2 PARSE HARD_PARSE 144 2024-04-08 09:25:30 2024-04-09 04:25:32 1 148 148

--//很明顯CPU的消耗集中在PARSE HARD_PARSE,注:我猜測PARSE表示軟分析,HARD_PARSE是硬分析.
--//TIME_MODEL= 144 = 0x90
$ echo "obase=2;144" | bc -l
10010000
--//第4,7位為1,從0開始計數.

--//看看ashtop.sql指令碼的其中一段:
, CASE WHEN BITAND(time_model, POWER(2, 01)) = POWER(2, 01) THEN 'DBTIME ' END
||CASE WHEN BITAND(time_model, POWER(2, 02)) = POWER(2, 02) THEN 'BACKGROUND ' END
||CASE WHEN BITAND(time_model, POWER(2, 03)) = POWER(2, 03) THEN 'CONNECTION_MGMT ' END
||CASE WHEN BITAND(time_model, POWER(2, 04)) = POWER(2, 04) THEN 'PARSE ' END
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||CASE WHEN BITAND(time_model, POWER(2, 05)) = POWER(2, 05) THEN 'FAILED_PARSE ' END
||CASE WHEN BITAND(time_model, POWER(2, 06)) = POWER(2, 06) THEN 'NOMEM_PARSE ' END
||CASE WHEN BITAND(time_model, POWER(2, 07)) = POWER(2, 07) THEN 'HARD_PARSE ' END
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||CASE WHEN BITAND(time_model, POWER(2, 08)) = POWER(2, 08) THEN 'NO_SHARERS_PARSE ' END
||CASE WHEN BITAND(time_model, POWER(2, 09)) = POWER(2, 09) THEN 'BIND_MISMATCH_PARSE ' END
||CASE WHEN BITAND(time_model, POWER(2, 10)) = POWER(2, 10) THEN 'SQL_EXECUTION ' END
||CASE WHEN BITAND(time_model, POWER(2, 11)) = POWER(2, 11) THEN 'PLSQL_EXECUTION ' END
||CASE WHEN BITAND(time_model, POWER(2, 12)) = POWER(2, 12) THEN 'PLSQL_RPC ' END
||CASE WHEN BITAND(time_model, POWER(2, 13)) = POWER(2, 13) THEN 'PLSQL_COMPILATION ' END
||CASE WHEN BITAND(time_model, POWER(2, 14)) = POWER(2, 14) THEN 'JAVA_EXECUTION ' END
||CASE WHEN BITAND(time_model, POWER(2, 15)) = POWER(2, 15) THEN 'BIND ' END
||CASE WHEN BITAND(time_model, POWER(2, 16)) = POWER(2, 16) THEN 'CURSOR_CLOSE ' END
||CASE WHEN BITAND(time_model, POWER(2, 17)) = POWER(2, 17) THEN 'SEQUENCE_LOAD ' END
||CASE WHEN BITAND(time_model, POWER(2, 18)) = POWER(2, 18) THEN 'INMEMORY_QUERY ' END
||CASE WHEN BITAND(time_model, POWER(2, 19)) = POWER(2, 19) THEN 'INMEMORY_POPULATE ' END
||CASE WHEN BITAND(time_model, POWER(2, 20)) = POWER(2, 20) THEN 'INMEMORY_PREPOPULATE ' END
||CASE WHEN BITAND(time_model, POWER(2, 21)) = POWER(2, 21) THEN 'INMEMORY_REPOPULATE ' END
||CASE WHEN BITAND(time_model, POWER(2, 22)) = POWER(2, 22) THEN 'INMEMORY_TREPOPULATE ' END
||CASE WHEN BITAND(time_model, POWER(2, 23)) = POWER(2, 23) THEN 'TABLESPACE_ENCRYPTION ' END time_model_name
--//注意看下劃線,cpu消耗集中在PARSE 以及HARD_PARSE上,這也驗證我的判斷.

3.繼續:
--//按照連結https://blog.itpub.net/267265/viewspace-3008682/ => 20240311]sql_id=c8s65f1cuhcb1的最佳化.txt
--//應該還有最佳化空間,嘗試看看,不行放棄!!
--//像這類硬分析很慢的sql語句如何解決呢,可以使用sql profile來控制減少分析時間.
--//參考 https://blog.itpub.net/267265/viewspace-2151428/=>[20180302]sql profile能減少分析時間嗎?
--//注意如果你使用sql profile,一般情況下記錄是一些統計資訊,並不是執行計劃的outline.
--//必須使用sql profile交換方法.

SYS@127.0.0.1:9014/ywdb> @ spsw af4692qv1f56q 0 af4692qv1f56q 0 '' true
PL/SQL procedure successfully completed.
=================================================================================================================================================
if drop or alter sql profile ,run :
execute dbms_sqltune.drop_sql_profile(name => 'switch tuning af4692qv1f56q')
execute dbms_sqltune.alter_sql_profile(name => 'switch tuning af4692qv1f56q',attribute_name=>'STATUS',value=>'DISABLED')
=================================================================================================================================================

--//先禁用sql profile
SYS@127.0.0.1:9014/ywdb> execute dbms_sqltune.alter_sql_profile(name => 'switch tuning af4692qv1f56q',attribute_name=>'STATUS',value=>'DISABLED')
PL/SQL procedure successfully completed.

$ cat aa.txt
alter session set current_schema=PORTAL_PPI;
alter session set statistics_level=all;
column COLUMN_REF format a20

SELECt D.constraint_type as CONSTRAINT_TYPE, C.COLUMN_NAME, C.position, D.r_constraint_name,
E.table_name as table_ref, f.column_name as column_ref,
C.table_name
FROM ALL_CONS_COLUMNS C
inner join ALL_constraints D on D.OWNER = C.OWNER and D.constraint_name = C.constraint_name
left join ALL_constraints E on E.OWNER = D.r_OWNER and E.constraint_name = D.r_constraint_name
left join ALL_cons_columns F on F.OWNER = E.OWNER and F.constraint_name = E.constraint_name and F.position = c.position
WHERE C.OWNER = 'PORTAL_HIS'
and C.table_name = 'DATA_EXCHANGE_FP'
and D.constraint_type <> 'P'
order by d.constraint_name, c.position;
alter session set current_schema=SYS ;
--//注:適當將SELECT換成小寫,做小量修改.

SYS@127.0.0.1:9014/ywdb> @ aa.txt
Session altered.
Session altered.
C COLUMN_NAME POSITION R_CONSTRAINT_NAME TABLE_REF COLUMN_REF TABLE_NAME
- -------------------- ---------- ------------------------------ ------------------------------ -------------------- ----------
C NO DATA_EXCHANGE_FP
Elapsed: 00:00:02.30
Session altered.
--//可以發現第1次執行需要硬分析總是在2秒上下.

SYS@127.0.0.1:9014/ywdb> execute dbms_sqltune.alter_sql_profile(name => 'switch tuning af4692qv1f56q',attribute_name=>'STATUS',value=>'ENABLED')
PL/SQL procedure successfully completed.

--//適當修改aa.txt裡面的sql語句,就是順便找一個字元換成小寫.我的測試依次將SELECT換成小寫,這樣sql profile依舊生效.

SYS@127.0.0.1:9014/ywdb> @ aa.txt
Session altered.
Session altered.

C COLUMN_NAME POSITION R_CONSTRAINT_NAME TABLE_REF COLUMN_REF TABLE_NAME
- -------------------- ---------- ------------------------------ ------------------------------ -------------------- ----------
C NO DATA_EXCHANGE_FP
Elapsed: 00:00:01.24
Session altered.

--//再次小量修改aa.txt
SYS@127.0.0.1:9014/ywdb> @ aa.txt
Session altered.
Session altered.
C COLUMN_NAME POSITION R_CONSTRAINT_NAME TABLE_REF COLUMN_REF TABLE_NAME
- -------------------- ---------- ------------------------------ ------------------------------ -------------------- ----------
C NO DATA_EXCHANGE_FP
Elapsed: 00:00:00.92
Session altered.

--//可以發現這樣第一次執行的硬分析遠比原來沒有使用sql profile的時間小.

--//不做修改:
SYS@127.0.0.1:9014/ywdb> @ aa.txt
Session altered.
Session altered.

C COLUMN_NAME POSITION R_CONSTRAINT_NAME TABLE_REF COLUMN_REF TABLE_NAME
- -------------------- ---------- ------------------------------ ------------------------------ -------------------- ----------
C NO DATA_EXCHANGE_FP
Elapsed: 00:00:00.02
Session altered.

--//因為語句已經在共享池,做的是軟分析,執行很快.
--//順便說一下,該問題也許跟我最近設定一些會話設定cursor_sharing=exact有關,謂詞存在to_char函式,並且大量語句使用文字變數.
--//等一段時間觀察看看.

SYS@127.0.0.1:9014/ywdb> @ sqlhh af4692qv1f56q 4/24
time unit : millisecond
BEGIN_INTERVAL_TIME INST_ID SQL_ID PLAN_HASH_VALUE EXECUTIONS ELA_MS_PER_EXEC CPU_MS_PER_EXEC ROWS_PER_EXEC LIOS_PER_EXEC BLKRD_PER_EXEC IOW_MS_PER_EXEC AVG_IOW_MS CLW_MS_PER_EXEC APW_MS_PER_EXEC CCW_MS_PER_EXEC
------------------- ---------- ------------- --------------- ---------- --------------- --------------- ------------- ------------- -------------- --------------- ----------- --------------- --------------- ---------------
2024-04-09 12:00:15 2 af4692qv1f56q 1518780269 1 790 765 1.0 265 0 0 0.0 0 0 0
2024-04-09 12:00:15 1 af4692qv1f56q 1518780269 1 5 5 1.0 209 0 0 0.0 0 0 0
2024-04-09 13:00:23 1 af4692qv1f56q 1518780269 1 4 4 1.0 209 0 0 0.0 0 0 0
2024-04-09 13:00:23 2 af4692qv1f56q 1518780269 1 773 641 1.0 265 0 0 0.0 0 0 0
2024-04-09 14:00:00 1 af4692qv1f56q 1518780269 2 4 5 1.0 209 0 0 0.0 0 0 0
2024-04-09 14:00:00 2 af4692qv1f56q 1518780269 1 884 875 1.0 307 0 0 0.0 0 0 0
6 rows selected.
--//下午觀察,現在平均8XX毫秒,問題還是在例項2.

SYS@127.0.0.1:9014/ywdb> @ dashtop to_char(sample_time,'yyyymmdd'),time_model sql_id='af4692qv1f56q' &100day
Total
Seconds AAS %This TO_CHAR TIME_MODEL FIRST_SEEN LAST_SEEN
--------- ------- ------- -------- ---------- ------------------- -------------------
180 .0 31% 20240408 144 2024-04-08 00:40:34 2024-04-08 23:18:10
110 .0 19% 20240407 144 2024-04-07 13:33:56 2024-04-07 23:09:44
90 .0 16% 20240329 144 2024-03-29 09:44:09 2024-03-29 23:48:32
70 .0 12% 20240313 144 2024-03-13 10:04:45 2024-03-13 21:12:21
30 .0 5% 20240401 144 2024-04-01 15:31:07 2024-04-01 16:44:27
30 .0 5% 20240409 144 2024-04-09 00:09:17 2024-04-09 13:24:14
20 .0 3% 20240311 144 2024-03-11 20:12:36 2024-03-11 21:53:53
20 .0 3% 20240330 144 2024-03-30 00:48:52 2024-03-30 00:58:55
10 .0 2% 20240331 144 2024-03-31 21:49:00 2024-03-31 21:49:00
10 .0 2% 20240401 1024 2024-04-01 20:51:10 2024-04-01 20:51:10
10 .0 2% 20240409 1024 2024-04-09 09:37:57 2024-04-09 09:37:57
11 rows selected.

4.附上sqlid_fms.sql指令碼:

$ cat sqlid_fms.sql
-- Copyright 2023 lfree. All rights reserved.
-- Licensed under the Apache License, Version 2.0. See LICENSE.txt for terms and conditions.
--------------------------------------------------------------------------------
--
-- File name: sqlid_fms.sql
-- Purpose: query sql_id to find force_matching_signature from gv$sql
--
-- Author: lfree
--
-- Usage:
-- @ sqlid_fms.sql <sql_id>
--
--------------------------------------------------------------------------------

SELECT REPLACE (sql_fulltext, CHR (13), '') c200
FROM gv$sqlarea
WHERE sql_id = '&&1' AND ROWNUM = 1;

prompt
prompt time unit : microsecond
prompt

SELECT sql_id
, child_number
, inst_id
, first_load_time
, buffer_gets
, executions
, elapsed_time
, ROUND (elapsed_time / nullif(executions, 0), 2) avg_ela_time
, plan_hash_value, ROUND (buffer_gets / nullif(executions, 0), 2) lios_per_exec
, force_matching_signature
FROM gv$sql
WHERE sql_id = '&&1'
ORDER BY 4 DESC;

相關文章