Oracle ADDM 自動診斷監視工具介紹及使用

lygle發表於2013-03-28

Oracle AWR 介紹(AWR -- Automatic Workload Repository)

一. ADDM概述

ADDM(Automatic Database Diagnostic Monitor) 是植入Oracle資料庫的一個自診斷引擎.ADDM 通過檢查和分析AWR獲取的資料來判斷Oracle資料庫中可能的問題.

在Oracle9i及之前，DBA們已經擁有了很多很好用的效能分析工具，比如，tkprof、sql_trace、statspack、set event 10046&10053等等。這些工具能夠幫助DBA很快的定位效能問題。但這些工具都只給出一些統計資料，然後再由DBA們根據自己的經驗進行優化。
Oracle10g中推出了新的優化診斷工具：資料庫自動診斷監視工具（Automatic Database Diagnostic Monitor :ADDM）和SQL優化建議工具（SQL Tuning Advisor: STA）。這兩個工具的結合使用，能使DBA節省大量優化時間，也大大減少了系統當機的危險。簡單點說，ADDM就是收集相關的統計資料到自動工作量知識庫(Automatic Workload Repository :AWR)中，而STA則根據這些資料，給出優化建議。

例如，一個系統資源緊張，出現了明顯的效能問題，由以往的辦法，做個一個statspack快照，等30分鐘，再做一次。檢視報告，發現’ db file scattered read’事件在top 5 events裡面。根據經驗，這個事件一般可能是因為缺少索引、統計分析資訊不夠新、熱表都放在一個資料檔案上導致IO爭用等原因引起的。根據這些經驗，我們需要逐個來定位排除，比如檢視語句的查詢計劃、檢視user_tables的last_analysed子段，檢查熱塊等等步驟來最後定位出原因，並給出優化建議。但是，有了STA以後，它就可以根據ADDM採集到的資料直接給出優化建議，甚至給出優化後的語句。

ADDM能發現定位的問題包括：
•作業系統記憶體頁入頁出問題
•由於Oracle負載和非Oracle負載導致的CPU瓶頸問題
•導致不同資源負載的Top SQL語句和物件——CPU消耗、IO頻寬佔用、潛在IO問題、RAC內部通訊繁忙
•按照PLSQL和JAVA執行時間排的Top SQL語句.
•過多地連線 (login/logoff).
•過多硬解析問題——由於shared pool過小、書寫問題、繫結大小不適應、解析失敗原因引起的。
•過多軟解析問題
•索引查詢過多導致資源爭用.
•由於使用者鎖導致的過多的等待時間 (通過包dbms_lock加的鎖)
•由於DML鎖導致的過多等待時間(例如鎖住表了)
•由於管道輸出導致的過多等待時間(如通過包dbms_pipe.put進行管道輸出)
•由於併發更新同一個記錄導致的過多等待時間(行級鎖等待)
•由於ITL不夠導致的過多等待時間（大量的事務操作同一個資料塊）
•系統中過多的commit和rollback(logfile sync事件).
•由於磁碟頻寬太小和其他潛在問題（如由於logfile太小導致過多的checkpoint，MTTR設定問題，過多的undo操作等等）導致的IO效能問題I
•對於DBWR程式寫資料塊，磁碟IO吞吐量不足
•由於歸檔程式無法跟上redo日至產生的速度，導致系統變慢
•redo資料檔案太小導致的問題
•由於擴充套件磁碟分配導致的爭用
•由於移動一個物件的高水位導致的爭用問題
•記憶體太小問題——SGA Target, PGA, Buffer Cache, Shared Pool
•在一個例項或者一個機群環境中存在頻繁讀寫爭用的熱塊
•在一個例項或者一個機群環境中存在頻繁讀寫爭用的熱物件
•RAC環境中內部通訊問題
•LMS程式無法跟上導致鎖請求阻塞
•在RAC環境中由於阻塞和爭用導致的例項傾斜
•RMAN導致的IO和CPU問題
•Streams和AQ問題
•資源管理等待事件

ADDM提供了一個整體的優化方案.基於一段時間內的AWR snapshots(預設一小時一次)可以執行ADDM 分析,它可以幫我們診斷在這段期間內資料庫可能存在的瓶頸.

ADDM分析的目標是減小吞吐量的度量值, 在這裡我們將它稱為"DB TIME". DB TIME是一個累積值(資料庫伺服器處理使用者請求所花費的時間). 它包括了等待時間和CPU處理的時間(針對所有活躍的使用者程式而言),可以通過查詢下面兩個檢視來獲得它的值: V$SESS_TIME_MODEL, V$SYS_TIME_MODEL.

AWR收集的資料時放到記憶體中（share pool），通過一個新的後臺程式MMON定期寫到磁碟中。所以10g的share pool要求比以前版本更大，一般推薦比以前大15-20%。

注意: ADDM不會將處理使用者響應時間作為調優的目標, 你應該使用"TRACE"技術來監控它.

通過減小"DB TIME", 使用同樣多的系統資源,資料庫伺服器可以處理更多的使用者請求,也就是提高了吞吐量. 通過ADDM報告的問題是按照DB time排序的.

二. ADDM 分析的結果
ADDM 分析的結果以一些"Finding"的樣式來表達. 每個"Finding"都屬於以下三種型別之一:

1. 問題: 描述了導致資料庫效能問題的根源;
2. 徵兆: 包含了可能導致其他問題的資訊
3. 資訊: 報告其他沒有問題的模組

三. 設定ADDM
預設情況下,ADDM已經被自動啟用,通過初始化引數檔案中的STATISTICS_LEVEL來控制.
這個引數應該被設定成TYPICAL或者ALL(預設值是TYPICAL).如果你將這個引數設定成basic,很多Oracle的特性將被遮蔽.
ALTER SESSION SET STATISTICS_LEVEL= TYPICAL;

ADDM 對於I/O效能的評估分析在部分程度上依賴於這個DBIO_EXPECTED. 這個引數的含義是讀取一個資料塊所花費的平均時間(以微秒為單位). Oracle使用的是預設值(10毫秒), 對於現在流行的硬碟來說, 這是一個比較合適的值.如果你的硬碟比較陳舊,或者你有一個非常好的RAM DISK,請修改這個值.

為了決定DBIO_EXPECTED這個引數該怎樣去正確地配置,需要完成下面的步驟:

1. 基於你的機器的硬體,估量一下讀取單個資料庫塊所花費的平均時間.
注意:這個度量應該針對隨機的I/O(包括尋道的時間).傳統的值應該屬於5000-20000微秒這個區間.

2. 為接下來的ADDM執行設定一個時間引數. 例如:如果估計的值是8000微秒,你應該以SYS的身份執行
下面的過程:

EXECUTE DBMS_ADVISOR.SET_DEFAULT_PARAMETER ('ADDM','DBIO_EXPECTED',8000);

四. 通過Oracle Enterprise Manager來訪問ADDM:

五. 診斷與ADDM相關的問題:
為了診斷資料庫效能問題, ADDM分析可以跨越任意兩個snapshots,只要它們滿足下面兩個條件:
1. 兩個快照在建立過程中沒有錯誤並且沒有被刪除;
2. 兩個快照期間資料庫不能發生關閉和啟動的事件
(同statspack).

最簡單的執行ADDM分析的方法就是執行Enterprise Manager.
另外,也可以手工地執行 $ORACLE_HOME/rdbms/admin/addmrpt.sql以及dbms_advisor包.
這些指令碼和包可以被任何使用者執行,只要它們被賦予了ADVISOR的角色.

5.1 使用addmrpt.sql來執行
和statspack包中的spreport.sql非常相似

5.2 使用dbms_advisor包:
基本步驟:
1) 建立一個task: dbms_advisor.create_task ;

2) 設定相關的引數:
START_SNAPSHOT,END_SNAPSHOT
(通過DBMS_ADVISOR.SET_TASK_PARAMETER來完成)

3) 執行這個task: DBMS_ADVISOR.E

六. 與 ADDM相關的檢視:
DBA_ADVISOR_TASKS
DBA_ADVISOR_LOG
DBA_ADVISOR_RECOMMENDATIONS
DBA_ADVISOR_FINDINGS

七．工作採集、診斷過程
Oracle10g提供了一個圖形化的介面（通過OEM），使這個工具使用起來非常簡單。下面這裡介紹一下如何通過sqlplus使用這個工具。

第一步：建立測試用的表
SQL> CREATE TABLE bigtab AS SELECT rownum as "id", a.* FROM dba_objects a;
Table created.
SQL> create table smalltab as select rownum as "id", a.* FROM dba_tables a;
Table created.
SQL> ALTER TABLE bigtab MODIFY (empno NUMBER);
Table altered.
SQL> DECLARE
2 n NUMBER;
3 BEGIN
4 FOR n IN 1..100
5 LOOP
6 INSERT INTO bigtab SELECT rownum as "id", a.* FROM dba_objects a;
7 COMMIT;
8 END LOOP;
9 END;
/
PL/SQL procedure successfully completed.

第二步：採集一次工作量快照
SQL> begin
2 dbms_workload_repository.create_snapshot('TYPICAL');
3 end;
4 /
PL/SQL procedure successfully completed.

第三步：進行一些高負荷操作
DECLARE
v_var number;
BEGIN
FOR n IN 1..6
LOOP
select count(*) into v_var from bigtab b, smalltab a;
END LOOP;
END;
/
PL/SQL procedure successfully completed.

第四步：再次採集一次工作量快照
要注意的是：兩次快照之間的間隔時間必須足夠（一般推薦30分鐘左右），否則得到的ADDM報告中就會提示：THERE WAS NOT ENOUGH DATABASE TIME FOR ADDM ANALYSIS.
SQL> begin
2 dbms_workload_repository.create_snapshot('TYPICAL');
3 end;
4 /
PL/SQL procedure successfully completed.

第五步：建立一個優化診斷任務並執行

先獲取到兩次快照的ID:
SQL> select snap_id from
2 (SELECT * FROM dba_hist_snapshot
3 ORDER BY snap_id desc)
4 where rownum <=2;
SNAP_ID
--------
66
65

然後建立優化任務，並執行。

DECLARE
task_name VARCHAR2(30) := 'DEMO_ADDM01';
task_desc VARCHAR2(30) := 'ADDM Feature Test';
task_id NUMBER;
BEGIN
dbms_advisor.create_task('ADDM', task_id, task_name, task_desc, null);
dbms_advisor.set_task_parameter(task_name, 'START_SNAPSHOT', 65);
dbms_advisor.set_task_parameter(task_name, 'END_SNAPSHOT', 66);
dbms_advisor.set_task_parameter(task_name, 'INSTANCE', 1);
dbms_advisor.set_task_parameter(task_name, 'DB_ID', 1712582900);
dbms_advisor.execute_task(task_name);
END;
/
PL/SQL procedure successfully completed.

DBID 檢視sql

SQL> select dbid from v$database;

DBID

----------

1712582900

其中，set_task_parameter是用來設定任務引數的。START_SNAPSHOT是起始快照ID，END_SNAPSHOT是結束快照ID，INSTANCE是例項號，對於單例項，一般是1，在RAC環境下，可以通過查詢檢視v$instance得到，DB_ID是資料庫的唯一識別號，可以通過查詢v$database查到。

第六步：檢視優化建議結果
通知函式dbms_advisor.get_task_report可以得到優化建議結果。

SQL> SET LONG 1000000 PAGESIZE 0 LONGCHUNKSIZE 1000
SQL> COLUMN get_clob FORMAT a80
SQL> SELECT dbms_advisor.get_task_report('DEMO_ADDM01', 'TEXT', 'ALL') FROM DUAL;

DBMS_ADVISOR.GET_TASK_REPORT('
--------------------------------------------------------------------------------
DETAILED ADDM REPORT FOR TASK 'DEMO_ADDM01' WITH ID 243
-------------------------------------------------------

Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
Database ID/Instance: 1712582900/1
Database/Instance Names: EDGAR/edgar
Host Name: HUANGED
Database Version: 10.2.0.1.0
Snapshot Range: from 65 to 66
Database Time: 1463 seconds
Average Database Load: .4 active sessions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
ACTION: Host operating system was experiencing significant paging but no
particular root cause could be detected. Investigate processes that
do not belong to this instance running on the host that are consuming
significant amount of virtual memory. Also consider adding more
physical memory to the host.

FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
DECLARE
v_var number;
BEGIN
FOR n IN 1..10000
LOOP
select count(*) into v_var from bigtab b, smalltab a;
END LOOP;
END;

RECOMMENDATION 2: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.

FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.
RATIONALE: Average CPU used per execution was 162 seconds.

RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average CPU used per execution was 0.24 seconds.

FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.

RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ADDITIONAL INFORMATION
----------------------

Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TERMINOLOGY
-----------

DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
point of view: this is the total amount of time spent by users waiting for
a response from the database after issuing a call (not including
networking). From the database instance point of view: this is the total
time spent by forground processes waiting for a database resource (e.g.,
read I/O), running on the CPU and waiting for a free CPU (run-queue). The
target of ADDM analysis is to reduce this metric as much as possible,
thereby reducing the instance's response time.

AVERAGE DATABASE LOAD: At any given time we can count how many users (also
called 'Active Sessions') are waiting for an answer from the instance. This
is the ADDM's measurement for instance load. The 'Average Database Load' is
the average of the the load measurement taken over the entire analysis
period. We get this number by dividing the 'Database Time' by the analysis
period. For example, if the analysis period is 30 minutes and the 'Database
Time' is 90 minutes, we have an average of 3 users waiting for a response.

IMPACT: Each finding has an 'Impact' associated with it. The impact is the
portion of the 'Database Time' the finding deals with. If we assume that
the problem described by the finding is completely solved, then the
'Database Time' will be reduced by the amount of the 'Impact'.

BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
analysis estimates that the 'Database Time' can be reduced by the 'benefit'
amount if all the actions of the recommendation are performed.

說明：其中第五步到第六步可以直接執行$ORACLE_HOME/rdbms/admin/addmrpt.sql來得到，這個指令碼的執行過程和statspack指令碼執行過程類似：

SQL> @addmrpt

Current Instance
~~~~~~~~~~~~~~~~

DB Id DB Name Inst Num Instance
----------- ------------ -------- ------------
1712582900 EDGAR 1 edgar

Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

DB Id Inst Num DB Name Instance Host
------------ -------- ------------ ------------ ------------
* 1712582900 1 EDGAR edgar HUANGED

Using 1712582900 for database Id
Using 1 for instance number

Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing without
specifying a number lists all completed snapshots.

Listing the last 3 days of Completed Snapshots

Snap
Instance DB Name Snap Id Snap Started Level
------------ ------------ --------- ------------------ -----
edgar EDGAR 7 22 Nov 2005 00:00 1
... ...
64 23 Nov 2005 15:02 1
65 23 Nov 2005 16:00 1
66 23 Nov 2005 16:06 1

Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 65
Begin Snapshot Id specified: 66

Enter value for end_snap: 66
End Snapshot Id specified: 66

Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is addmrpt_1_65_66.txt. To use this name,
press to continue, otherwise enter an alternative.

Enter value for report_name:

Using the report name addmrpt_1_65_66.txt

Running the ADDM analysis on the specified pair of snapshots ...

Generating the ADDM report for this analysis ...

... ...

此外，如果是RAC環境下，可以執行$ORACLE_HOME/rdbms/admin/addmrpti.sql，這指令碼的執行，會多出要求輸入DB ID和instance ID的要求。

八．診斷結果分析

我們從上面的建議結果看到了，ADDM Report的結果與Statspack Report的結果大不相同。Statspack Report的結果給出的都是統計資料、各種事件，然後由DBA根據這些資料給出優化建議，而ADDM Report的結果包含就已經是給出的優化建議了

第一部分：
Analysis Period: 23-NOV-2005 from 15:02:27 to 16:06:42
Database ID/Instance: 1712582900/1
Database/Instance Names: EDGAR/edgar
Host Name: HUANGED
Database Version: 10.2.0.1.0
Snapshot Range: from 65 to 66
Database Time: 1463 seconds
Average Database Load: .4 active sessions

這一部分包括一些基礎資訊，分析時間段、DB和instance ID&名字、主機名字、Oracle版本、快照範圍、資料庫消耗時間、多少個活動會話。

第二部分：
下面就是ADDM發現的問題，並給出的相應建議。在我們這個例子中總共發現4個問題，下面一一解釋一下。

第一個問題：
FINDING 1: 100% impact (1463 seconds)
-------------------------------------
Significant virtual memory paging was detected on the host operating system.

RECOMMENDATION 1: Host Configuration, 100% benefit (1463 seconds)
ACTION: Host operating system was experiencing significant paging but no
particular root cause could be detected. Investigate processes that
do not belong to this instance running on the host that are consuming
significant amount of virtual memory. Also consider adding more
physical memory to the host.
先看第一行：100% impact (1463 seconds)，這是這個問題所持續的實踐及其對系統的影響，它的時間是1463秒，和分析期間的資料庫消耗時間（在第一部分中）是一樣（1463秒），所以對系統的影響是1463/1463*100=100%的。
再看第二行：Significant virtual memory paging was detected on the host operating system.，這是ADDM發現的這個問題的具體描述：在作業系統中發現有顯著的虛擬記憶體頁入頁出的問題。
然後看ADDM給出的建議及其作用：Host Configuration, 100% benefit (1463 seconds)——更改主機配置，100%有效。
最後是具體該如何操作：略——在主機的作業系統上發現了明顯的頁入頁出，但是沒有發現明顯導致記憶體頻繁換如換出的根本原因。需要仔細檢查那些消耗大量虛擬記憶體的程式（除Oracle例項外）。此外，還可以考慮增大主機的實體記憶體。說明一下：我的這個例項是跑在我自己的PC機上，Oracle執行的同時有大量的其他消耗記憶體的程式（word等）在執行，所以肯定有大量的記憶體交換存在。

再看第二個問題：
FINDING 2: 100% impact (1463 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

RECOMMENDATION 1: SQL Tuning, 68% benefit (998 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "064wqx7c5b81z". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 064wqx7c5b81z
DECLARE
v_var number;
BEGIN
FOR n IN 1..10000
LOOP
select count(*) into v_var from bigtab b, smalltab a;
END LOOP;
END;
ADDM發現有SQL語句在消耗大量資料庫時間，它的影響是100%的。給出的建議是優化SQL，能取得68%的效果。
具體操作是優化ADDM找到的PL/SQL塊，它的SQL_ID是"064wqx7c5b81z"（可以通過select sql_text from v$sql where sql_id=’064wqx7c5b81z’;查到）。至於如何優化SQL語句，可以參考Oracle文件PL/SQL User's Guide and Reference中的Tuning PL/SQL Applications章節。下面的內容便是我們用來插入資料的測試語句。

下面是ADDM發現的其他問題語句：
FINDING 3: 69% impact (1002 seconds)
------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

RECOMMENDATION 1: SQL Tuning, 67% benefit (986 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
"fvqfghq71cqns".
RELEVANT OBJECT: SQL statement with SQL_ID fvqfghq71cqns and
PLAN_HASH 3281046854
SELECT COUNT(*) FROM BIGTAB B, SMALLTAB A
RATIONALE: SQL statement with SQL_ID "fvqfghq71cqns" was executed 6
times and had an average elapsed time of 166 seconds.
RATIONALE: Average CPU used per execution was 162 seconds.

RECOMMENDATION 2: SQL Tuning, 2.1% benefit (30 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average CPU used per execution was 0.24 seconds.
這個問題的描述是，例項消耗的CPU事件佔據了大量的資料庫執行時間。由於發現了兩條問題語句，所以這裡有兩個建議。
第一個建議就是優化我們的測試語句。並且說明了這個問題的根本原因：這條語句總共執行過6次，平均每次消耗了166秒。平均這個問題消耗的CPU時間是162秒。
第二個建議實際上是針對一個系統過程，這個過程是用來讀取佇列資訊的，消耗的資源比較小，我們這裡就不需要關心了。

再看最後一個問題：
FINDING 4: 2.2% impact (33 seconds)
-----------------------------------
PL/SQL execution consumed significant database time.

RECOMMENDATION 1: SQL Tuning, 2.2% benefit (33 seconds)
ACTION: Tune the PL/SQL block with SQL_ID "2b064ybzkwf1y". Refer to the
"Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL User's Guide
and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID 2b064ybzkwf1y
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
RATIONALE: SQL statement with SQL_ID "2b064ybzkwf1y" was executed 125
times and had an average elapsed time of 0.26 seconds.
RATIONALE: Average time spent in PL/SQL execution was 0.26 seconds.
從內容上看，這個問題就是上一個問題中的第二個建議。但是，它導致的結果是不一樣的。看這個問題的描述：PL/SQL的執行次數消耗了大量的資料庫時間。它的根本原因是因為執行次數太多（125次）。可見ADDM的問題檢查相當全面。

第三部分：
這一部分的內容是關於此次優化建議的一些附加資訊：
ADDITIONAL INFORMATION
----------------------

Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.

這是關於這次優化診斷對各類事件（在Oracle10g，新增了很多新的事件，主要是將原先一些較含糊的事件細化了，同時將所有事件進行了歸類。可以檢視檢視V$SYSTEM_WAIT_CLASS）的一些總結：Application、Commit、Concurrency、Configuration、Network、User I/O類等待事件沒有顯著消耗資料庫時間；會話連線、斷連請求沒有消耗大量資料庫時間；對SQL語句的硬解析沒有消耗大量資料庫時間；對IO效能的分析是基於預設假設每次讀一個資料塊的時間是10000微秒的。

第四部分：
這部分是對診斷報告中用到的術語的解釋：
TERMINOLOGY
-----------

DATABASE TIME: This is the ADDM's measurement of throughput. From the user's
point of view: this is the total amount of time spent by users waiting for
a response from the database after issuing a call (not including
networking). From the database instance point of view: this is the total
time spent by forground processes waiting for a database resource (e.g.,
read I/O), running on the CPU and waiting for a free CPU (run-queue). The
target of ADDM analysis is to reduce this metric as much as possible,
thereby reducing the instance's response time.

AVERAGE DATABASE LOAD: At any given time we can count how many users (also
called 'Active Sessions') are waiting for an answer from the instance. This
is the ADDM's measurement for instance load. The 'Average Database Load' is
the average of the the load measurement taken over the entire analysis
period. We get this number by dividing the 'Database Time' by the analysis
period. For example, if the analysis period is 30 minutes and the 'Database
Time' is 90 minutes, we have an average of 3 users waiting for a response.

IMPACT: Each finding has an 'Impact' associated with it. The impact is the
portion of the 'Database Time' the finding deals with. If we assume that
the problem described by the finding is completely solved, then the
'Database Time' will be reduced by the amount of the 'Impact'.

BENEFIT: Each recommendation has a 'benefit' associated with it. The ADDM
analysis estimates that the 'Database Time' can be reduced by the 'benefit'
amount if all the actions of the recommendation are performed.

DATABASE TIME：是ADDM的度量資料。從使用者角度看：這是從向資料庫請求開始，消耗在使用者等待響應上的全部時間（不包括網路響應時間）；從資料庫例項角度看：前臺程式消耗在等待一種資料庫資源（例如，IO讀）、CPU執行和等待CPU釋放（佇列等待）的總共時間。ADDM分析的目標就儘量降低這個數字，也就是減少例項響應時間
AVERAGE DATABASE LOAD：所有能統計到的有多少使用者（也稱為“活動會話”）等待例項響應。這是例項負荷的度量指標。平均資料庫負荷是由整個分析計算出來的平均負荷。通過“Database Time”除以分析週期時間得到。例如，分析週期時30分鐘，而資料庫執行消耗時間是90分鐘，那就說明平均有3個使用者在等待響應。
IMPACT：每一個找到的問題都有“影響”這一項。“影響”是資料庫消耗時間用於處理這個問題的時間不分。假定我們所找到的這個問題完全解決，那麼資料庫消耗時間就會相應減少“影響”時間。
BENEFIT：每一個找到的問題都“受益”這一項。如果所有建議操作得到實施，ADDM分析估計資料庫消耗時間能減少“受益”的全部時間。

找到了有問題的SQL後我們就可以用 Oracle SQL Tuning Advisor 工具來優化該SQL，關於STA的使用，請參考Blog：

如何用 SQL Tuning Advisor (STA) 優化SQL語句

http://space.itpub.net/28673746/viewspace-757331

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/28673746/viewspace-757337/，如需轉載，請註明出處，否則將追究法律責任。

Oracle ADDM 自動診斷監視工具介紹
2017-05-06
Oracle
Oracle10g資料庫自動診斷監視工具(ADDM)使用指南
2009-09-08
Oracle資料庫
自動資料庫診斷監控 ADDM（Automatic Database Diagnostic Monitor）！
2017-07-01
資料庫Database
9. Oracle常用分析診斷工具——9.3.ADDM
2020-03-04
Oracle
Oracle ADDM --dbms_addm執行oracle資料庫診斷
2017-06-12
Oracle資料庫
oracle診斷工具-RDA使用
2017-09-14
Oracle
Oracle診斷工具RDA使用
2007-07-06
Oracle
MYTFA(MYSQL 診斷資訊收集工具)介紹
2023-03-02
MySql
【轉】oracle診斷工具-RDA使用
2011-04-15
Oracle
oracle診斷工具-RDA使用(轉)
2009-08-07
Oracle
dbms_addm執行oracle資料庫診斷
2016-09-30
Oracle資料庫
oracle 效能診斷工具
2013-06-25
Oracle
2 Day DBA-管理方案物件-監控和優化資料庫-使用ADDM診斷效能問題-手動執行ADDM
2014-02-07
物件優化資料庫
MongoDB自帶的監控工具介紹
2016-01-12
MongoDB
oracle sqlt(sqltxplain) 診斷工具
2016-05-31
OracleSQLAI
JEB工具的介紹及使用
2020-12-03
Ansible自動化部署工具-元件及語法介紹
2023-11-09
元件
使用 Diagwait 作為診斷工具，獲取用於診斷 Oracle Clusterware 節點驅逐
2015-05-01
AIOracle
光模組數字診斷（DDM）功能介紹
2020-04-14
網路診斷工具的使用
2021-04-08
Oracle效能診斷檢視總結
2012-07-14
Oracle
使用ErrorStack進行Oracle錯誤跟蹤及診斷
2016-05-20
ErrorOracle
2 Day DBA-管理方案物件-監控和優化資料庫-效能自我診斷：自動資料庫診斷監控
2014-02-06
物件優化資料庫
官方壓測工具mysqlslap介紹及使用
2015-11-27
MySql
Alibaba Java診斷工具Arthas檢視Dubbo動態代理類
2022-04-07
Java
Oracle 補丁體系及 opatch 工具介紹
2014-01-10
Oracle
使用crsctl工具診斷cluster問題
2010-11-10
吃透 JVM 診斷方法與工具使用
2024-08-01
JVM
自動化運維工具Ansible介紹
2019-02-27
運維
mytop 使用介紹 mysql實時監控工具
2021-09-09
MySql
[原創]SpotLight效能監控工具使用介紹
2014-10-17
Oracle BBED 工具介紹
2016-12-19
Oracle
Oracle DBV 工具介紹
2011-01-18
Oracle
【Oracle】Opatch 工具介紹
2011-11-27
Oracle
ORACLE 檢視介紹
2006-12-27
Oracle
[原創]zabbix工具介紹，安裝及使用
2014-11-05
Xcode自帶的超好用的診斷工具
2018-05-09
XCode
ORACLE診斷案例
2019-03-26
Oracle

Oracle ADDM 自動診斷監視工具介紹及使用

相關文章