What is AWR?
======================================================================================
AWR (Automatic Workload Repository)
A repository of historical performance data stored in the SYSAUX tablespace. Both AWR and SYSAUX appeared in 10g and are key features for Oracle tuning; development started around 1999, so it has roughly 15 years of history.
The default snapshot interval is 1 hour; retention is 7 days in 10g and 8 days in 11g. Both can be changed with DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS.
The current interval and retention can be checked in DBA_HIST_WR_CONTROL.
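For example (a minimal sketch; the interval and retention values below are only illustrative, not recommendations), change the snapshot interval to 30 minutes and the retention to 15 days, then verify the result:
exec dbms_workload_repository.modify_snapshot_settings(interval => 30, retention => 15*24*60);  -- both parameters are in minutes
select snap_interval, retention from dba_hist_wr_control;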
The core of the AWR machinery is the dbms_workload_repository package
@?/rdbms/admin/awrrpt : report for the current instance
@?/rdbms/admin/awrrpti : lets you pick the instance number in RAC
Who maintains AWR?
Mainly MMON (Manageability Monitor Process) and its slave processes (m00x)
MMON's duties include:
1. Starting the m00x slave processes to take AWR snapshots
2. Raising an alert when a metric threshold is exceeded
3. Capturing metrics for recently modified SQL objects
AWR tips
Take a snapshot manually:
Exec dbms_workload_repository.create_snapshot; (memorize this one - having to dig through the manual when you need it is embarrassing!)
Create an AWR baseline:
Exec DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(start_snap_id, end_snap_id, baseline_name);
@?/rdbms/admin/awrddrpt : AWR compare-period (diff) report
@?/rdbms/admin/awrgrpt : RAC global AWR report
Automatically generate AWR HTML reports:
http://www.oracle-base.com/dba/10g/generate_multiple_awr_reports.sql
1. Report Summary
WORKLOAD REPOSITORY report for DB Name DB Id Instance Inst Num Startup Time Release RAC ------------ ----------- ------------ -------- --------------- ----------- --- MAC 2629627371 askmaclean.com 1 22-Jan-13 16:49 11.2.0.3.0 YES Host Name Platform CPUs Cores Sockets Memory(GB) ---------------- -------------------------------- ---- ----- ------- ---------- MAC10 AIX-Based Systems (64-bit) 128 32 320.00 Snap Id Snap Time Sessions Curs/Sess --------- ------------------- -------- --------- Begin Snap: 5853 23-Jan-13 15:00:56 3,520 1.8 End Snap: 5854 23-Jan-13 15:30:41 3,765 1.9 Elapsed: 29.75 (mins) DB Time: 7,633.76 (mins)
Elapsed is the wall-clock span covered by the AWR report (for example, if the earlier snapshot was taken at 4 o'clock and the later one at 6 o'clock, a report built from those two snapshots with @?/rdbms/admin/awrrpt has Elapsed = 6 - 4 = 2 hours). At least two snapshots are needed to generate an AWR report, and the instance must not have been restarted between them, otherwise generating a report from that snapshot pair raises an error. The metrics in an AWR report are generally the delta between the later and the earlier snapshot, because cumulative values by themselves cannot describe the workload of a given period.
DB Time = the total time all foreground sessions spend in database calls:
- note that this covers foreground sessions only
- it includes CPU time, I/O time and the whole range of non-idle wait time, and do not forget CPU-on-queue time
DB Time is not response time: high DB Time does not necessarily mean slow responses, and low DB Time does not necessarily mean fast ones.
DB Time describes the overall database load, but it only makes sense when read together with the elapsed time.
Average Active Sessions: AAS = DB time / Elapsed Time
DB Time = 60 min, Elapsed Time = 60 min, AAS = 60/60 = 1 : moderate load
DB Time = 1 min, Elapsed Time = 60 min, AAS = 1/60 : very light load
DB Time = 60,000 min, Elapsed Time = 60 min, AAS = 1000 : the system is probably hung
DB TIME = DB CPU + Non-Idle Wait + Wait on CPU queue
If there are only 2 logical CPUs and 2 sessions spend the whole 60 minutes on CPU with no waits, then:
DB CPU = 2 * 60 mins, DB Time = 2 * 60 + 0 + 0 = 120 mins
AAS = 120/60 = 2, which matches an OS load of exactly 2.
If 3 sessions each burn 100% CPU, one of them is always waiting on the run queue:
DB CPU = 2 * 60 mins, wait on CPU queue = 60 mins
AAS = (120 + 60)/60 = 3, and the host load is also 3; at this point vmstat shows processes waiting for run time.
In the real world? DB CPU = xx mins, Non-Idle Wait = enq: TX + cursor pin S on X + latch: xxx + db file sequential read + ... and all the other odds and ends.
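As a rough sketch (assuming the standard DBA_HIST views; :bid and :eid stand for the begin and end snapshot ids), AAS for a snapshot pair can be computed straight from the repository - DB time in the time model is stored in microseconds:
select round( (e.value - b.value) / 1e6
        / ( (cast(s2.end_interval_time as date) - cast(s1.end_interval_time as date)) * 86400 ), 2) aas
  from dba_hist_sys_time_model b, dba_hist_sys_time_model e,
       dba_hist_snapshot s1, dba_hist_snapshot s2
 where b.stat_name = 'DB time' and e.stat_name = 'DB time'
   and b.snap_id = :bid and e.snap_id = :eid
   and b.dbid = e.dbid and b.instance_number = e.instance_number
   and s1.snap_id = b.snap_id and s1.dbid = b.dbid and s1.instance_number = b.instance_number
   and s2.snap_id = e.snap_id and s2.dbid = e.dbid and s2.instance_number = e.instance_number;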
1-1 Memory Sizes
Cache Sizes Begin End ~~~~~~~~~~~ ---------- ---------- Buffer Cache: 49,152M 49,152M Std Block Size: 8K Shared Pool Size: 13,312M 13,312M Log Buffer: 334,848K
Memory management modes: MSMM, ASMM (sga_target), AMM (memory_target)
Small memory has small-memory problems, and big memory brings big-memory headaches. ORA-04031, anyone?!
Note that the begin/end values of Buffer Cache and Shared Pool Size can move under ASMM, AMM and even 11gR2 MSMM!
If the shared pool keeps shrinking here, some row cache objects may get locked during the shrink, causing foreground waits such as row cache lock during parsing, so it is best not to let the shared pool shrink. If the shared pool keeps growing, its original size did not meet the demand (possibly heavy hard parsing); diagnose it together with the parse statistics below and the SGA breakdown section.
1-2 Load Profile
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~ --------------- --------------- ---------- ---------- DB Time(s): 256.6 0.2 0.07 0.03 DB CPU(s): 3.7 0.0 0.00 0.00 Redo size: 1,020,943.0 826.5 Logical reads: 196,888.0 159.4 Block changes: 6,339.4 5.1 Physical reads: 5,076.7 4.1 Physical writes: 379.2 0.3 User calls: 10,157.4 8.2 Parses: 204.0 0.2 Hard parses: 0.9 0.0 W/A MB processed: 5.0 0.0 Logons: 1.7 0.0 Executes: 3,936.6 3.2 Rollbacks: 1,126.3 0.9 Transactions: 1,235.3 % Blocks changed per Read: 53.49 Recursive Call %: 98.04 Rollback per transaction %: 36.57 Rows per Sort: 73.70
Metric | Meaning |
redo size | Unit: bytes. Redo size gauges the rate of update/insert/delete activity; a large redo size puts I/O pressure on LGWR log writes and on archiving. The Per Transaction column distinguishes many small transactions from a few large ones. In the example above, about 1 MB of redo per second and about 800 bytes per transaction, which fits an OLTP profile. |
Logical Reads | Unit: reads * blocks (think of it as "person-visits"). In the example above, 196,888 * db_block_size = 1538 MB/s. Logical reads burn CPU, so clock speed and core count both matter; high logical reads usually mean high DB CPU and often latch: cache buffers chains waits. Heavy OLTP systems (e.g. Siebel) can reach tens or even hundreds of GB per second. |
Block changes | Unit: changes * blocks; describes the rate of data modification. |
Physical Reads | Unit: reads * blocks. In the example above, 5076 * 8k = 39 MB/s. Physical reads cost read I/O, showing up as IOPS and throughput; but cutting physical reads may mean spending more CPU. Good storage can deliver several GB/s of physical reads, e.g. Exadata. This figure includes both physical reads cache and physical reads direct. |
Physical writes | Unit: writes * blocks. Mostly DBWR writing datafiles, plus direct path writes. If DBWR writes fall behind for long, foreground sessions end up waiting on log file switch (checkpoint incomplete) because checkpoints cannot finish. This figure includes physical writes direct + physical writes from cache. |
User Calls | Unit: calls; the number of user calls made to the database. |
Parses | Parse count, soft parses plus hard parses. With badly tuned soft parsing this can, to exaggerate a little, approach the number of SQL executions per second, i.e. a 1:1 execute-to-parse ratio, whereas what we want is parse once, execute everywhere! |
Hard Parses | The root of much evil: cursor: pin S wait on X, library cache: mutex X, latch: row cache objects / shared pool and friends. Ideally fewer than 20 hard parses per second. |
W/A MB processed | Unit: MB; the amount of data processed in workareas (W/A = workarea). Read together with In-memory Sort%, sorts (disk) and the PGA Aggr sections. |
Logons | Number of logons; watch for logon storms, cross-checking with AUDIT data. A side effect of short-lived connections is that the session cursor cache is useless. |
Executes | Execution count; reflects execution frequency. |
Rollbacks | Rollback count; reflects rollback frequency, but the figure is not very precise - use it as a reference only. |
Transactions | Transactions per second, the database-level TPS; one useful metric for load tests or performance comparisons, meaningless in isolation. |
% Blocks changed per Read | The proportion of logical reads that change a block. If redo size, block changes and % blocks changed per read are all high, the system is doing a lot of insert/update/delete. % blocks changed per read = (block changes) / (logical reads) |
Recursive Call % | The proportion of recursive calls; Recursive Call % = (recursive calls) / (user calls) |
Rollback per transaction % | The transaction rollback ratio. Rollback per transaction % = (rollbacks) / (transactions) |
Rows per Sort | Average rows per sort; Rows per Sort = (sorts (rows)) / (sorts (disk) + sorts (memory)) |
Note that the Load Profile presents these workload metrics in two dimensions here: per second and per transaction.
Per Second: the delta within the snapshot pair divided by the number of seconds between the snapshots. For example, if V$SYSSTAT shows table scans (long tables) = 100 at snapshot A and 3700 at snapshot B, and A and B are one hour (3600 seconds) apart, then table scans (long tables) per second = (3700 - 100) / 3600 = 1.
Per second is the main dimension for examining the data; performance figures divorced from the time model are meaningless.
In the pre-statspack/AWR dark ages of tuning, many DBAs relied on the cumulative statistics in views such as V$SYSSTAT; by today's standards that was slash-and-burn farming.
Per transaction: the transaction-based dimension; compared with per second, the divisor changes from seconds to the number of transactions in the interval. Its main use is spotting changes in application behaviour: if this dimension shifts sharply between two AWR reports, for example redo size going from 1k to 10k per transaction, then the SQL business logic must have changed in some way. A query sketch for computing these deltas yourself follows below.
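A sketch of the per-second arithmetic against the repository (assuming :bid/:eid are the snapshot ids; the statistic 'table scans (long tables)' is simply the example used above):
select e.stat_name,
       e.value - b.value delta,
       (e.value - b.value)
         / ( (cast(s2.end_interval_time as date)
            - cast(s1.end_interval_time as date)) * 86400 ) per_second
  from dba_hist_sysstat b, dba_hist_sysstat e,
       dba_hist_snapshot s1, dba_hist_snapshot s2
 where b.stat_name = e.stat_name
   and b.stat_name = 'table scans (long tables)'
   and b.snap_id = :bid and e.snap_id = :eid
   and b.dbid = e.dbid and b.instance_number = e.instance_number
   and s1.snap_id = b.snap_id and s1.dbid = b.dbid and s1.instance_number = b.instance_number
   and s2.snap_id = e.snap_id and s2.dbid = e.dbid and s2.instance_number = e.instance_number;
Dividing the same delta by the delta of user commits + user rollbacks instead of the elapsed seconds gives the per-transaction figure.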
Note that these AWR metrics are not only for understanding Oracle database load in isolation and driving tuning work. For troubleshooting, e.g. hangs or crashes, you can simply compare the report from the problem period with one from a normal period; comparing the individual metrics will often pinpoint where the trouble lies.
SELECT VALUE FROM DBA_HIST_SYSSTAT WHERE SNAP_ID = :B4 AND DBID = :B3 AND INSTANCE_NUMBER = :B2 AND STAT_NAME in ( "db block changes","user calls","user rollbacks","user commits",redo size","physical reads direct","physical writes","parse count (hard)","parse count (total)","session logical reads","recursive calls","redo log space requests","redo entries","sorts (memory)","sorts (disk)","sorts (rows)","logons cumulative","parse time cpu","parse time elapsed","execute count","logons current","opened cursors current","DBWR fusion writes","gcs messages sent","ges messages sent","global enqueue gets sync","global enqueue get time","gc cr blocks received","gc cr block receive time","gc current blocks received","gc current block receive time","gc cr blocks served","gc cr block build time","gc cr block flush time","gc cr block send time","gc current blocks served","gc current block pin time","gc current block flush time","gc current block send time","physical reads","physical reads direct (lob)", SELECT TOTAL_WAITS FROM DBA_HIST_SYSTEM_EVENT WHERE SNAP_ID = :B4 AND DBID = :B3 AND INSTANCE_NUMBER = :B2 AND EVENT_NAME in ("gc buffer busy","buffer busy waits" SELECT VALUE FROM DBA_HIST_SYS_TIME_MODEL WHERE DBID = :B4 AND SNAP_ID = :B3 AND INSTANCE_NUMBER = :B2 AND STAT_NAME in ("DB CPU","sql execute elapsed time","DB time" SELECT VALUE FROM DBA_HIST_PARAMETER WHERE SNAP_ID = :B4 AND DBID = :B3 AND INSTANCE_NUMBER = :B2 AND PARAMETER_NAME in ("__db_cache_size","__shared_pool_size","sga_target","pga_aggregate_target","undo_management","db_block_size","log_buffer","timed_statistics","statistics_level" SELECT BYTES FROM DBA_HIST_SGASTAT WHERE SNAP_ID = :B4 AND DBID = :B3 AND INSTANCE_NUMBER = :B2 AND POOL IN ('shared pool', 'all pools') AND NAME in ("free memory", SELECT BYTES FROM DBA_HIST_SGASTAT WHERE SNAP_ID = :B4 AND DBID = :B3 AND INSTANCE_NUMBER = :B2 AND NAME = :B1 AND POOL IS NULL SELECT (E.BYTES_PROCESSED - B.BYTES_PROCESSED) FROM DBA_HIST_PGA_TARGET_ADVICE B, DBA_HIST_PGA_TARGET_ADVICE E WHERE B.DBID = :B4 AND B.SNAP_ID = :B3 AND B.INSTANCE_NUM BER = :B2 AND B.ADVICE_STATUS = 'ON' AND E.DBID = B.DBID AND E.SNAP_ID = :B1 AND E.INSTANCE_NUMBER = B.INSTANCE_NUMBER AND E.PGA_TARGET_FACTOR = 1 AND B.PGA_TARGET_FACT OR = 1 AND E.ADVICE_STATUS = 'ON' SELECT SUM(E.TOTAL_WAITS - NVL(B.TOTAL_WAITS, 0)) FROM DBA_HIST_SYSTEM_EVENT B, DBA_HIST_SYSTEM_EVENT E WHERE B.SNAP_ID(+) = :B4 AND E.SNAP_ID = :B3 AND B.DBID(+) = :B2 AND E.DBID = :B2 AND B.INSTANCE_NUMBER(+) = :B1 AND E.INSTANCE_NUMBER = :B1 AND B.EVENT_ID(+) = E.EVENT_ID AND (E.EVENT_NAME = 'latch free' OR E.EVENT_NAME LIKE 'latch :%') SELECT DECODE(B.TOTAL_SQL, 0, 0, 100*(1-B.SINGLE_USE_SQL/B.TOTAL_SQL)), DECODE(E.TOTAL_SQL, 0, 0, 100*(1-E.SINGLE_USE_SQL/E.TOTAL_SQL)), DECODE(B.TOTAL_SQL_MEM, 0, 0, 1 00*(1-B.SINGLE_USE_SQL_MEM/B.TOTAL_SQL_MEM)), DECODE(E.TOTAL_SQL_MEM, 0, 0, 100*(1-E.SINGLE_USE_SQL_MEM/E.TOTAL_SQL_MEM)) FROM DBA_HIST_SQL_SUMMARY B, DBA_HIST_SQL_SUMM ARY E WHERE B.SNAP_ID = :B4 AND E.SNAP_ID = :B3 AND B.INSTANCE_NUMBER = :B2 AND E.INSTANCE_NUMBER = :B2 AND B.DBID = :B1 AND E.DBID = :B1 SELECT EVENT, WAITS, TIME, DECODE(WAITS, NULL, TO_NUMBER(NULL), 0, TO_NUMBER(NULL), TIME/WAITS*1000) AVGWT, PCTWTT, WAIT_CLASS FROM (SELECT EVENT, WAITS, TIME, PCTWTT, WAIT_CLASS FROM (SELECT E.EVENT_NAME EVENT, E.TOTAL_WAITS - NVL(B.TOTAL_WAITS,0) WAITS, (E.TIME_WAITED_MICRO - NVL(B.TIME_WAITED_MICRO,0)) / 1000000 TIME, 100 * (E.TIME _WAITED_MICRO - NVL(B.TIME_WAITED_MICRO,0)) / :B1 PCTWTT, E.WAIT_CLASS WAIT_CLASS FROM DBA_HIST_SYSTEM_EVENT B, 
DBA_HIST_SYSTEM_EVENT E WHERE B.SNAP_ID(+) = :B5 AND E.S NAP_ID = :B4 AND B.DBID(+) = :B3 AND E.DBID = :B3 AND B.INSTANCE_NUMBER(+) = :B2 AND E.INSTANCE_NUMBER = :B2 AND B.EVENT_ID(+) = E.EVENT_ID AND E.TOTAL_WAITS > NVL(B.TO TAL_WAITS,0) AND E.WAIT_CLASS != 'Idle' UNION ALL SELECT 'CPU time' EVENT, TO_NUMBER(NULL) WAITS, :B6 /1000000 TIME, 100 * :B6 / :B1 PCTWTT, NULL WAIT_CLASS FROM DUAL W HERE :B6 > 0) ORDER BY TIME DESC, WAITS DESC) WHERE ROWNUM <= :B7 SELECT SUM(E.TIME_WAITED_MICRO - NVL(B.TIME_WAITED_MICRO,0)) FROM DBA_HIST_SYSTEM_EVENT B, DBA_HIST_SYSTEM_EVENT E WHERE B.SNAP_ID(+) = :B4 AND E.SNAP_ID = :B3 AND B.DB ID(+) = :B2 AND E.DBID = :B2 AND B.INSTANCE_NUMBER(+) = :B1 AND E.INSTANCE_NUMBER = :B1 AND B.EVENT_ID(+) = E.EVENT_ID AND E.WAIT_CLASS = 'User I/O' SELECT (E.ESTD_LC_TIME_SAVED - B.ESTD_LC_TIME_SAVED) FROM DBA_HIST_SHARED_POOL_ADVICE B, DBA_HIST_SHARED_POOL_ADVICE E WHERE B.DBID = :B3 AND B.INSTANCE_NUMBER = :B2 AN D B.SNAP_ID = :B4 AND E.DBID = :B3 AND E.INSTANCE_NUMBER = :B2 AND E.SNAP_ID = :B1 AND E.SHARED_POOL_SIZE_FACTOR = 1 AND B.SHARED_POOL_SIZE_FACTOR = 1
1-3 Instance Efficiency Percentages (Target 100%)
Instance Efficiency Percentages (Target 100%) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Buffer Nowait %: 99.97 Redo NoWait %: 100.00 Buffer Hit %: 97.43 In-memory Sort %: 100.00 Library Hit %: 99.88 Soft Parse %: 99.58 Execute to Parse %: 94.82 Latch Hit %: 99.95 Parse CPU to Parse Elapsd %: 1.75 % Non-Parse CPU: 99.85
All of the metrics above target 100%, i.e. the bigger the better; under a few bugs they can exceed 100% or go negative.
- above 80%: %Non-Parse CPU
- above 90%: Buffer Hit%, In-memory Sort%, Soft Parse%
- above 95%: Library Hit%, Redo NoWait%, Buffer Nowait%
- above 98%: Latch Hit%
1. Buffer Nowait %: the percentage of buffer gets (in compatible mode) that did not have to wait, i.e. how often a needed buffer could be accessed immediately. The incompatible case shows up in 9i as buffer busy waits; from 10g onwards buffer busy waits was split into two events, buffer busy waits and read by other session:
In 9i the total count in v$waitstat is roughly equal to the number of buffer busy waits: SQL> select sum(TOTAL_WAITS) from v$system_event where event='buffer busy waits'; SUM(TOTAL_WAITS) ---------------- 33070394 SQL> select sum(count) from v$waitstat; SUM(COUNT) ---------- 33069335 In 10g the total count in v$waitstat is roughly equal to the sum of buffer busy waits and read by other session: SQL> select sum(TOTAL_WAITS) from v$system_event where event='buffer busy waits' or event='read by other session'; SUM(TOTAL_WAITS) ---------------- 60675815 SQL> select sum(count) from v$waitstat; SUM(COUNT) ---------- 60423739
The formula for Buffer Nowait % is sum(v$waitstat.wait_count) / (the v$sysstat statistic session logical reads). For example, in an AWR:
Class | Waits | Total Wait Time (s) | Avg Time (ms) |
---|---|---|---|
data block | 24,543 | 2,267 | 92 |
undo header | 743 | 2 | 3 |
undo block | 1,116 | 0 | 0 |
1st level bmb | 35 | 0 | 0 |
session logical reads | 40,769,800 | 22,544.84 | 204.71 |
Buffer Nowait %: | 99.94 |
Buffer Nowait % = (40,769,800 - (24,543 + 743 + 1,116 + 35)) / 40,769,800 = 0.99935 = 99.94%
SELECT SUM(WAIT_COUNT) FROM DBA_HIST_WAITSTAT WHERE SNAP_ID = :B3 AND DBID = :B2 AND INSTANCE_NUMBER = :B1
2. Buffer Hit %: the classic of classics, the buffer cache hit ratio, reflecting the interplay between physical reads and cache hits; but even at 99% it does not prove that physical read waits are low.
An unreasonable db_cache_size, or an undersized db cache under automatic SGA management (ASMM) or automatic memory management (AMM), can trigger large numbers of db file sequential/scattered read waits. Maclean has seen a case where massive hard parsing under ASMM made the shared pool balloon while the db cache shrank correspondingly, ending up at only a few hundred MB; physical read waits that had never existed before suddenly flooded in.
Other indicators related to Buffer Hit % worth watching are the table scans (long tables) statistic, and the Buffer Pool Statistics and Buffer Pool Advisory sections (if you do not know where they are, just search any AWR report for those keywords).
Buffer Hit % has different formulas in different versions:
In 9i:
Buffer Hit Ratio = 1 - ((physical reads - physical reads direct - physical reads direct (lob)) / (db block gets + consistent gets - physical reads direct - physical reads direct (lob)))
From 10g onwards:
Buffer Hit Ratio = 1 - ('physical reads cache' / ('consistent gets from cache' + 'db block gets from cache'))
Note: the actual AWR report still appears to use the 9i formula, although the difference between the two algorithms has little effect on the resulting ratio.
For Buffer Hit % the absolute level of the ratio is not that interesting; what matters is how many misses there were. From the formulas above it is easy to work backwards to the number of physical reads that missed.
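A quick way to look at the raw miss counters instead of the ratio (a minimal sketch against the cumulative v$sysstat values; diff two samples, or use the DBA_HIST_SYSSTAT deltas for a snapshot interval):
select name, value
  from v$sysstat
 where name in ('physical reads cache',
                'consistent gets from cache',
                'db block gets from cache');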
db block gets, consistent gets and session logical reads relate as follows:
db block gets = db block gets direct + db block gets from cache
consistent gets = consistent gets from cache + consistent gets direct
consistent gets from cache = consistent gets - examination + the rest
consistent gets - examination ==> consistent gets that can be done without pinning the buffer, commonly used on index blocks, needing only a single latch get
session logical reads = db block gets + consistent gets
physical reads, physical reads cache, physical reads direct and physical reads direct (lob) relate as follows:
physical reads = physical reads cache + physical reads direct
This formula really says that there are two kinds of physical read:
- physical reads into the buffer cache, the common case: physical reads cache
- physical reads straight into the PGA, i.e. direct path reads: physical reads direct
physical reads | 8 | Total number of data blocks read from disk. This value can be greater than the value of “physical reads direct” plus “physical reads cache” as reads into process private buffers also included in this statistic. |
physical reads cache | 8 | Total number of data blocks read from disk into the buffer cache. This is a subset of “physical reads” statistic. |
physical reads direct | 8 | Number of reads directly from disk, bypassing the buffer cache. For example, in high bandwidth, data-intensive operations such as parallel query, reads of disk blocks bypass the buffer cache to maximize transfer rates and to prevent the premature aging of shared data blocks resident in the buffer cache. |
physical reads direct = physical reads direct (lob) + physical reads direct temporary tablespace + ordinary physical reads direct
This formula in turn splits direct path reads into three parts:
- physical reads direct (lob): direct path reads of LOBs
- physical reads direct temporary tablespace: direct path reads from temporary tablespaces
- ordinary physical reads direct: plain direct path reads, typically the automatic serial direct path reads of large tables from 11g onwards and the direct path reads driven by parallel query
physical writes direct = physical writes direct (lob) + physical writes direct temporary tablespace
DBWR checkpoint buffers written = DBWR thread checkpoint buffers written + DBWR tablespace checkpoint buffers written + DBWR PQ tablespace checkpoint buffers written + ...
3. Redo NoWait %: the percentage of redo entries generated without waiting. Contention on redo resources, e.g. redo log space requests, can make sessions wait while generating redo. The data comes from v$sysstat (redo log space requests / redo entries). Generally, from 10g onwards the log_buffer parameter rarely needs attention; what matters is whether log switches are very frequent. Redo logfiles that are too small, combined with a large SGA and frequent commits, can cause this. Consider increasing the redo logfile size (1 to 4 GB each, 7 to 10 groups is reasonable) and optimizing redo logfile and datafile I/O.
4. In-memory Sort %: because it does not count every kind of workarea operation, this metric is increasingly a leftover. It is the proportion of sorts completed entirely in memory; data comes from the v$sysstat statistics sorts (disk) and sorts (memory). In-memory Sort % = sorts (memory) / (sorts (disk) + sorts (memory))
5.
Library Hit %: the library cache hit ratio, i.e. how often a requested library cache object, for example a SQL cursor, is already in the library cache. Data comes from V$LIBRARYCACHE pins and pinhits. Reasonable value: > 95%. The ratio is 1 - (Σ(Pin Requests * Pct Miss) / Σ(Pin Requests)).
Making sure SQL statements use bind variables and that cursors can be shared is also an important factor here.
Library Cache Activity DB/Inst: G10R25/G10R25 Snaps: 2964-2965 -> "Pct Misses" should be very low http://www.askmaclean.com Get Pct Pin Pct Invali- Namespace Requests Miss Requests Miss Reloads dations --------------- ------------ ------ -------------- ------ ---------- -------- BODY 5 0.0 6 16.7 1 0 CLUSTER 10 0.0 26 0.0 0 0 SQL AREA 601,357 99.8 902,828 99.7 47 2 TABLE/PROCEDURE 83 9.6 601,443 0.0 48 0
GETS | NUMBER | Number of times a lock was requested for objects of this namespace |
GETHITS | NUMBER | Number of times an object’s handle was found in memory |
GETHITRATIO | NUMBER | Ratio of GETHITS to GETS |
PINS | NUMBER | Number of times a PIN was requested for objects of this namespace |
PINHITS | NUMBER | Number of times all of the metadata pieces of the library object were found in memory |
PINHITRATIO | NUMBER | Ratio of PINHITS to PINS |
RELOADS | NUMBER | Any PIN of an object that is not the first PIN performed since the object handle was created, and which requires loading the object from disk |
INVALIDATIONS | NUMBER | Total number of times objects in this namespace were marked invalid because a dependent object was modified |
SELECT SUM(PINS), SUM(PINHITS) FROM DBA_HIST_LIBRARYCACHE WHERE SNAP_ID = :B3 AND DBID = :B2 AND INSTANCE_NUMBER = :B1
6.
Soft Parse %: the soft parse ratio, a classic metric that needs little introduction; data comes from the v$sysstat statistics parse count (total) and parse count (hard). Reasonable value > 95%.
Soft Parse % is another important parse metric in AWR: the ratio of soft parses to total parses (soft + hard) in the snapshot interval. A low value points to heavy hard parsing, which burns more CPU and creates parse contention (cursor_sharing=FORCE can be considered in that situation). In theory we always want Soft Parse % close to 100%, but 100% soft parsing is not the ideal end state either: by setting session_cached_cursors and repeatedly reusing cursors, parsing can be made even lighter, the so-called "softer soft parse" based on the session cursor cache.
7.
Execute to Parse % reflects the execute-to-parse ratio; its formula is 1 - (parse/execute), with a target of 100%, i.e. executing without parsing at all. Data comes from the v$sysstat statistics parse count (total) and execute count.
In Oracle, parsing normally precedes execution, but with cursor sharing a statement can be parsed once and executed many times. Several scenarios are possible:
- hard-coded SQL: hard parse once, execute once; the execute-to-parse ratio is 1:1, so in theory Execute to Parse = 0 (terrible), and Soft Parse % is 0% as well
- bind variables but still a soft parse every time: soft parse once, execute once; better than the previous case, but the execute-to-parse ratio (parse here counts both soft and hard parses) is still 1:1, so in theory Execute to Parse = 0 (terrible), although Soft Parse % can be very high
- parse once, execute many times, via static SQL, dynamic binding, session_cached_cursors, held-open cursors and similar techniques: the execute-to-parse ratio is N:1, so Execute to Parse = 1 - (1/N); the more executions, the closer to 100%, which is what we love to see in OLTP systems
Put simply, Soft Parse % reflects the soft parse rate, and a soft parse is still a fairly expensive operation in Oracle. What we want is parse once, execute N times; if every execution needs a soft parse, then even with Soft Parse % = 100% parse time can still eat a large share of DB time.
Execute to Parse reflects the execute-to-parse ratio. If both Execute to Parse and Soft Parse % are very low, bind variables really are missing. If Soft Parse % is close to 99% but Execute to Parse is below 90%, the execute-to-parse ratio is the problem, and techniques such as static SQL, dynamic binding, session_cached_cursors and held-open cursors should be used to reduce soft parsing. The underlying counters can be checked directly, as sketched below.
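A minimal sketch of the raw counters behind these two ratios (cumulative values from v$sysstat; session cursor cache hits shows how much the session cursor cache is already helping):
select name, value
  from v$sysstat
 where name in ('parse count (total)',
                'parse count (hard)',
                'execute count',
                'session cursor cache hits');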
8.
Latch Hit %: the percentage of willing-to-wait latch gets that did not have to wait. Data comes from V$LATCH gets and misses.
Latch Name ---------------------------------------- Get Requests Misses Sleeps Spin Gets Sleep1 Sleep2 Sleep3 -------------- ----------- ----------- ---------- -------- -------- -------- shared pool 9,988,637 364 23 341 0 0 0 library cache 6,753,468 152 6 146 0 0 0 Memory Management Latch 369 1 1 0 0 0 0 qmn task queue latch 24 1 1 0 0 0 0
Latch Hit % = 1 - (Sum(misses) / Sum(gets))
For more on latches see the Latch Statistics section later in the AWR. Note that for a well-designed concurrent OLTP application, latches, enqueues and other concurrency controls should not be the main bottleneck, and piling on CPU and memory rarely does much for this kind of contention.
SELECT SUM(GETS), SUM(MISSES) FROM DBA_HIST_LATCH WHERE SNAP_ID = :B3 AND DBID = :B2 AND INSTANCE_NUMBER = :B1
9.
Parse CPU to Parse Elapsd: the ratio of parse CPU time to total parse elapsed time within the snapshot (Parse CPU Time / Parse Elapsed Time). A very low value means that only a small part of the parse time was spent actually computing on CPU, and most of it went to various non-idle waits (latch: shared pool, row cache lock and the like). Data comes from the v$sysstat statistics parse time cpu and parse time elapsed.
10.
%Non-Parse CPU: the proportion of CPU not spent parsing, computed as (DB CPU - Parse CPU) / DB CPU. If most of the CPU goes to parsing, the horsepower is not being spent where it matters. Data comes from the v$sysstat statistics parse time cpu and CPU used by this session.
1-4 Shared Pool Statistics
Shared Pool Statistics Begin End ------ ------ Memory Usage %: 84.64 79.67 % SQL with executions>1: 93.77 24.69 % Memory for SQL w/exec>1: 85.36 34.8
This section gives a rough assessment of SQL reuse and shared pool memory usage. Does the application share SQL? How much memory is taken up by SQL that runs only once, compared with shared SQL?
If % SQL with executions>1 here is below 90%, consider using the SQL in the link below to catch hard-coded statements that do not use bind variables.
Use FORCE_MATCHING_SIGNATURE to capture SQL that does not use bind variables
Memory Usage %: (current shared pool size - shared pool free memory) / current shared pool size, i.e. the shared pool space utilization; it shows utilization but says nothing about fragmentation.
% SQL with executions>1: the proportion of SQL statements that are reused, from the DBA_HIST_SQL_SUMMARY columns SINGLE_USE_SQL and TOTAL_SQL: 1 - SINGLE_USE_SQL / TOTAL_SQL
% Memory for SQL w/exec>1: the share of SQL memory held by statements executed more than once, from the DBA_HIST_SQL_SUMMARY columns SINGLE_USE_SQL_MEM and TOTAL_SQL_MEM: 1 - SINGLE_USE_SQL_MEM / TOTAL_SQL_MEM
==> The two metrics above also give a rough idea of shared pool fragmentation: lots of single-use SQL obviously tends to mean more shared pool memory fragments.
Low SQL reuse generally comes from hard-coded literals instead of proper bind variables. As a short-term workaround, when the code cannot be changed, ALTER SYSTEM SET CURSOR_SHARING=FORCE; can paper over the problem; in the long run the code should be fixed to use binds. From 11g Oracle has announced that the SIMILAR option of CURSOR_SHARING will be deprecated, and SIMILAR itself has caused plenty of problems, so CURSOR_SHARING=SIMILAR is never recommended.
If Memory Usage % stays very high, keep an eye on the shared pool free memory figure in the SGA breakdown later in the report; keeping at least 300-500 MB free is generally recommended to stay out of trouble. A quick check is sketched below.
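A minimal sketch for checking the live shared pool free memory outside the AWR:
select pool, name, round(bytes/1024/1024) mb
  from v$sgastat
 where pool = 'shared pool'
   and name = 'free memory';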
1-5 Top 5 Timed Events
Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time (s) (ms) Time Wait Class ------------------------------ ------------ ----------- ------ ------ ---------- gc buffer busy 79,083 73,024 923 65.4 Cluster enq: TX - row lock contention 35,068 17,123 488 15.3 Applicatio CPU time 12,205 10.9 gc current request 2,714 3,315 1221 3.0 Cluster gc cr multi block request 83,666 1,008 12 0.9 Cluster
Tuning based on the wait interface is today's mainstream; every one of these metrics matters!
Ratio-based tuning is like a statistics bureau report: landlord Zhang is worth 1,000,000, carpenter Li is worth 10,000, so the average household wealth is 505,000.
Wait-event-based tuning is like the trip logs of 100 cars on the road: so many minutes to get in, so many minutes at red lights, so many minutes stuck in traffic...
A rich set of wait events describes the system's performance bottlenecks in plenty of detail - something MySQL can only dream of...
Waits: the number of times the event was waited on; not applicable to DB CPU
Time(s): the total time spent in the event, in seconds; for DB CPU it is the total CPU time consumed by foreground processes, excluding wait on CPU queue
Avg Wait (ms): the average time per wait, i.e. Time/Waits, in ms; not applicable to DB CPU
% Total Call Time: the event's share of total call time
total call time = total CPU time + total wait time for non-idle events
% Total Call Time = time for each timed event / total call time
Wait Class: the wait class:
Concurrency,System I/O,User I/O,Administrative,Other,Configuration,Scheduler,Cluster,Application,Idle,Network,Commit
What is the CPU doing?
Logical reads? Parsing? Latch spinning? PL/SQL or function computation?
Is it good news when DB CPU / CPU time is the top event? Not necessarily!
Remember that DB CPU does not include wait on CPU queue!
SELECT e.event_name event, e.total_waits - NVL (b.total_waits, 0) waits, DECODE ( e.total_waits - NVL (b.total_waits, 0), 0, TO_NUMBER (NULL), DECODE ( e.total_timeouts - NVL (b.total_timeouts, 0), 0, TO_NUMBER (NULL), 100 * (e.total_timeouts - NVL (b.total_timeouts, 0)) / (e.total_waits - NVL (b.total_waits, 0)))) pctto, (e.time_waited_micro - NVL (b.time_waited_micro, 0)) / 1000000 time, DECODE ( (e.total_waits - NVL (b.total_waits, 0)), 0, TO_NUMBER (NULL), ( (e.time_waited_micro - NVL (b.time_waited_micro, 0)) / 1000) / (e.total_waits - NVL (b.total_waits, 0))) avgwt, DECODE (e.wait_class, 'Idle', 99, 0) idle FROM dba_hist_system_event b, dba_hist_system_event e WHERE b.snap_id(+) = &bid AND e.snap_id = &eid --AND b.dbid(+) = :dbid --AND e.dbid = :dbid AND b.instance_number(+) = 1 AND e.instance_number = 1 AND b.event_id(+) = e.event_id AND e.total_waits > NVL (b.total_waits, 0) AND e.event_name NOT IN ('smon timer', 'pmon timer', 'dispatcher timer', 'dispatcher listen timer', 'rdbms ipc message') ORDER BY idle, time DESC, waits DESC, event
A few common wait events
=========================>
db file sequential read: the average single wait (Avg wait time) for this event should be below 20 ms
"db file sequential read", the single-block read wait, is the most common physical I/O wait event. "Sequential" here means the block is read into contiguous memory space, not that the blocks being read are themselves contiguous. The event can occur in the following scenarios:
http://www.askmaclean.com/archives/db-file-sequential-read-wait-event.html
latch free: really means a latch could not be obtained and the process went into a latch sleep; see "A complete analysis of Oracle latch internals from 9i onwards".
enq: XX queue lock waits; the picture differs depending on which enqueue is involved:
- How well do you understand the Oracle enqueue lock mechanism?
- Oracle enqueue locks: Enqueue HW
- Oracle enqueue lock enq: US, Undo Segment
- The enq: TX - row lock/index contention and allocate ITL wait events
- The enq: TT - contention wait event
- Oracle enqueue lock enq: TS, Temporary Segment (also TableSpace)
- The enq: JI - contention wait event
- The enq: US - contention wait event
- The enq: TM - contention wait event
- The enq: RO fast object reuse wait event
- The enq: HW - contention wait event
free buffer waits: caused by not finding a free area in the buffer cache and having to wait for DBWR writes to complete.
- Usually due to:
- inefficient SQL
- an undersized buffer cache
- an overloaded DBWR
buffer busy waits / read by other session: these two events can generally be handled together, and both are worth monitoring. They can be triggered by the following combinations:
- select/select: read by other session, caused by having to read blocks from the datafiles into the buffer cache; possibly heavy logical/physical reads or a buffer cache that is too small
- select/update: buffer busy waits / read by other session, caused by having to rebuild a past image of the block from undo after it was updated (possibly accompanied by enq: CR contention); driven by heavy physical/logical reads
- update/update: buffer busy waits from updating the same block (not the same row - the same row gives enq: TX row lock contention); this is a hot-block problem
- insert/insert: buffer busy waits from freelist contention; switch the tablespace to ASSM or increase the freelists
write complete waits: generally occurs when DBWR is writing dirty buffers to the datafiles and another process needs to modify those buffers; usually an I/O performance problem or an overloaded DBWR.
Its wait timeout is 1 second.
control file parallel write: frequent controlfile updates produce lots of these waits, e.g. frequent log switches, frequent checkpoints, nologging operations causing frequent datafile changes, or a slow I/O subsystem.
log file sync: generally occurs while LGWR writes the redo log buffer out to the redo logs. If it is frequent, check:
- whether commits are too frequent
- I/O subsystem problems
- whether redo is being generated unnecessarily
- whether the redo log buffer is too large
2-1 Time Model Statistics
Time Model Statistics DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> Total time in database user-calls (DB Time): 883542.2s -> Statistics including the word "background" measure background process time, and so do not contribute to the DB time statistic -> Ordered by % or DB time desc, Statistic name Statistic Name Time (s) % of DB Time ------------------------------------------ ------------------ ------------ sql execute elapsed time 805,159.7 91.1 sequence load elapsed time 41,159.2 4.7 DB CPU 20,649.1 2.3 parse time elapsed 1,112.8 .1 hard parse elapsed time 995.2 .1 hard parse (sharing criteria) elapsed time 237.3 .0 hard parse (bind mismatch) elapsed time 227.6 .0 connection management call elapsed time 29.7 .0 PL/SQL execution elapsed time 9.2 .0 PL/SQL compilation elapsed time 6.6 .0 failed parse elapsed time 2.0 .0 repeated bind elapsed time 0.4 .0 DB time 883,542.2 background elapsed time 25,439.0 background cpu time 1,980.9 -------------------------------------------------------------
A few particularly useful timings in Time Model Statistics:
- parse time elapsed and hard parse elapsed time, taken together, show whether parsing is the main problem and, if so, whether soft or hard parsing dominates
- sequence load elapsed time: whether sequence contention is the focal point
- PL/SQL compilation elapsed time: time spent compiling PL/SQL objects
- note that PL/SQL execution elapsed time is purely the time spent in the PL/SQL interpreter; it excludes the time spent executing and parsing the SQL it contains
- connection management call elapsed time: time spent establishing and tearing down database sessions
- failed parse elapsed time: parse failures, for example due to ORA-4031
- hard parse (sharing criteria) elapsed time: hard parses caused by cursors that could not be shared
- hard parse (bind mismatch) elapsed time: hard parses caused by bind type or bind size mismatches
Note that these time model statistics nest inside one another, so it is perfectly normal for the Time Model Statistics to add up to more than 100%; the hierarchy and a delta query sketch follow below.
1) background elapsed time
    2) background cpu time
        3) RMAN cpu time (backup/restore)
1) DB time
    2) DB CPU
    2) connection management call elapsed time
    2) sequence load elapsed time
    2) sql execute elapsed time
    2) parse time elapsed
        3) hard parse elapsed time
            4) hard parse (sharing criteria) elapsed time
                5) hard parse (bind mismatch) elapsed time
        3) failed parse elapsed time
            4) failed parse (out of shared memory) elapsed time
    2) PL/SQL execution elapsed time
    2) inbound PL/SQL rpc elapsed time
    2) PL/SQL compilation elapsed time
    2) Java execution elapsed time
    2) repeated bind elapsed time
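A minimal sketch for pulling the same time-model deltas yourself for one snapshot pair (:bid/:eid are the begin/end snap ids; values are stored in microseconds):
select e.stat_name, round((e.value - b.value)/1e6, 1) seconds
  from dba_hist_sys_time_model b, dba_hist_sys_time_model e
 where b.stat_name = e.stat_name
   and b.snap_id = :bid and e.snap_id = :eid
   and b.dbid = e.dbid and b.instance_number = e.instance_number
 order by seconds desc;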
2-2 Foreground Wait Class
Foreground Wait Class -> s - second, ms - millisecond - 1000th of a second -> ordered by wait time desc, waits desc -> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0 -> Captured Time accounts for 102.7% of Total DB time 883,542.21 (s) -> Total FG Wait Time: 886,957.73 (s) DB CPU time: 20,649.06 (s) Avg %Time Total Wait wait Wait Class Waits -outs Time (s) (ms) %DB time -------------------- ---------------- ----- ---------------- -------- --------- Cluster 9,825,884 1 525,134 53 59.4 Concurrency 688,375 0 113,782 165 12.9 User I/O 34,405,042 0 76,695 2 8.7 Commit 172,193 0 62,776 365 7.1 Application 11,422 0 57,760 5057 6.5 Configuration 19,418 1 48,889 2518 5.5 DB CPU 20,649 2.3 Other 1,757,896 94 924 1 0.1 System I/O 30,165 0 598 20 0.1 Network 171,955,673 0 400 0 0.0 Administrative 2 100 0 101 0.0 ------------------------------------------------------------- select distinct wait_class from v$event_name; WAIT_CLASS ---------------------------------------------------------------- Concurrency User I/O System I/O Administrative Other Configuration Scheduler Cluster Application Queueing Idle Network Commit
- Wait Class: the class of the wait event; as the query above shows, events are grouped into a dozen or so classes. 10.2.0.5 has 916 wait events, 622 of which belong to the Other class.
- Waits: the number of waits for events of this class during the snapshot interval
- %Timeouts: the percentage of waits that timed out, i.e. timeouts / waits * 100 (%)
- Total Wait Time: the total time spent in events of this class, in seconds
- Avg Wait (ms): the average single wait time for events of this class, in ms; in practice this average is only really meaningful for the Commit, User I/O and System I/O classes, because the events inside the other classes differ too much for an average to mean much
- Waits/txn: waits of this class per transaction
Other class: events of this class usually point to an Oracle bug or to network/I/O problems; the usual advice is to contact Maclean.
Concurrency class: concurrency-contention wait events, typically latch: shared pool, latch: library cache, row cache lock, library cache pin/lock.
Cluster class: Real Application Clusters (RAC) wait events. Note that with the RAC option enabled, even a cluster running only one instance can still hit Cluster-class waits such as gc buffer busy.
System I/O: mostly I/O issued by background processes maintaining the database, e.g. control file parallel write, log file parallel write, db file parallel write.
User I/O: mostly I/O done by foreground processes (which is not to say background processes never see these events); typical examples are db file sequential/scattered read and direct path read.
Configuration: waits attributed to configuration, e.g. log file switch completion during log switches (redo logs too small or too few), or enq: SQ - contention on sequences (sequences created NOCACHE). Oracle regards these as caused by poor configuration, although in practice the configuration is not always the real culprit.
Application: waits attributed to the application, e.g. enq: TM - contention and enq: TX - row lock contention. Oracle blames application design, but these Application-class waits can be amplified by Concurrency, Cluster, System I/O and User I/O waits. For example, if a commit normally takes 1 ms, a row is locked for only 1 ms; if commits slow down, row locks are released more slowly and large numbers of enq: TX - row lock contention waits appear.
Commit: just log file sync, whose impact is so broad that it deserves an in-depth discussion of its own.
Network: network wait events such as SQL*Net more data to client and SQL*Net more data to dblink.
Idle: idle wait events. The most common are rdbms ipc message (a background process waiting for in-instance IPC, i.e. it only works when told there is work, otherwise it rests ==> Idle) and SQL*Net message from client (waiting for the next request over SQL*Net, nothing to do right now).
2-3 Foreground Wait Events
Foreground Wait Events Snaps: 70719-70723 -> s - second, ms - millisecond - 1000th of a second -> Only events with Total Wait Time (s) >= .001 are shown -> ordered by wait time desc, waits desc (idle events last) -> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0 Avg %Time Total Wait wait Waits % DB Event Waits -outs Time (s) (ms) /txn time -------------------------- ------------ ----- ---------- ------- -------- ------ gc buffer busy acquire 3,274,352 3 303,088 93 13.3 34.3 gc buffer busy release 387,673 2 128,114 330 1.6 14.5 enq: TX - index contention 193,918 0 97,375 502 0.8 11.0 cell single block physical 30,738,730 0 63,606 2 124.8 7.2 log file sync 172,193 0 62,776 365 0.7 7.1 gc current block busy 146,154 0 53,027 363 0.6 6.0 enq: TM - contention 1,060 0 47,228 44555 0.0 5.3 enq: SQ - contention 17,431 0 35,683 2047 0.1 4.0 gc cr block busy 105,204 0 33,746 321 0.4 3.8 buffer busy waits 279,721 0 12,646 45 1.1 1.4 enq: HW - contention 1,201 3 12,192 10151 0.0 1.4 enq: TX - row lock content 9,231 0 10,482 1135 0.0 1.2 cell multiblock physical r 247,903 0 6,547 26 1.0 .7
Foreground Wait Events; the data mainly comes from DBA_HIST_SYSTEM_EVENT
Event: the wait event name
Waits: the number of times the event was waited on during the snapshot interval
%Timeouts: every wait event has its own timeout, e.g. generally 3 seconds for buffer busy waits and 1 second for write complete waits; if a single wait reaches the timeout, the session re-enters the wait and another occurrence is counted
Total Wait Time: the total time spent in the event, in seconds
Avg wait (ms): the average time per wait, in milliseconds
Waits/Txn: waits for this event per transaction
2-4 Background Wait Events
Background Wait Events Snaps: 70719-70723 -> ordered by wait time desc, waits desc (idle events last) -> Only events with Total Wait Time (s) >= .001 are shown -> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0 Avg %Time Total Wait wait Waits % bg Event Waits -outs Time (s) (ms) /txn time -------------------------- ------------ ----- ---------- ------- -------- ------ db file parallel write 90,979 0 7,831 86 0.4 30.8 gcs log flush sync 4,756,076 6 4,714 1 19.3 18.5 enq: CF - contention 2,123 40 4,038 1902 0.0 15.9 control file sequential re 90,227 0 2,380 26 0.4 9.4 log file parallel write 108,383 0 1,723 16 0.4 6.8 control file parallel writ 4,812 0 988 205 0.0 3.9 Disk file operations I/O 26,216 0 731 28 0.1 2.9 flashback log file write 9,870 0 720 73 0.0 2.8 LNS wait on SENDREQ 202,747 0 600 3 0.8 2.4 ASM file metadata operatio 15,801 0 344 22 0.1 1.4 cell single block physical 39,283 0 341 9 0.2 1.3 LGWR-LNS wait on channel 183,443 18 203 1 0.7 .8 gc current block busy 122 0 132 1082 0.0 .5 gc buffer busy release 60 12 127 2113 0.0 .5 Parameter File I/O 592 0 116 195 0.0 .5 log file sequential read 1,804 0 104 58 0.0 .4
Background Wait Events; the data mainly comes from DBA_HIST_BG_EVENT_SUMMARY
Event: the wait event name
Waits: the number of times the event was waited on during the snapshot interval
%Timeouts: every wait event has its own timeout, e.g. generally 3 seconds for buffer busy waits and 1 second for write complete waits; if a single wait reaches the timeout, the session re-enters the wait and another occurrence is counted
Total Wait Time: the total time spent in the event, in seconds
Avg wait (ms): the average time per wait, in milliseconds
Waits/Txn: waits for this event per transaction
2-5 Operating System Statistics
Operating System Statistics Snaps: 70719-70723 TIME statistic values are diffed. All others display actual values. End Value is displayed if different -> ordered by statistic type (CPU Use, Virtual Memory, Hardware Config), Name Statistic Value End Value ------------------------- ---------------------- ---------------- BUSY_TIME 2,894,855 IDLE_TIME 5,568,240 IOWAIT_TIME 18,973 SYS_TIME 602,532 USER_TIME 2,090,082 LOAD 8 13 VM_IN_BYTES 0 VM_OUT_BYTES 0 PHYSICAL_MEMORY_BYTES 101,221,343,232 NUM_CPUS 24 NUM_CPU_CORES 12 NUM_CPU_SOCKETS 2 GLOBAL_RECEIVE_SIZE_MAX 4,194,304 GLOBAL_SEND_SIZE_MAX 2,097,152 TCP_RECEIVE_SIZE_DEFAULT 87,380 TCP_RECEIVE_SIZE_MAX 4,194,304 TCP_RECEIVE_SIZE_MIN 4,096 TCP_SEND_SIZE_DEFAULT 16,384 TCP_SEND_SIZE_MAX 4,194,304 TCP_SEND_SIZE_MIN 4,096 -------------------------------------------------------------
Operating System Statistics
The data comes from V$OSSTAT / DBA_HIST_OSSTAT; all TIME-related figures are in hundredths of a second.
Statistic | Description |
NUM_CPU_SOCKETS | Number of physical CPU sockets |
NUM_CPU_CORES | Number of CPU cores |
NUM_CPUS | Number of logical CPUs |
SYS_TIME | CPU time consumed in kernel mode, in hundredths of a second |
USER_TIME | CPU time consumed in user mode, in hundredths of a second |
BUSY_TIME | BUSY_TIME = SYS_TIME + USER_TIME, total CPU time consumed, in hundredths of a second |
AVG_BUSY_TIME | AVG_BUSY_TIME = BUSY_TIME / NUM_CPUS |
IDLE_TIME | Idle CPU time, in hundredths of a second |
Total CPU time available | BUSY_TIME + IDLE_TIME = ELAPSED_TIME * CPU_COUNT |
OS_CPU_WAIT_TIME | Time processes spent waiting to be scheduled by the OS, i.e. CPU queuing |
VM_IN_BYTES | Bytes paged/swapped in |
VM_OUT_BYTES | Bytes paged/swapped out; inaccurate in some versions, e.g. Bug 11712010 Abstract: VIRTUAL MEMORY PAGING ON 11.2.0.2 DATABASES, so treat it as indicative only |
IOWAIT_TIME | Time all CPUs spent waiting for I/O completion, in hundredths of a second |
RSRC_MGR_CPU_WAIT_TIME | When Resource Manager is throttling CPU it parks sessions on an internal run queue, so that their consumer group does not consume more CPU than the resource plan allows. RSRC_MGR_CPU_WAIT_TIME is the time spent waiting on that internal queue; no CPU is consumed while waiting. |
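The same counters can be sampled live from v$osstat (a minimal sketch; diff two samples to get interval values, just as AWR does):
select stat_name, value
  from v$osstat
 where stat_name in ('NUM_CPUS', 'LOAD', 'BUSY_TIME', 'USER_TIME',
                     'SYS_TIME', 'IDLE_TIME', 'IOWAIT_TIME');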
2-6 Service Statistics
Service Statistics Snaps: 70719-70723 -> ordered by DB Time Physical Logical Service Name DB Time (s) DB CPU (s) Reads (K) Reads (K) ---------------------------- ------------ ------------ ------------ ------------ itms-contentmasterdb-prod 897,099 20,618 35,668 1,958,580 SYS$USERS 4,312 189 5,957 13,333 itmscmp 1,941 121 14,949 18,187 itscmp 331 20 114 218 itscmp_dgmgrl 121 1 0 0 SYS$BACKGROUND 0 0 142 30,022 ITSCMP1_PR 0 0 0 0 its-reference-prod 0 0 0 0 itscmpXDB 0 0 0 0
Time model figures plus physical and logical reads grouped by Service Name; part of the data comes from WRH$_SERVICE_NAME
Service Name: the service (v$services); SYS$BACKGROUND represents the background processes, and SYS$USERS generally means user logons not attached to a specific service
DB Time (s): DB time consumed under this service name, in seconds
DB CPU (s): DB CPU consumed under this service name, in seconds
Physical Reads: physical reads done under this service name
Logical Reads: logical reads done under this service name
2-7 Service Wait Class Stats
Service Wait Class Stats Snaps: 70719-70723 -> Wait Class info for services in the Service Statistics section. -> Total Waits and Time Waited displayed for the following wait classes: User I/O, Concurrency, Administrative, Network -> Time Waited (Wt Time) in seconds Service Name ---------------------------------------------------------------- User I/O User I/O Concurcy Concurcy Admin Admin Network Network Total Wts Wt Time Total Wts Wt Time Total Wts Wt Time Total Wts Wt Time --------- --------- --------- --------- --------- --------- --------- --------- itms-contentmasterdb-prod 33321670 71443 678373 113759 0 0 1.718E+08 127 SYS$USERS 173233 3656 6738 30 2 0 72674 3 itmscmp 676773 1319 1831 0 0 0 2216 0 itscmp 219577 236 1093 0 0 0 18112 0 itscmp_dgmgrl 34 0 8 0 0 0 9 0 SYS$BACKGROUND 71940 1300 320677 56 0 0 442252 872 -------------------------------------------------------------
- User I/O Total Wts: total number of User I/O class waits under this service name
- User I/O Wt Time: total User I/O class wait time under this service name (check the unit in the section header; the sample above reports it in seconds)
- Concurcy Total Wts: total number of Concurrency class waits under this service name
- Concurcy Wt Time: total Concurrency class wait time under this service name (same unit as above)
- Admin Total Wts: total number of Administrative class waits under this service name
- Admin Wt Time: total Administrative class wait time under this service name (same unit as above)
- Network Total Wts: total number of Network class waits under this service name
- Network Wt Time: total Network class wait time under this service name (same unit as above)
2-8 Host CPU
Host CPU (CPUs: 24 Cores: 12 Sockets: 2) ~~~~~~~~ Load Average Begin End %User %System %WIO %Idle --------- --------- --------- --------- --------- --------- 8.41 12.84 24.7 7.1 0.2 65.8
The Load Average begin/end values roughly correspond to the CPU run queue length. In the example above the average CPU load rose between the start and the end of the snapshot; this matches LOAD in section 2-5 Operating System Statistics.
%User + %System => total CPU utilization, here 31.8%
Elapsed Time * NUM_CPUS * CPU utilization = 60.23 (mins) * 24 * 31.8% = 459.68 mins = Busy Time
2-8 Instance CPU
Instance CPU ~~~~~~~~~~~~ % of total CPU for Instance: 26.7 % of busy CPU for Instance: 78.2 %DB time waiting for CPU - Resource Mgr: 0.0
%Total CPU: the share of the host's total CPU used by this instance (% of total CPU for Instance)
%Busy CPU: the share of the busy CPU that was used by this instance (% of busy CPU for Instance)
For example, with 4 logical CPUs of which 3 are fully busy and 1 of those 3 is fully used by this instance: %Total CPU = 1/4 = 25%, while %Busy CPU = 1/3 = 33%
When CPU usage is high, %Busy CPU tells you whether the CPU is being burnt by this instance or by other processes on the host
% of busy CPU for Instance = (DB CPU + background cpu time) / (BUSY_TIME / 100) = (20,649.1 + 1,980.9) / (2,894,855 / 100) = 78.17%
% of total CPU for Instance = (DB CPU + background cpu time) / ((BUSY_TIME + IDLE_TIME) / 100) = (20,649.1 + 1,980.9) / ((2,894,855 + 5,568,240) / 100) = 26.73%
%DB time waiting for CPU (Resource Manager) = (RSRC_MGR_CPU_WAIT_TIME / 100) / DB TIME
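A sketch of the same arithmetic straight from the repository for one snapshot pair (assuming :bid/:eid; DB CPU and background cpu time are stored in microseconds in the time model, BUSY_TIME and IDLE_TIME in centiseconds):
select round( 100 * cpu.db_cpu_sec / (os.busy_time/100), 1 )                  pct_of_busy_cpu,
       round( 100 * cpu.db_cpu_sec / ((os.busy_time + os.idle_time)/100), 1 ) pct_of_total_cpu
  from ( select sum(e.value - b.value)/1e6 db_cpu_sec
           from dba_hist_sys_time_model b, dba_hist_sys_time_model e
          where b.stat_name = e.stat_name
            and b.stat_name in ('DB CPU', 'background cpu time')
            and b.snap_id = :bid and e.snap_id = :eid
            and b.dbid = e.dbid and b.instance_number = e.instance_number ) cpu,
       ( select max(decode(e.stat_name, 'BUSY_TIME', e.value - b.value)) busy_time,
                max(decode(e.stat_name, 'IDLE_TIME', e.value - b.value)) idle_time
           from dba_hist_osstat b, dba_hist_osstat e
          where b.stat_name = e.stat_name
            and b.stat_name in ('BUSY_TIME', 'IDLE_TIME')
            and b.snap_id = :bid and e.snap_id = :eid
            and b.dbid = e.dbid and b.instance_number = e.instance_number ) os;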
3 TOP SQL
The TOP SQL data comes partly from dba_hist_sqlstat
3-1 SQL ordered by Elapsed Time: TOP SQL ranked by elapsed time
SQL ordered by Elapsed Time Snaps: 70719-70723 -> Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code. -> % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100 -> %Total - Elapsed Time as a percentage of Total DB time -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Captured SQL account for 53.9% of Total DB Time (s): 883,542 -> Captured PL/SQL account for 0.5% of Total DB Time (s): 883,542 Elapsed Elapsed Time Time (s) Executions per Exec (s) %Total %CPU %IO SQL Id ---------------- -------------- ------------- ------ ------ ------ ------------- 181,411.3 38,848 4.67 20.5 .0 .1 g0yc9szpuu068
Note that for PL/SQL, the SQL Statistics report not only the PL/SQL block itself but also the SQL statements called inside it. In the example described here, a top PL/SQL block ran for 448 s, and most of those 448 s were spent in a single SQL statement inside it that executed 500 times.
So both the top PL/SQL and the top SQL make the list: one executed once for 448 s, the other executed 500 times for 448 s. In such cases the elapsed times can add up to more than 100% of Elapsed Time, which is normal.
Any SQL that towers above the rest deserves a closer look - run @?/rdbms/admin/awrsqrpt against it!
Elapsed Time (s): the cumulative time spent executing this SQL
Executions: the total number of executions of this SQL within the snapshot interval. Note that an execution that has not finished within the interval is not counted, so a top SQL with executions = 0 is very likely a long-running statement that had not completed yet and deserves special attention.
Elapsed Time per Exec (s): the average time per execution. For OLTP-style SELECT/INSERT/UPDATE/DELETE the average single execution should be very short, e.g. 0.1 s or less, to meet business needs; if such lightweight OLTP operations take several seconds each, the business cannot be served. For example, withdrawing money at an ATM is not just a simple UPDATE of your account: front-end systems such as risk control also record the transaction, so a single withdrawal may execute tens or even hundreds of OLTP statements, all of which should be very fast. If those operations slow down, transactions pile up, the load climbs and DB time shoots up. For an OLTP database with stable execution plans this performance should be rock solid, but once some factor changes, e.g. storage becomes noticeably slower or heavy memory paging appears, those transaction operations can become several to dozens of times slower, making the transactional system effectively unusable in the short term.
For maintenance operations such as loading or purging data, large batch jobs or reports, a higher Elapsed Time per Exec is normal.
%Total: this SQL's elapsed time as a share of total DB time, i.e. (SQL Elapsed Time / Total DB Time)
%CPU: this SQL's CPU time as a share of its elapsed time, i.e. (SQL CPU Time / SQL Elapsed Time); it shows whether the statement is CPU-bound
%IO: this SQL's I/O time as a share of its elapsed time, i.e. (SQL I/O Time / SQL Elapsed Time); it shows whether the statement is I/O-bound
SQL Id: the SQL_ID computed from the SQL text. Different SQL texts always get different SQL_IDs, and in 10g and 11g the same text maps to the same SQL_ID across databases; 12c changed the way SQL_ID is computed.
Captured SQL account for 53.9% of Total DB Time (s): for applications that do not use bind variables the Top SQL list can be misleading, so always check this capture percentage.
3-2 SQL ordered by CPU Time
SQL ordered by CPU Time Snaps: 70719-70723 -> Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code. -> %Total - CPU Time as a percentage of Total DB CPU -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Captured SQL account for 34.9% of Total CPU Time (s): 20,649 -> Captured PL/SQL account for 0.5% of Total CPU Time (s): 20,649 CPU CPU per Elapsed Time (s) Executions Exec (s) %Total Time (s) %CPU %IO SQL Id ---------- ------------ ---------- ------ ---------- ------ ------ ------------- 1,545.0 1,864,424 0.00 7.5 4,687.8 33.0 65.7 8g6a701j83c8q Module: MZIndexer SELECT t0.BOOLEAN_VALUE, t0.CLASS_CODE, t0.CREATED, t0.END_DATE, t0.PRODUCT_ATTR IBUTE_ID, t0.LAST_MODIFIED, t0.OVERRIDE_FLAG, t0.PRICE, t0.PRODUCT_ATTRIBUTE_TYP E_ID, t0.PRODUCT_ID, t0.PRODUCT_PUB_RELEASE_TYPE_ID, t0.PRODUCT_VOD_TYPE_ID, t0. SAP_PRODUCT_ID, t0.START_DATE, t0.STRING_VALUE FROM mz_product_attribute t0 WHER
CPU Time: the total CPU time consumed by this SQL's executions within the snapshot interval, in seconds
Executions: the total number of executions of this SQL within the snapshot interval
CPU per Exec (s): the average CPU time per execution, i.e. (SQL CPU Time / SQL Executions)
%Total: this SQL's cumulative CPU time as a share of DB CPU for the interval, i.e. (SQL CPU Time / Total DB CPU)
%CPU: this SQL's CPU time as a share of its elapsed time, i.e. (SQL CPU Time / SQL Elapsed Time); it shows whether the statement is CPU-bound
%IO: this SQL's I/O time as a share of its elapsed time, i.e. (SQL I/O Time / SQL Elapsed Time); it shows whether the statement is I/O-bound
3-3 Buffer Gets SQL ordered by Gets
SQL ordered by Gets DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code. -> %Total - Buffer Gets as a percentage of Total Buffer Gets -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Total Buffer Gets: 2,021,476,421 -> Captured SQL account for 68.2% of Total Buffer Gets Elapsed Gets Executions per Exec %Total Time (s) %CPU %IO SQL Id ----------- ----------- ------------ ------ ---------- ------ ------ ----------- 4.61155E+08 1,864,424 247.3 22.8 4,687.8 33.0 65.7 8g6a701j83c
Note that buffer gets (logical reads) are a major consumer of CPU time, but not the only one. In most cases the top SQL and their ordering in SQL ordered by CPU Time and SQL ordered by Gets are identical, meaning the biggest buffer-gets consumers are also the biggest CPU consumers; to lower system CPU usage, tune those SQL to reduce their buffer gets.
But that is not true 100% of the time: CPU time is also consumed by function computation, PL/SQL control flow, latch/mutex spinning and so on, so the two sections can list completely different top SQL. You have to investigate what is really driving the high CPU before deciding how to fix it.
Buffer Gets: the total buffer gets by this SQL within the snapshot interval, including consistent reads and current reads
Executions: the total number of executions of this SQL within the snapshot interval
Gets per Exec: the average buffer gets per execution; for transactional operations this is generally below 2000
%Total: this SQL's cumulative buffer gets as a share of total buffer gets, i.e. (SQL buffer gets / DB total buffer gets)
3-4 Physical Reads SQL ordered by Reads
SQL ordered by Reads DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> %Total - Physical Reads as a percentage of Total Disk Reads -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Total Disk Reads: 56,839,035 -> Captured SQL account for 34.0% of Total Physical Reads Elapsed Reads Executions per Exec %Total Time (s) %CPU %IO SQL Id ----------- ----------- ---------- ------ ---------- ------ ------ ------------- 9,006,163 1 9.0062E+06 15.8 720.9 5.9 80.9 4g36tmp70h185
Physical Reads: the total physical reads by this SQL
Executions: the total number of executions of this SQL within the snapshot interval
Reads per Exec: physical reads per execution, i.e. (SQL Physical Reads / Executions); for OLTP transaction operations this generally stays below 100
%Total: this SQL's cumulative physical reads as a share of the total physical reads for the interval, i.e. (SQL physical reads / Total DB physical reads)
3-5 Executions SQL ordered by Executions
SQL ordered by Executions Snaps: 70719-70723 -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Total Executions: 48,078,147 -> Captured SQL account for 50.4% of Total Elapsed Executions Rows Processed Rows per Exec Time (s) %CPU %IO SQL Id ------------ --------------- -------------- ---------- ------ ------ ----------- 6,327,963 11,249,645 1.8 590.5 47.8 52.7 1avv7759j8r
Ranking by execution count is also an important reference when comparing performance reports: if the execution count of a top SQL has grown noticeably, a performance problem is hardly a surprise. Of course the most frequently executed SQL is not necessarily the one with the biggest impact on performance.
Executions: the total number of executions of this SQL within the snapshot interval
Rows Processed: the total number of rows processed by this SQL within the snapshot interval
Rows per Exec: the average rows processed per execution; very handy when diagnosing SQL performance problems caused by data issues
3-6 Parse Calls SQL ordered by Parse Calls
SQL ordered by Parse Calls Snaps: 70719-70723 -> Total Parse Calls: 2,160,124 -> Captured SQL account for 58.3% of Total % Total Parse Calls Executions Parses SQL Id ------------ ------------ --------- ------------- 496,475 577,357 22.98 d07gaa3wntdff
Parse Calls: the number of parse calls; like the Parses figure in the Load Profile above, it includes both soft and hard parses
Executions: the total number of executions of this SQL within the snapshot interval
%Total Parses: this SQL's parse calls as a share of the database's total parse calls for the interval, i.e. (SQL Parse Calls / Total DB Parse Calls)
3-7 SQL ordered by Sharable Memory
SQL ordered by Sharable Memory Snaps: 70719-70723 -> Only Statements with Sharable Memory greater than 1048576 are displayed Sharable Mem (b) Executions % Total SQL Id ---------------- ------------ -------- ------------- 8,468,359 39 0.08 au89sasqfb2yn Module: MZContentBridge SELECT t0.ASPECT_RATIO, t0.CREATED, t0.FILE_EXTENSION, t0.HEIGHT, t0.VIDEO_FILE_ DIMENSIONS_ID, t0.LAST_MODIFIED, t0.NAME, t0.WIDTH FROM MZ_VIDEO_FILE_DIMENSIONS t0 WHERE (t0.HEIGHT = :1 AND t0.WIDTH = :2 )
SQL ordered by Sharable Memory: generally only SQL objects with more than 1 MB of sharable memory are listed (Only Statements with Sharable Memory greater than 1048576 are displayed); the data comes from DBA_HIST_SQLSTAT.SHARABLE_MEM
Sharable Mem (b): the shared memory occupied by the SQL object
Executions: the total number of executions of this SQL within the snapshot interval
%Total: the SQL object's shared memory as a share of total shared memory
3-8 SQL ordered by Version Count
Version Count: in Oracle a single SQL statement can have several execution plan versions, known as child cursors, while the SQL text itself is the parent cursor. One parent cursor maps to many child cursors, and a new child cursor is created whenever an execution cannot share an existing one, for a wide variety of reasons. For example, changing an optimizer parameter in the session, say optimizer_index_cost_adj from 100 to 99, changes the session's optimizer environment (optimizer env) relative to the one the existing child cursor was built under, so a new child cursor has to be created, e.g.:
SQL> create table emp as select * from scott.emp; Table created. SQL> select * from emp where empno=1; no rows selected SQL> select /*+ MACLEAN */ * from emp where empno=1; no rows selected SQL> select SQL_ID,version_count from V$SQLAREA WHERE SQL_TEXT like '%MACLEAN%' and SQL_TEXT not like '%like%'; SQL_ID VERSION_COUNT ------------- ------------- bxnnm7z1qmg26 1 SQL> select count(*) from v$SQL where SQL_ID='bxnnm7z1qmg26'; COUNT(*) ---------- 1 SQL> alter session set optimizer_index_cost_adj=99; Session altered. SQL> select /*+ MACLEAN */ * from emp where empno=1; no rows selected SQL> select SQL_ID,version_count from V$SQLAREA WHERE SQL_TEXT like '%MACLEAN%' and SQL_TEXT not like '%like%'; SQL_ID VERSION_COUNT ------------- ------------- bxnnm7z1qmg26 2 SQL> select count(*) from v$SQL where SQL_ID='bxnnm7z1qmg26'; COUNT(*) ---------- 2 SQL> select child_number ,OPTIMIZER_ENV_HASH_VALUE,PLAN_HASH_VALUE from v$SQL where SQL_ID='bxnnm7z1qmg26'; CHILD_NUMBER OPTIMIZER_ENV_HASH_VALUE PLAN_HASH_VALUE ------------ ------------------------ --------------- 0 3704128740 3956160932 1 3636478958 3956160932
The demo above shows that setting optimizer_index_cost_adj=99 changed the CBO's optimizer environment, visible as different OPTIMIZER_ENV_HASH_VALUE values, and two child cursors were created; but both child cursors have PLAN_HASH_VALUE 3956160932, i.e. although they are different child cursors they contain exactly the same execution plan. So keep in mind that any change in the optimizer environment (see V$SQL_SHARED_CURSOR) and various related bugs can prevent child cursors from being shared, and those unshared child cursors may still contain identical plans, which is usually a waste.
Note that V$SQLAREA.VERSION_COUNT is not necessarily equal to select count(*) from V$SQL where SQL_ID='...': the number of child cursors it reports need not match the number still present in the instance, because the shared pool aging algorithm and various cursor-invalidating events make flushed child cursors a very common sight. V$SQLAREA.VERSION_COUNT is just a counter telling us how many child cursors have ever been created; it does not guarantee they are all still in the shared pool.
You can also analyse this through the child_number column of V$SQL: gaps in child_number show that some children have been flushed.
The impact of too many child cursors: when there are too many (say more than 3000), a process has to scan a long child cursor list to find a usable child cursor, leading to cursor-sharing performance problems with large numbers of cursor: mutex S and library cache lock waits.
On controlling the number of child cursors, see "11gR2 cursor sharing new features and the problems they bring, plus _cursor_features_enabled, _cursor_obsolete_threshold and event 106001".
Executions: the total number of executions of this SQL within the snapshot interval
Hash Value: the hash value of the shared SQL
Only Statements with Version Count greater than 20 are displayed: note that this section only lists statements with a version count above 20. A sketch for digging into a high version count follows below.
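A minimal sketch for investigating a high version count (the Y/N mismatch columns of v$sql_shared_cursor vary by version; the three picked here are common ones):
select sql_id, version_count
  from v$sqlarea
 where version_count > 20
 order by version_count desc;
select child_number, optimizer_mismatch, bind_mismatch, optimizer_mode_mismatch
  from v$sql_shared_cursor
 where sql_id = :sql_id
 order by child_number;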
3-9 Cluster Wait Time SQL ordered by Cluster Wait Time
SQL ordered by Cluster Wait Time DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> %Total - Cluster Time as a percentage of Total Cluster Wait Time -> %Clu - Cluster Time as a percentage of Elapsed Time -> %CPU - CPU Time as a percentage of Elapsed Time -> %IO - User I/O Time as a percentage of Elapsed Time -> Only SQL with Cluster Wait Time > .005 seconds is reported -> Total Cluster Wait Time (s): 525,480 -> Captured SQL account for 57.2% of Total Cluster Elapsed Wait Time (s) Executions %Total Time(s) %Clu %CPU %IO SQL Id -------------- ------------ ------ ---------- ------ ------ ------ ------------- 132,639.3 38,848 25.2 181,411.3 73.1 .0 .1 g0yc9szpuu068
Only SQL with Cluster Wait Time > .005 seconds is reported: this section only lists SQL whose cluster wait time exceeds 0.005 s
The data mainly comes from DBA_HIST_SQLSTAT.CLWAIT_DELTA, the delta value of cluster wait time
Cluster Wait Time: the total time this SQL spent in cluster waits during its executions, in seconds; in other words, whenever an execution hits gc buffer busy, gc cr multi block request or similar Cluster-class waits, that time is all counted as Cluster Wait Time
Executions: the total number of executions of this SQL within the snapshot interval
%Total: this SQL's cluster wait time as a share of total cluster wait time, i.e. (SQL cluster wait time / DB total cluster wait time)
%Clu: this SQL's cluster wait time as a share of its own elapsed time, i.e. (SQL cluster wait time / SQL elapsed time); it shows whether the statement is sensitive to cluster waits
%CPU: this SQL's CPU time as a share of its elapsed time, i.e. (SQL CPU Time / SQL Elapsed Time); it shows whether the statement is CPU-bound
%IO: this SQL's I/O time as a share of its elapsed time, i.e. (SQL I/O Time / SQL Elapsed Time); it shows whether the statement is I/O-bound
4 Instance Activity Stats
Instance Activity Stats DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> Ordered by statistic name Statistic Total per Second per Trans -------------------------------- ------------------ -------------- ------------- Batched IO (bound) vector count 450,449 124.6 1.8 Batched IO (full) vector count 5,485 1.5 0.0 Batched IO (space) vector count 1,467 0.4 0.0 Batched IO block miss count 4,119,070 1,139.7 16.7 Batched IO buffer defrag count 39,710 11.0 0.2 Batched IO double miss count 297,357 82.3 1.2 Batched IO same unit count 1,710,492 473.3 7.0 Batched IO single block count 329,521 91.2 1.3 Batched IO slow jump count 47,104 13.0 0.2 Batched IO vector block count 2,069,852 572.7 8.4 Batched IO vector read count 262,161 72.5 1.1 Block Cleanout Optim referenced 37,574 10.4 0.2 CCursor + sql area evicted 1,457 0.4 0.0 ...............
The Instance Activity Stats data comes from DBA_HIST_SYSSTAT, which in turn is sourced from V$SYSSTAT.
Each statistic here reflects the activity level of one kind of database behaviour; for example, redo size is the amount of redo generated, sorts (disk) is the number of disk sorts, and table scans (direct read) is the number of table scans done via direct path reads.
Although these statistics only come in three dimensions, Total, per Second and per Trans, they are extremely useful for diagnosis.
A few examples:
1. When direct path read is a top wait event, you need to work out whether it comes from direct reads of ordinary heap tables or from heavy LOB reads. Use table scans (direct read), table scans (long tables), physical reads direct, physical reads direct (lob) and physical reads direct temporary. If physical reads direct is far greater than physical reads direct (lob) + physical reads direct temporary, and table scans (direct read) / table scans (long tables) are sizeable (note that these two count table scans, unlike the physical reads statistics above whose unit is blocks * reads), then the direct path reads come from large table scans.
2. When enq: TX - index contention is a top wait event, analyse root node splits, branch node splits, leaf node 90-10 splits, leaf node splits and failed probes on index block rec; see the note "A summary of Oracle index block split statistics".
3. When I/O wait events such as db file sequential/scattered read appear in the Top 5, we can derive system I/O throughput and IOPS from the AWR:
physical read bytes covers only physical reads driven by application activity (Total size in bytes of all disk reads by application activity (and not other instance activity) only), whereas physical read total bytes also includes the physical read bytes of RMAN backup/restore and background maintenance tasks, so when studying I/O load we generally refer to physical read total bytes; the same relationship holds for all four pairs below:
physical read bytes | physical read total bytes | physical read throughput per second |
physical read IO requests | physical read total IO requests | physical read IOPS |
physical write bytes | physical write total bytes | physical write throughput per second |
physical write IO requests | physical write total IO requests | physical write IOPS |
Total physical throughput per second = physical read total bytes + physical write total bytes
Total physical IOPS = physical read total IO requests + physical write total IO requests
So the main I/O measures, throughput, IOPS and latency, can all be obtained from the AWR; latency comes from the Avg Wait time of the User I/O wait class, or from the IOStat by Function summary introduced in 11g. A delta query sketch follows below.
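A minimal sketch of the throughput/IOPS deltas for one snapshot pair (:bid/:eid; divide by the interval seconds to get per-second figures):
select e.stat_name, e.value - b.value delta
  from dba_hist_sysstat b, dba_hist_sysstat e
 where b.stat_name = e.stat_name
   and b.stat_name in ('physical read total bytes', 'physical write total bytes',
                       'physical read total IO requests', 'physical write total IO requests')
   and b.snap_id = :bid and e.snap_id = :eid
   and b.dbid = e.dbid and b.instance_number = e.instance_number;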
Instance Activity Stats contains a huge number of statistics, and no document describes them all in full detail, not even inside Oracle (or at least Maclean has not found one). It is easy for developers to add a new activity statistic, nothing like the strict process required to introduce a new background process; a statistic can slip in with a one-off patch, because at the source level it is just a counter.
For the more basic statistics, see the Statistics Descriptions appendix of the official documentation.
For the deeper ones, such as Batched IO (space) vector count, introduced with newer features, there is usually no detailed material; you have to read the relevant source modules to work out what they mean, and Oracle is slow to do that, so no complete list exists. If you have questions about any of them, post at t.askmaclean.com.
Instance Activity Stats - Absolute Values Snaps: 7071 -> Statistics with absolute values (should not be diffed) Statistic Begin Value End Value -------------------------------- --------------- --------------- session pga memory max 1.157882826E+12 1.154290304E+12 session cursor cache count 157,042,373 157,083,136 session uga memory 5.496429019E+14 5.496775467E+14 opened cursors current 268,916 265,694 workarea memory allocated 827,704 837,487 logons current 2,609 2,613 session uga memory max 1.749481584E+13 1.749737418E+13 session pga memory 4.150306913E+11 4.150008177E+11
Instance Activity Stats - Absolute Values shows the absolute begin and end values of a few statistics (they should not be diffed):
- logons current: the number of sessions logged on at that point in time
- opened cursors current: the number of currently open cursors
- session cursor cache count: the number of cursors currently held in session cursor caches
Instance Activity Stats - Thread ActivityDB/Inst: G10R25/G10R25 Snaps: 3663-3 -> Statistics identified by '(derived)' come from sources other than SYSSTAT Statistic Total per Hour -------------------------------- ------------------ --------- log switches (derived) 17 2,326.47
log switches (derived): the number of log switches; see "What is the ideal online redo log switch interval?"
5 I/O Statistics
5-1 Tablespace IO Stats: I/O grouped by tablespace
Tablespace IO Stats DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> ordered by IOs (Reads + Writes) desc Tablespace ------------------------------ Av Av Av Av Buffer Av Buf Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms) -------------- ------- ------- ------- ------------ -------- ---------- ------- DATA_TS 17,349,398 4,801 2.3 1.5 141,077 39 4,083,704 5.8 INDEX_TS 9,193,122 2,544 2.0 1.0 238,563 66 3,158,187 46.1 UNDOTBS1 1,582,659 438 0.7 1.0 2 0 12,431 69.0
Reads: the number of physical reads against this tablespace (in read operations, not blocks)
Av Reads/s: the average physical reads per second against this tablespace (operations, not blocks)
Av Rd(ms): the average latency per read against this tablespace
Av Blks/Rd: the average number of blocks per read against this tablespace, since one physical read can fetch several blocks; if Av Blks/Rd >> 1 the system probably has a lot of db file scattered read, pointing to FULL TABLE SCAN or FAST FULL INDEX SCAN - check the table scans (long tables) and index fast full scans (full) statistics
Writes: the number of physical writes against this tablespace; for tablespaces whose Writes are always 0, check whether the data is effectively read-only - if so, a read only tablespace can solve some RAC performance problems
Av Writes/s: the average physical writes per second against this tablespace
Buffer Waits: the number of buffer busy waits and read by other session waits on this tablespace (in 9i buffer busy waits included read by other session)
Av Buf Wt(ms): the average buffer wait time on this tablespace, in ms
5-2 File I/O
File IO Stats Snaps: 70719-70723 -> ordered by Tablespace, File Tablespace Filename ------------------------ ---------------------------------------------------- Av Av Av Av Buffer Av Buf Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms) -------------- ------- ------- ------- ------------ -------- ---------- ------- AMG_ALBUM_IDX_TS +DATA/itscmp/plugged/data2/amg_album_idx_ts01.dbf 23,298 6 0.6 1.0 2 0 0 0.0 AMG_ALBUM_IDX_TS +DATA/itscmp/plugged/data3/amg_album_idx_ts02.dbf 3,003 1 0.6 1.0 2 0 0 0.0
Tablespace: the tablespace name
Filename: the path of the datafile
Reads: the cumulative physical read operations against this datafile (operations, not blocks)
Av Reads/s: the average physical read operations per second against this datafile (operations, not blocks)
Av Rd(ms): the average latency per physical read against this datafile, in ms
Av Blks/Rd: the average blocks per read against this datafile; close to 1 in OLTP environments
Writes: the cumulative physical write operations against this datafile (operations, not blocks)
Av Writes/s: the average physical write operations per second against this datafile (operations, not blocks)
Buffer Waits: the number of buffer busy waits and read by other session waits on this datafile (in 9i buffer busy waits included read by other session)
Av Buf Wt(ms): the average buffer wait time on this datafile, in ms
If a tablespace carries a heavy I/O load, check whether the I/O is spread evenly across its datafiles or skewed, and whether, given the storage characteristics, the data should be rebalanced across datafiles on different disks to optimize I/O.
6 Buffer Pool Statistics
Buffer Pool Statistics Snaps: 70719-70723 -> Standard block size Pools D: default, K: keep, R: recycle -> Default Pools for other block sizes: 2k, 4k, 8k, 16k, 32k Free Writ Buffer Number of Pool Buffer Physical Physical Buff Comp Busy P Buffers Hit% Gets Reads Writes Wait Wait Waits --- ---------- ---- ------------ ------------ ----------- ------ ------ -------- 16k 15,720 N/A 0 0 0 0 0 0 D 2,259,159 98 2.005084E+09 42,753,650 560,460 0 1 8.51E+06
該環節的資料主要來源於WRH$_BUFFER_POOL_STATISTICS, 而WRH$_BUFFER_POOL_STATISTICS是定期彙總v$SYSSTAT中的資料
P pool池的名字 D: 預設的緩衝池 default buffer pool , K : Keep Pool , R: Recycle Pool ; 2k 4k 8k 16k 32k: 代表各種非標準塊大小的緩衝池
Number of buffers: 實際的 緩衝塊數目, 約等於 池的大小 / 池的塊大小
Pool Hit % : 該緩衝池的命中率
Buffer Gets: 對該緩衝池中塊的訪問次數,包括 consistent gets 和 db block gets
Physical Reads: 該緩衝池Buffer Cache引起了多少物理讀, 其實是physical reads cache ,單位為 塊數*次數
Physical Writes :該緩衝池中Buffer cache被寫的物理寫, 其實是physical writes from cache, 單位為 塊數*次數
Free Buffer Waits: 等待空閒緩衝的次數, 可以看做該buffer pool 發生free buffer waits 等待的次數
Write Comp Wait: 等待DBWR寫入髒buffer到磁碟的次數, 可以看做該buffer pool發生write complete waits等待的次數
Buffer Busy Waits: 該緩衝池發生buffer busy wait 等待的次數
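這些指標也可以直接從v$buffer_pool_statistics觀察(自例項啟動以來的累計值,而非快照差值),示意查詢如下:
select name, block_size, set_msize,
       free_buffer_wait, write_complete_wait, buffer_busy_wait,
       physical_reads, physical_writes
  from v$buffer_pool_statistics;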
7-1 Checkpoint Activity 檢查點與 Instance Recovery Stats 例項恢復
Checkpoint Activity Snaps: 70719-70723 -> Total Physical Writes: 590,563 Other Autotune Thread MTTR Log Size Log Ckpt Settings Ckpt Ckpt Writes Writes Writes Writes Writes Writes ----------- ----------- ----------- ----------- ----------- ----------- 0 0 0 0 12,899 0 ------------------------------------------------------------- Instance Recovery Stats Snaps: 70719-70723 -> B: Begin Snapshot, E: End Snapshot Estd Targt Estd Log Ckpt Log Ckpt Opt RAC MTTR MTTR Recovery Actual Target Log Sz Timeout Interval Log Avail (s) (s) Estd IOs RedoBlks RedoBlks RedoBlks RedoBlks RedoBlks Sz(M) Time - ----- ----- -------- -------- -------- -------- -------- -------- ------ ----- B 0 6 12828 477505 1786971 5096034 1786971 N/A N/A 3 E 0 7 16990 586071 2314207 5096034 2314207 N/A N/A 3 -------------------------------------------------------------
該環節的資料來源於WRH$_INSTANCE_RECOVERY
MTTR Writes : 為了滿足FAST_START_MTTR_TARGET 指定的MTTR值 而做出的物理寫 WRITES_MTTR
Log Size Writes :由於最小redo log file尺寸限制而做出的物理寫 WRITES_LOGFILE_SIZE
Log Ckpt writes: 由於 LOG_CHECKPOINT_INTERVAL 和 LOG_CHECKPOINT_TIMEOUT 驅動的增量檢查點而做出的物理寫 WRITES_LOG_CHECKPOINT_SETTINGS
Other Settings Writes :由於其他設定(例如FAST_START_IO_TARGET)而引起的物理寫, WRITES_OTHER_SETTINGS
Autotune Ckpt Writes : 由於自動調優檢查點而引起的物理寫, WRITES_AUTOTUNE
Thread Ckpt Writes :由於thread checkpoint而引起的物理寫,WRITES_FULL_THREAD_CKPT
B 代表 開始點, E 代表結尾
Targt MTTR (s) : 目標MTTR (mean time to recover)意為有效恢復時間,單位為秒。 TARGET_MTTR 的計算基於 給定的引數FAST_START_MTTR_TARGET,而TARGET_MTTR作為內部使用。 實際在使用中 Target MTTR未必能和FAST_START_MTTR_TARGET一樣。 如果FAST_START_MTTR_TARGET過小,那麼TARGET_MTTR 將是系統條件所允許的最小估算值; 如果FAST_START_MTTR_TARGET過大,則TARGET_MTTR以保守演算法計算以獲得完成恢復的最長估算時間。
estimated_mttr (s): 當前基於 髒buffer和重做日誌塊的數量,而評估出的有效恢復時間 。 它的估算告訴使用者 以當下系統的負載若發生例項crash,則需要多久時間來做crash recovery的前滾操作,之後才能開啟資料庫。
Recovery Estd IOs :實際是當前buffer cache中的髒塊數量,一旦例項崩潰 這些髒塊要被前滾
Actual RedoBlks : 當前實際需要恢復的redo重做塊數量
Target RedoBlks :是 Log Sz RedoBlks 、Log Ckpt Timeout RedoBlks、 Log Ckpt Interval RedoBlks 三者的最小值
Log Sz RedoBlks : 代表 必須在log file switch日誌切換之前完成的 checkpoint 中涉及到的redo block,也叫max log lag; 資料來源select LOGFILESZ from X$targetrba; select LOG_FILE_SIZE_REDO_BLKS from v$instance_recovery;
Log Ckpt Timeout RedoBlks : 為了滿足LOG_CHECKPOINT_TIMEOUT 所需要處理的redo block數,lag for checkpoint timeout ; 資料來源select CT_LAG from x$targetrba;
Log Ckpt Interval RedoBlks :為了滿足LOG_CHECKPOINT_INTERVAL 所需要處理的redo block數, lag for checkpoint interval; 資料來源select CI_LAG from x$targetrba;
Opt Log Sz(M) : 基於FAST_START_MTTR_TARGET 而估算出來的redo logfile 的大小,單位為MB 。 Oracle官方推薦建立的重做日誌大小至少大於這個估算值
Estd RAC Avail Time :指評估的 RAC中節點失敗後 叢集從凍結到部分可用的時間, 這個指標僅在RAC中可用,單位為秒。 ESTD_CLUSTER_AVAILABLE_TIME
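上述大部分欄位可以在v$instance_recovery中即時查到(當前值,而非快照差值),示意查詢如下:
select target_mttr, estimated_mttr, recovery_estimated_ios,
       actual_redo_blks, target_redo_blks, optimal_logfile_size
  from v$instance_recovery;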
7-2 Buffer Pool Advisory 緩衝池建議
Buffer Pool Advisory DB/Inst: ITSCMP/itscmp2 Snap: 70723 -> Only rows with estimated physical reads >0 are displayed -> ordered by Block Size, Buffers For Estimate Est Phys Estimated Est Size for Size Buffers Read Phys Reads Est Phys %DBtime P Est (M) Factor (thousands) Factor (thousands) Read Time for Rds --- -------- ------ ------------ ------ -------------- ------------ ------- D 1,920 .1 227 4.9 1,110,565,597 1 1.0E+09 D 3,840 .2 454 3.6 832,483,886 1 7.4E+08 D 5,760 .3 680 2.8 634,092,578 1 5.6E+08 D 7,680 .4 907 2.2 500,313,589 1 4.3E+08 D 9,600 .5 1,134 1.8 410,179,557 1 3.5E+08 D 11,520 .6 1,361 1.5 348,214,283 1 2.9E+08 D 13,440 .7 1,588 1.3 304,658,441 1 2.5E+08 D 15,360 .8 1,814 1.2 273,119,808 1 2.2E+08 D 17,280 .9 2,041 1.1 249,352,943 1 2.0E+08 D 19,200 1.0 2,268 1.0 230,687,206 1 1.8E+08 D 19,456 1.0 2,298 1.0 228,664,269 1 1.8E+08 D 21,120 1.1 2,495 0.9 215,507,858 1 1.7E+08 D 23,040 1.2 2,722 0.9 202,816,787 1 1.6E+08 D 24,960 1.3 2,948 0.8 191,974,196 1 1.5E+08 D 26,880 1.4 3,175 0.8 182,542,765 1 1.4E+08 D 28,800 1.5 3,402 0.8 174,209,199 1 1.3E+08 D 30,720 1.6 3,629 0.7 166,751,631 1 1.2E+08 D 32,640 1.7 3,856 0.7 160,002,420 1 1.2E+08 D 34,560 1.8 4,082 0.7 153,827,351 1 1.1E+08 D 36,480 1.9 4,309 0.6 148,103,338 1 1.1E+08 D 38,400 2.0 4,536 0.6 142,699,866 1 1.0E+08
緩衝池的顆粒(granule)大小可以參考 SELECT * FROM V$SGAINFO WHERE name LIKE 'Granule%';
P 指 緩衝池的名字 可能包括 有 D default buffer pool , K Keep Pool , R recycle Pool
Size For Est(M): 指以該尺寸的buffer pool作為評估的物件,一般是 目前current size的 10% ~ 200%,以便了解 buffer pool 增大 ~減小 對物理讀的影響
Size Factor : 尺寸因子,指用以評估的buffer pool大小相對於當前設定大小的比例因子,例如current size是100M,如果評估值是110M,那麼Size Factor就是1.1
Buffers (thousands) :指這個buffer pool 尺寸下的buffer 數量, 要乘以1000才是實際值
Est Phys Read Factor :評估的物理讀因子, 例如當前尺寸的buffer pool 會引起100個物理讀, 則別的尺寸的buffer pool如果引起 120個物理讀, 那麼 對應尺寸的Est Phys Read Factor就是1.2
Estimated Phys Reads (thousands):評估的物理讀數目, 要乘以 1000才是實際值, 顯然不同尺寸的buffer pool對應不同的評估的物理讀數目
Est Phys Read Time : 評估的物理讀時間
Est %DBtime for Rds:評估的物理讀佔DB TIME的比率
我們 看buffer pool advisory 一般有2個目的:
- 在物理讀較多的情況下,希望通過增加buffer pool大小來緩解物理讀等待,這時我們關注Size Factor > 1 的buffer pool尺寸是否能有效減少Est Phys Read Factor;如果Est Phys Read Factor隨著Size Factor增大而顯著減少,那麼說明增大buffer cache可以有效減少物理讀
- 在記憶體緊張的情況下 ,希望從buffer pool中勻出部分記憶體來移作他用, 但是又不希望 buffer cache變小導致 物理讀增多 效能下降, 則此時 觀察Est Phys Read Factor 是否隨著Size Factor 減小而 顯著增大, 如果不是 則說明減少部分buffer cache 不會導致 物理讀大幅增加,也就可以安心 減少 buffer cache
注意 Size Factor 和 Est Phys Read Factor之間不是簡單的 線性關係,所以需要人為介入評估得失
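該建議器的即時資料可以從v$db_cache_advice查詢,下面是一個針對DEFAULT池的示意查詢(僅供參考):
select name, block_size, size_for_estimate, size_factor,
       estd_physical_read_factor, estd_physical_reads
  from v$db_cache_advice
 where name = 'DEFAULT'
 order by block_size, size_for_estimate;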
7-3 PGA Aggr Summary
PGA Aggr Summary Snaps: 70719-70723 -> PGA cache hit % - percentage of W/A (WorkArea) data processed only in-memory PGA Cache Hit % W/A MB Processed Extra W/A MB Read/Written --------------- ------------------ -------------------------- 99.9 412,527 375
PGA Cache Hit % : 指 W/A WorkArea工作區的資料僅在記憶體中處理的比率, PGA快取命中率
workarea是PGA中負責處理 排序、雜湊連線和點陣圖合併操作的區域; workarea 也叫做 SQL 作業區域
W/A MB Processed: 指在Workarea中處理過的資料量,單位為MB
Extra W/A MB Read/Written : 指額外從磁碟上 讀寫的 工作區資料, 單位為 MB
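對應的即時指標可以從v$pgastat觀察,示意查詢如下:
select name, value, unit
  from v$pgastat
 where name in ('total PGA allocated', 'total PGA inuse',
                'cache hit percentage', 'over allocation count',
                'global memory bound');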
7-4 PGA Aggr Target Stats
Warning: pga_aggregate_target was set too low for current workload, as this value was exceeded during this interval. Use the PGA Advisory view to help identify a different value for pga_aggregate_target. PGA Aggr Target Stats Snaps: 70719-70723 -> B: Begin Snap E: End Snap (rows dentified with B or E contain data which is absolute i.e. not diffed over the interval) -> Auto PGA Target - actual workarea memory target -> W/A PGA Used - amount of memory used for all Workareas (manual + auto) -> %PGA W/A Mem - percentage of PGA memory allocated to workareas -> %Auto W/A Mem - percentage of workarea memory controlled by Auto Mem Mgmt -> %Man W/A Mem - percentage of workarea memory under manual control %PGA %Auto %Man PGA Aggr Auto PGA PGA Mem W/A PGA W/A W/A W/A Global Mem Target(M) Target(M) Alloc(M) Used(M) Mem Mem Mem Bound(K) - ---------- ---------- ---------- ---------- ------ ------ ------ ---------- B 8,192 512 23,690.5 150.1 .6 100.0 .0 838,860 E 8,192 512 23,623.6 156.9 .7 100.0 .0 838,860 -------------------------------------------------------------
此環節的資料來源主要是 WRH$_PGASTAT
PGA Aggr Target(M) :本質上就是pga_aggregate_target , 當然在AMM(memory_target)環境下 這個值可能會自動變化
Auto PGA Target(M) : 在自動PGA 管理模式下 實際可用的工作區記憶體 “aggregate PGA auto target “, 因為PGA還有其他用途 ,不能全部作為workarea memory
PGA Mem Alloc(M) :目前已分配的PGA記憶體, alloc 不等於 inuse 即分配的記憶體不等於在使用的記憶體,理論上PGA會將確實不使用的記憶體返回給OS(PGA memory freed back to OS) ,但是存在PGA佔用大量記憶體而不釋放的場景
在上例中 pga_aggregate_target 僅為8192M,而實際processes數在2,615~8,011之間;如果平均每個程式耗費5MB的PGA,僅2,000個程式就需要約10,000M的PGA,而實際這裡PGA Mem Alloc(M)是23,690M,這說明存在PGA的過載(overallocation),需要調大pga_aggregate_target
W/A PGA Used(M) :所有的工作區workarea(包括manual和 auto)使用的記憶體總和量, 單位為MB
%PGA W/A Mem: 分配給workarea的記憶體量佔總的PGA的比例, (W/A PGA Used)/PGA Mem Alloc
%Auto W/A Mem : AUTO 自動工作區管理所控制的記憶體(workarea_size_policy=AUTO) 佔總的workarea記憶體的比例
%Man W/A Mem : MANUAL 手動工作區管理所控制的記憶體(workarea_size_policy=MANUAL)佔總的workarea記憶體的比例
Global Mem Bound(K) : 指 在自動PGA管理模式下一個工作區所能分配的最大記憶體(注意 一個SQL執行過程中可能有多個工作區workarea)。 Global Mem Bound(K)這個指標在例項執行過程中將被持續性的修正,以反映資料庫當時工作區的負載情況。顯然在有眾多活躍工作區的系統負載下相應地Global Mem Bound將會下降。 但應當保持global bound值不要小於1 MB , 否則建議 調高pga_aggregate_target
7-5 PGA Aggr Target Histogram
PGA Aggr Target Histogram Snaps: 70719-70723 -> Optimal Executions are purely in-memory operations Low High Optimal Optimal Total Execs Optimal Execs 1-Pass Execs M-Pass Execs ------- ------- -------------- -------------- ------------ ------------ 2K 4K 262,086 262,086 0 0 64K 128K 497 497 0 0 128K 256K 862 862 0 0 256K 512K 368 368 0 0 512K 1024K 440,585 440,585 0 0 1M 2M 68,313 68,313 0 0 2M 4M 169 161 8 0 4M 8M 50 42 8 0 8M 16M 82 82 0 0 16M 32M 1 1 0 0 32M 64M 12 12 0 0 128M 256M 2 0 2 0 -------------------------------------------------------------
資料來源:WRH$_SQL_WORKAREA_HISTOGRAM
Low Optimal: 此行所包含工作區workarea最適合記憶體要求的下限
High Optimal: 此行所包含工作區workarea最適合記憶體要求的上限
Total Execs: 在 Low Optimal~High Optimal 範圍工作區內完成的總執行數
Optimal execs: optimal 執行是指完全在PGA記憶體中完成的執行次數
1-Pass Execs : 指工作區資料需要額外向磁碟寫出並讀回一遍(one pass)才能完成的執行次數
M-Pass Execs: 指工作區資料需要額外讀寫磁碟多遍(multi-pass)才能完成的執行次數,這類執行的代價最高
7-6 PGA Memory Advisory
PGA Memory Advisory Snap: 70723 -> When using Auto Memory Mgmt, minimally choose a pga_aggregate_target value where Estd PGA Overalloc Count is 0 Estd Extra Estd P Estd PGA PGA Target Size W/A MB W/A MB Read/ Cache Overallo Estd Est (MB) Factr Processed Written to Disk Hit % Count Time ---------- ------- ---------------- ---------------- ------ -------- ------- 1,024 0.1 2,671,356,938.7 387,531,258.9 87.0 1.07E+07 7.9E+11 2,048 0.3 2,671,356,938.7 387,529,979.1 87.0 1.07E+07 7.9E+11 4,096 0.5 2,671,356,938.7 387,518,881.8 87.0 1.07E+07 7.9E+11 6,144 0.8 2,671,356,938.7 387,420,749.5 87.0 1.07E+07 7.9E+11 8,192 1.0 2,671,356,938.7 23,056,196.5 99.0 1.07E+07 6.9E+11 9,830 1.2 2,671,356,938.7 22,755,192.6 99.0 6.81E+06 6.9E+11 11,469 1.4 2,671,356,938.7 20,609,438.5 99.0 4.15E+06 6.9E+11 13,107 1.6 2,671,356,938.7 19,021,139.1 99.0 581,362 6.9E+11 14,746 1.8 2,671,356,938.7 18,601,191.0 99.0 543,531 6.9E+11 16,384 2.0 2,671,356,938.7 18,561,361.1 99.0 509,687 6.9E+11 24,576 3.0 2,671,356,938.7 18,527,422.3 99.0 232,817 6.9E+11 32,768 4.0 2,671,356,938.7 18,511,872.6 99.0 120,180 6.9E+11 49,152 6.0 2,671,356,938.7 18,500,815.3 99.0 8,021 6.9E+11 65,536 8.0 2,671,356,938.7 18,498,733.0 99.0 0 6.9E+11
PGA Target Est (MB) 用以評估的 PGA_AGGREGATE _TARGET值
Size Factr : 當前用以評估的PGA_AGGREGATE_TARGET和當前實際設定的PGA_AGGREGATE_TARGET之間的比例因子,即 PGA Target Est / PGA_AGGREGATE_TARGET
W/A MB Processed :workarea中要處理的資料量, 單位為MB
Estd Extra W/A MB Read/ Written to Disk : 以 one-pass 、M-Pass方式處理的資料量預估值, 單位為MB
Estd P Cache Hit % : 預估的PGA快取命中率
Estd PGA Overalloc Count: 預估的PGA過載量, 如上文所述PGA_AGGREGATE _TARGET僅是一個目標值,無法真正限制PGA記憶體的使用,當出現 PGA記憶體硬性需求時會產生PGA overallocate 過載(When using Auto Memory Mgmt, minimally choose a pga_aggregate_target value where Estd PGA Overalloc Count is 0)
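即時的PGA建議器資料可以從v$pga_target_advice查詢(需要pga_aggregate_target與statistics_level為TYPICAL/ALL時才會填充),示意如下,重點關注estd_overalloc_count為0的最小target:
select round(pga_target_for_estimate / 1024 / 1024) as target_mb,
       pga_target_factor,
       estd_pga_cache_hit_percentage,
       estd_overalloc_count
  from v$pga_target_advice
 order by pga_target_for_estimate;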
7-7 Shared Pool Advisory
Shared Pool Advisory Snap: 70723 -> SP: Shared Pool Est LC: Estimated Library Cache Factr: Factor -> Note there is often a 1:Many correlation between a single logical object in the Library Cache, and the physical number of memory objects associated with it. Therefore comparing the number of Lib Cache objects (e.g. in v$librarycache), with the number of Lib Cache Memory Objects is invalid. Est LC Est LC Est LC Est LC Shared SP Est LC Time Time Load Load Est LC Pool Size Size Est LC Saved Saved Time Time Mem Obj Size(M) Factr (M) Mem Obj (s) Factr (s) Factr Hits (K) -------- ----- -------- ------------ -------- ------ ------- ------ ------------ 304 .8 56 3,987 7,728 1.0 61 1.4 332 352 .9 101 6,243 7,745 1.0 44 1.0 334 400 1.0 114 7,777 7,745 1.0 44 1.0 334 448 1.1 114 7,777 7,745 1.0 44 1.0 334 496 1.2 114 7,777 7,745 1.0 44 1.0 334 544 1.4 114 7,777 7,745 1.0 44 1.0 334 592 1.5 114 7,777 7,745 1.0 44 1.0 334 640 1.6 114 7,777 7,745 1.0 44 1.0 334 688 1.7 114 7,777 7,745 1.0 44 1.0 334 736 1.8 114 7,777 7,745 1.0 44 1.0 334 784 2.0 114 7,777 7,745 1.0 44 1.0 334 832 2.1 114 7,777 7,745 1.0 44 1.0 334 -------------------------------------------------------------
Shared Pool Size(M) : 用以評估的shared pool共享池大小,在AMM /ASMM環境下 shared_pool 大小都可能浮動
SP Size Factr :共享池大小的比例因子, (Shared Pool Size for Estim / SHARED_POOL_SIZE)
Estd LC Size(M) : 評估的library cache大小,單位為MB;shared pool中除了library cache之外還包含其他元件,例如row cache
Est LC Mem Obj 指評估的指定大小的共享池內的library cache memory object的數量 ESTD_LC_MEMORY_OBJECTS
Est LC Time Saved(s): 指在指定的共享池大小下,因所需的library cache memory objects能直接在共享池中找到而節約的解析時間;若共享池沒有足夠的free memory,這些物件會被aged out,再次需要時必須重新載入(reload),節約的正是這部分時間。 ESTD_LC_TIME_SAVED
Est LC Time Saved Factr : Est LC Time Saved(s)的比例因子,( Est LC Time Saved(s)/ Current LC Time Saved(s) ) ESTD_LC_TIME_SAVED_FACTOR
Est LC Load Time (s): 在指定的共享池大小情況下解析的耗時
Est LC Load Time Factr:Est LC Load Time (s)的比例因子, (Est LC Load Time (s)/ Current LC Load Time (s)) ESTD_LC_LOAD_TIME_FACTOR
Est LC Mem Obj Hits (K) : 在指定的共享池大小情況下需要的library cache memory object正好在共享池中被找到的次數 ESTD_LC_MEMORY_OBJECT_HITS;
對於想縮小 shared_pool_size 共享池大小的需求,可以關注Est LC Mem Obj Hits (K) ,如上例中共享池為352M時Est LC Mem Obj Hits (K) 就為334且之後不動,則可以考慮縮小shared_pool_size到該值,但要注意每個版本/平臺上對共享池的最低需求,包括RAC中gcs resource 、gcs shadow等資源均駐留在shared pool中,增大db_cache_size時要對應關注。
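對應的即時資料在v$shared_pool_advice中,示意查詢如下:
select shared_pool_size_for_estimate, shared_pool_size_factor,
       estd_lc_size, estd_lc_time_saved, estd_lc_load_time,
       estd_lc_memory_object_hits
  from v$shared_pool_advice
 order by shared_pool_size_for_estimate;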
7-8 SGA Target Advisory
SGA Target Advisory Snap: 70723 SGA Target SGA Size Est DB Est Physical Size (M) Factor Time (s) Reads ---------- ---------- ------------ ---------------- 3,752 0.1 1.697191E+09 1.4577142918E+12 7,504 0.3 1.222939E+09 832,293,601,354 11,256 0.4 1.000162E+09 538,390,923,784 15,008 0.5 895,087,191 399,888,743,900 18,760 0.6 840,062,594 327,287,716,803 22,512 0.8 806,389,685 282,881,041,331 26,264 0.9 782,971,706 251,988,446,808 30,016 1.0 765,293,424 228,664,652,276 33,768 1.1 751,135,535 210,005,616,650 37,520 1.3 739,350,016 194,387,820,900 41,272 1.4 733,533,785 187,299,216,679 45,024 1.5 732,921,550 187,299,216,679 48,776 1.6 732,691,962 187,299,216,679 52,528 1.8 732,538,908 187,299,216,679 56,280 1.9 732,538,917 187,299,216,679 60,032 2.0 732,462,391 187,299,458,716 -------------------------------------------------------------
該環節資料來源於WRH$_SGA_TARGET_ADVICE
SGA target Size : 用以評估的sga target大小 (sga_target)
SGA Size Factor: SGA Size的比例因子, (est SGA target Size / Current SGA target Size )
Est DB Time (s): 評估對應於該指定sga target size會產生多少量的DB TIME,單位為秒
Est Physical Reads:評估對應該指定的sga target size 會產生多少的物理讀
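在設定了sga_target的環境中,也可以直接查詢v$sga_target_advice取得相同的建議資料,示意如下:
select sga_size, sga_size_factor, estd_db_time, estd_physical_reads
  from v$sga_target_advice
 order by sga_size;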
7-9 Streams Pool Advisory
Streams Pool Advisory DB/Inst: ITSCMP/itscmp2 Snap: 70723 Size for Size Est Spill Est Spill Est Unspill Est Unspill Est (MB) Factor Count Time (s) Count Time (s) ---------- --------- ----------- ----------- ----------- ----------- 64 0.5 0 0 0 0 128 1.0 0 0 0 0 192 1.5 0 0 0 0 256 2.0 0 0 0 0 320 2.5 0 0 0 0 384 3.0 0 0 0 0 448 3.5 0 0 0 0 512 4.0 0 0 0 0 576 4.5 0 0 0 0 640 5.0 0 0 0 0 704 5.5 0 0 0 0 768 6.0 0 0 0 0 832 6.5 0 0 0 0 896 7.0 0 0 0 0 960 7.5 0 0 0 0 1,024 8.0 0 0 0 0 1,088 8.5 0 0 0 0 1,152 9.0 0 0 0 0 1,216 9.5 0 0 0 0 1,280 10.0 0 0 0 0
該環節只有當使用了Streams 流複製時才會有必要資料, 資料來源 WRH$_STREAMS_POOL_ADVICE
Size for Est (MB) : 用以評估的 streams pool大小
Size Factor :streams pool大小的比例因子
Est Spill Count :評估出的 當使用該大小的流池時 message溢位到磁碟的數量 ESTD_SPILL_COUNT
Est Spill Time (s): 評估出的 當使用該大小的流池時 message溢位到磁碟的耗時,單位為秒 ESTD_SPILL_TIME
Est Unspill Count:評估的 當使用該大小的流池時 message unspill 即從磁碟上讀取的數量 ESTD_UNSPILL_COUNT
Est Unspill Time (s) : 評估的 當使用該大小的流池時 message unspill 即從磁碟上讀取的耗時,單位為秒 ESTD_UNSPILL_TIME
7-10 Java Pool Advisory
java pool的相關指標與shared pool相似,不再贅述
8 Wait Statistics
8-1 Buffer Wait Statistics
Buffer Wait Statistics Snaps: 70719-70723 -> ordered by wait time desc, waits desc Class Waits Total Wait Time (s) Avg Time (ms) ------------------ ----------- ------------------- -------------- data block 8,442,041 407,259 48 undo header 16,212 1,711 106 undo block 21,023 557 26 1st level bmb 1,038 266 256 2nd level bmb 540 185 342 bitmap block 90 25 276 segment header 197 13 66 file header block 132 6 43 bitmap index block 18 0 1 extent map 2 0 0
資料來源 : WRH$_WAITSTAT
該環節是對 緩衝池中各型別(class) 塊 等待的彙總資訊, wait的原因一般是 buffer busy waits 和 read by other session
class 資料塊的class, 一個oracle資料塊即有class 屬性 還有type 屬性,資料塊中記錄type屬性(KCBH), 而在buffer header裡存有class屬性(X$BH.class)
Waits: 該型別資料塊的等待次數
Total Wait Time (s) : 該型別資料塊的合計等待時間 單位為秒
Avg Time (ms) : 該型別資料塊 平均每次等待的耗時, 單位 ms
如果使用者正使用 undo_management=AUTO 的SMU 則一般不會因為rollback segment過少而引起undo header block類塊的等待
對於INSERT 而引起的 buffer爭用等待:
1、 對於手動segment 管理MSSM 考慮增加Freelists、Freelist Groups
2、 使用ASSM ,當然ASSM本身沒什麼引數可調
對於INSERT ON INDEX 引起的爭用:
- 使用反轉鍵索引(reverse key index),見下方示意DDL
- 使用HASH分割槽和本地索引
- 可能的情況下降低index的密度(density)
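下面給出一個反轉鍵索引的示意DDL,其中表orders與欄位order_id均為假設的物件名,僅用於演示語法:
-- 反轉鍵索引可以打散右增長索引上的插入熱點(示意,物件名為假設)
create index idx_orders_id_rev on orders (order_id) reverse;
需要注意反轉鍵索引會使索引範圍掃描(index range scan)無法利用索引鍵的有序性,採用前應結合業務訪問模式評估。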
8-2 Enqueue Activity
enqueue 佇列鎖等待
Enqueue Activity Snaps: 70719-70723 -> only enqueues with waits are shown -> Enqueue stats gathered prior to 10g should not be compared with 10g data -> ordered by Wait Time desc, Waits desc Enqueue Type (Request Reason) ------------------------------------------------------------------------------ Requests Succ Gets Failed Gets Waits Wt Time (s) Av Wt Time(ms) ------------ ------------ ----------- ----------- ------------ -------------- TX-Transaction (index contention) 201,270 201,326 0 193,948 97,517 502.80 TM-DML 702,731 702,681 4 1,081 46,671 43,174.08 SQ-Sequence Cache 28,643 28,632 0 17,418 35,606 2,044.19 HW-Segment High Water Mark 9,210 8,845 376 1,216 12,505 10,283.85 TX-Transaction (row lock contention) 9,288 9,280 0 9,232 10,486 1,135.80 CF-Controlfile Transaction 15,851 14,094 1,756 2,798 4,565 1,631.64 TX-Transaction (allocate ITL entry) 471 369 102 360 169 469.28
Enqueue Type (Request Reason) enqueue 佇列的型別,大家在研究 enqueue 問題前 至少搞清楚enqueue type 和enqueue mode , enqueue type是佇列鎖所要保護的資源 如 TM 表鎖 CF 控制檔案鎖, enqueue mode 是持有佇列鎖的模式 (SS、SX 、S、SSX、X)
Requests : 申請對應的enqueue type資源或者佇列轉換(enqueue conversion 例如 S 轉 SSX ) 的次數
Succ Gets :對應的enqueue被成功 申請或轉換的次數
Failed Gets :對應的enqueue的申請 或者轉換失敗的次數
Waits :由對應的enqueue的申請或者轉換而造成等待的次數
Wt Time (s) : 由對應的enqueue的申請或者轉換而造成等待的等待時間
Av Wt Time(ms) :由對應的enqueue的申請或者轉換而造成等待的平均等待時間 , Wt Time (s) / Waits ,單位為ms
主要的enqueue 等待事件:
enq: TX – row lock/index contention、allocate ITL等待事件
Oracle佇列鎖enq:TS,Temporary Segment (also TableSpace)
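上面Enqueue Activity環節的資料對應於v$enqueue_statistics(自例項啟動以來的累計值),可以用下面的示意查詢即時觀察各enqueue的申請與等待情況:
select eq_type, req_reason, total_req#, total_wait#,
       succ_req#, failed_req#, cum_wait_time
  from v$enqueue_statistics
 order by cum_wait_time desc;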
9-1 Undo Segment Summary
Undo Segment Summary Snaps: 70719-70723 -> Min/Max TR (mins) - Min and Max Tuned Retention (minutes) -> STO - Snapshot Too Old count, OOS - Out of Space count -> Undo segment block stats: -> uS - unexpired Stolen, uR - unexpired Released, uU - unexpired reUsed -> eS - expired Stolen, eR - expired Released, eU - expired reUsed Undo Num Undo Number of Max Qry Max Tx Min/Max STO/ uS/uR/uU/ TS# Blocks (K) Transactions Len (s) Concurcy TR (mins) OOS eS/eR/eU ---- ---------- --------------- -------- -------- --------- ----- -------------- 4 85.0 200,127 55,448 317 1040.2/10 0/0 0/0/0/0/0/0 ------------------------------------------------------------- Undo Segment Stats Snaps: 70719-70723 -> Most recent 35 Undostat rows, ordered by Time desc Num Undo Number of Max Qry Max Tx Tun Ret STO/ uS/uR/uU/ End Time Blocks Transactions Len (s) Concy (mins) OOS eS/eR/eU ------------ ----------- ------------ ------- ------- ------- ----- ------------ 29-Aug 05:52 11,700 35,098 55,448 234 1,070 0/0 0/0/0/0/0/0 29-Aug 05:42 12,203 24,677 54,844 284 1,065 0/0 0/0/0/0/0/0 29-Aug 05:32 14,132 37,826 54,241 237 1,060 0/0 0/0/0/0/0/0 29-Aug 05:22 14,379 32,315 53,637 317 1,050 0/0 0/0/0/0/0/0 29-Aug 05:12 15,693 34,157 53,033 299 1,045 0/0 0/0/0/0/0/0 29-Aug 05:02 16,878 36,054 52,428 250 1,040 0/0 0/0/0/0/0/0
資料來源: WRH$_UNDOSTAT , undo相關的使用資訊每10分鐘重新整理到v$undostat中
Undo Extent有三種狀態 active 、unexpired 、expired
active => extent中 包括了活動的事務 ,active的undo extent 一般不允許被其他事務重用覆蓋
unexpired => extent中沒有活動的事務,但相關undo 記錄從inactive到目前還未經過undo retention(注意 auto undo retention的問題 因為這個特性 可能在觀察dba_undo_extents時看到大部分block都是unexpired,這是正常的) 指定的時間,所以為unexpired。 對於沒有guarantee retention的undo tablespace而言,unexpired extent可能被 steal 為其他事物重用
expired => extent中沒有活動事務,且超過了undo retention的時間
Undo TS# 在使用的這個undo 表空間的表空間號, 一個例項 同一時間只能用1個undo tablespace , RAC不同節點可以用不同的undo tablespace
Num Undo Blocks (K) 指被消費的 undo 資料塊的數量, (K)代表要乘以1000才是實際值; 可以用該指標來評估系統對undo block的消費量, 以便基於實際負載情況來評估UNDO表空間的大小
Number of Transactions 指該段時間內該undo表空間上執行過的事務transaction總量
Max Qry Len (s) 該時段內 持續最久的查詢 時間, 單位為秒
Max Tx Concy 該時段內 最大的事務併發量
Min/Max TR (mins) 最小和最大的tuned undo retention ,單位為分鐘; tuned undo retention 是自動undo調優特性,見undo自動調優介紹。
STO/ OOS STO 指 ORA-01555 Snapshot Too Old錯誤出現的次數; OOS – 指Out of Space count 錯誤出現的次數
uS – unexpired Stolen 嘗試從未過期的undo extent中偷取undo space的次數
uR – unexpired Released 從未過期的undo extent中釋放的塊數目
uU – unexpired reUsed 未過期的undo extent中的block被其他事務重用的 塊數目
eS – expired Stolen 嘗試從過期的undo extent中偷取undo space的次數
eR – expired Released 從過期的undo extent中釋放的塊數目
eU – expired reUsed 過期的undo extent中的block被其他事務重用的 塊數目
v$undostat中對應欄位的官方定義如下:
- UNXPSTEALCNT (NUMBER): Number of attempts to obtain undo space by stealing unexpired extents from other transactions
- UNXPBLKRELCNT (NUMBER): Number of unexpired blocks removed from certain undo segments so they can be used by other transactions
- UNXPBLKREUCNT (NUMBER): Number of unexpired undo blocks reused by transactions
- EXPSTEALCNT (NUMBER): Number of attempts to steal expired undo blocks from other undo segments
- EXPBLKRELCNT (NUMBER): Number of expired undo blocks stolen from other undo segments
- EXPBLKREUCNT (NUMBER): Number of expired undo blocks reused within the same undo segments
- SSOLDERRCNT (NUMBER): Identifies the number of times the error ORA-01555 occurred. You can use this statistic to decide whether or not the UNDO_RETENTION initialization parameter is set properly given the size of the undo tablespace. Increasing the value of UNDO_RETENTION can reduce the occurrence of this error.
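上述這些欄位都可以在v$undostat(每10分鐘一行)中直接觀察,示意查詢如下:
select begin_time, end_time, undoblks, txncount,
       maxquerylen, maxconcurrency, tuned_undoretention,
       ssolderrcnt, nospaceerrcnt
  from v$undostat
 order by begin_time desc;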
10-1 Latch Activity
Latch Activity Snaps: 70719-70723 -> "Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for willing-to-wait latch get requests -> "NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests -> "Pct Misses" for both should be very close to 0.0 Pct Avg Wait Pct Get Get Slps Time NoWait NoWait Latch Name Requests Miss /Miss (s) Requests Miss ------------------------ -------------- ------ ------ ------ ------------ ------ AQ deq hash table latch 4 0.0 0 0 N/A ASM Keyed state latch 9,048 0.1 0.2 0 0 N/A ASM allocation 15,017 0.2 0.8 1 0 N/A ASM db client latch 72,745 0.0 0 0 N/A ASM map headers 5,860 0.6 0.6 1 0 N/A ASM map load waiting lis 1,462 0.0 0 0 N/A ASM map operation freeli 63,539 0.1 0.4 1 0 N/A ASM map operation hash t 76,484,447 0.1 1.0 66 0 N/A
latch name Latch閂的名字
Get Requests latch被以willing-to-wait模式申請並獲得的次數
Pct Get Miss miss是指latch被以willing-to-wait 模式申請但是申請者必須等待的次數, Pct Get Miss = Miss/Get Requests ; miss可以從後面的Latch Sleep Breakdown 獲得
Avg Slps /Miss Sleep 是指latch被以willing-to-wait模式申請最終導致session需要sleep以等待該latch的次數 ; Avg Slps /Miss = Sleeps/ Misses ; Sleeps可以從後面的Latch Sleep Breakdown 獲得
Wait Time (s) 指花費在等待latch上的時間,單位為秒
NoWait Requests 指latch被以no-wait模式來申請的次數
Pct NoWait Miss 以no-wait模式申請latch但直接失敗的比例
對於高併發的latch例如cache buffers chains,其Pct Misses應當十分接近於0
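可以用下面的示意查詢從v$latch觀察各latch的miss情況(自例項啟動以來的累計值):
select name, gets, misses, sleeps, spin_gets,
       round(misses / nullif(gets, 0) * 100, 3) as pct_miss
  from v$latch
 where misses > 0
 order by misses desc;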
一般的調優原則:
如果latch : cache buffers chains是 Top 5 事件,則需要考慮優化SQL減少 全表掃描 並減少Top buffer gets SQL語句的邏輯讀
如果latch : redo copy 、redo allocation 等待較多,則可以考慮增大LOG_BUFFER
如果latch:library cache 發生較多,則考慮增大shared_pool_size
10-2 Latch Sleep Breakdown
Latch Sleep Breakdown DB/Inst: ITSCMP/itscmp2 Snaps: 70719-70723 -> ordered by misses desc Get Spin Latch Name Requests Misses Sleeps Gets -------------------------- --------------- ------------ ----------- ----------- cache buffers chains 3,365,097,866 12,831,875 130,058 12,683,450 row cache objects 69,050,058 349,839 1,320 348,649 session idle bit 389,437,460 268,285 2,768 265,752 enqueue hash chains 8,698,453 239,880 22,476 219,950 ges resource hash list 8,388,730 158,894 70,728 91,104 gc element 100,383,385 135,759 6,285 129,742 gcs remastering latch 12,213,169 72,373 1 72,371 enqueues 4,662,545 46,374 259 46,155 ASM map operation hash tab 76,484,447 46,231 45,210 1,952 Lsod array latch 72,598 24,224 24,577 1,519
latch name Latch閂的名字
Get Requests latch被以willing-to-wait模式申請並獲得的次數
misses 是指latch被以willing-to-wait 模式申請但是申請者必須等待的次數
9i以後miss之後一般有2種情況 spin gets了 或者sleep一睡不醒直到 被post,具體見全面解析9i以後Oracle Latch閂鎖原理;
8i以前的latch演算法可以參考:Oracle Latch:一段描繪Latch運作的虛擬碼
所以一般來說9i以後的 misses= Sleeps+ Spin Gets ,雖然不是絕對如此
Sleeps 是指latch被以willing-to-wait模式申請最終導致session需要sleep以等待該latch的次數
Spin Gets 以willing-to-wait模式去申請latch,在miss之後以spin方式獲得了latch的次數
10-3 Latch Miss Sources
Latch Miss Sources Snaps: 70719-70723 -> only latches with sleeps are shown -> ordered by name, sleeps desc NoWait Waiter Latch Name Where Misses Sleeps Sleeps ------------------------ -------------------------- ------- ---------- -------- ASM Keyed state latch kfksolGet 0 1 1 ASM allocation kfgpnSetDisks2 0 17 0 ASM allocation kfgpnClearDisks 0 5 0 ASM allocation kfgscCreate 0 4 0 ASM allocation kfgrpGetByName 0 1 26 ASM map headers kffmUnidentify_3 0 7 8 ASM map headers kffmAllocate 0 6 0 ASM map headers kffmIdentify 0 6 11 ASM map headers kffmFree 0 1 0 ASM map operation freeli kffmTranslate2 0 15 8 ASM map operation hash t kffmUnidentify 0 44,677 36,784 ASM map operation hash t kffmTranslate 0 220 3,517
資料來源為DBA_HIST_LATCH_MISSES_SUMMARY
latch name Latch閂的名字
where : 指哪些程式碼路徑(核心函式)持有過該latch,而不是哪些程式碼路徑要申請該latch;例如kcbgtcr函式的作用是Get a block for Consistent Read,其持有latch: cache buffers chains是很正常的事情
NoWait Misses: 以no-wait模式來申請latch但直接失敗的次數
Sleeps: 指latch被以willing-to-wait模式申請,最終導致session需要sleep以等待該latch的次數
Waiter Sleeps:等待者休眠的次數 times of sleeps that waiters did for each where; Sleep 是阻塞者等待的次數 , Waiter Sleeps是被阻塞者等待的次數
10-4 Mutex Sleep Summary
Mutex Sleep Summary Snaps: 70719-70723 -> ordered by number of sleeps desc Wait Mutex Type Location Sleeps Time (ms) --------------------- -------------------------------- ------------ ------------ Cursor Pin kksfbc [KKSCHLFSP2] 4,364 14,520 Cursor Pin kkslce [KKSCHLPIN2] 2,396 2,498 Library Cache kglpndl1 95 903 475 Library Cache kglpin1 4 800 458 Library Cache kglpnal2 91 799 259 Library Cache kglget1 1 553 1,697 Library Cache kglpnal1 90 489 88 Library Cache kgllkdl1 85 481 1,528 Cursor Pin kksLockDelete [KKSCHLPIN6] 410 666 Cursor Stat kkocsStoreBindAwareStats [KKSSTA 346 497 Library Cache kglhdgn2 106 167 348 Library Cache kglhdgh1 64 26 84 Library Cache kgldtin1 42 19 55 Cursor Pin kksfbc [KKSCHLPIN1] 13 34 Library Cache kglhdgn1 62 11 13 Library Cache kgllkal1 80 9 12 Library Cache kgllkc1 57 6 0 Cursor Pin kksSetBindType [KKSCHLPIN3] 5 5 Library Cache kglGetHandleReference 124 4 20 Library Cache kglUpgradeLock 119 4 0 Library Cache kglget2 2 3 0 Library Cache kglati1 45 1 0 Library Cache kglini1 32 1 0 Library Cache kglobld1 75 1 0 Library Cache kglobpn1 71 1 0
Mutex是10.2.0.2以後引入的新的記憶體鎖機制,具體對Mutex的描述見 《深入理解Oracle中的Mutex》:http://www.askmaclean.com/archives/understanding-oracle-mutex.html
Mutex Type
Mutex的型別其實就是 mutex對應的客戶的名字, 在版本10.2中基本只有KKS使用Mutex,所以僅有3種:
- Cursor Stat (kgx_kks1)
- Cursor Parent (kgx_kks2)
- Cursor Pin (kgx_kks3)
11g中增加了Library Cache
Location 發起對該Mutex申請的程式碼路徑code location,而不是還持有該Mutex的程式碼路徑或曰核心函式
10.2中最常見的是下面幾個函式:
kkspsc0 – 負責解析遊標,檢測正在解析的遊標是否已有對應的parent cursor heap 0存在
kksfbc – 負責找到合適的子游標 或者建立一個新的子游標
kksFindCursorstat
Sleeps:
Mutex的Get和Sleep
當一個Mutex被申請時, 一般稱為一個get request。 若初始的申請未能得到授權, 則該程式會因為此次申請而進入到255次SPIN中(_mutex_spin_count Mutex spin count),每次SPIN迴圈迭代過程中該程式都會去看看Mutex被釋放了嗎。
若該Mutex在SPIN之後仍未被釋放,則該程式針對所申請的mutex進入對應的mutex wait等待事件中。 實際程式的等待事件和等待方式由mutex的型別所決定,例如 Cursor Pin、Cursor Parent。 舉例來說,這種等待可能是阻塞等待,也可能是sleep。
但是請注意在V$MUTEX_SLEEP_*檢視上的sleep列意味著等待的次數。相關程式碼函式在開始進入等待時自加這個sleep欄位。
等待計時從程式進入等待前開始計算,當一個程式結束其等待,則其等待時間被加入到總和total中。 該程式再次嘗試申請之前的Mutex,若該Mutex仍不可用,則它再次進入spin/wait的迴圈。
V$MUTEX_SLEEP_HISTORY檢視的GETS列僅在成功申請到一個Mutex時才增加。
Wait Time (ms) 類似於latch,spin time 不算做mutex的消耗時間,它只包含等待消耗的時間。
=====================================================================
11 segment statistics 段級統計
11-1 Segments by Logical Reads
Segments by Logical Reads DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Logical Reads: 2,021,476,421 -> Captured Segments account for 83.7% of Total Tablespace Subobject Obj. Logical Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW INDEX_TS MZ_PRODUCT_ATTRIBUTE INDEX 372,849,920 18.44 CONTENT_OW INDEX_TS MZ_PRODUCT__LS_PK INDEX 329,829,632 16.32 CONTENT_OW DATA_TS MZ_PRODUCT_ATTRIBUTE TABLE 218,419,008 10.80 CONTENT_OW PLAYLIST_A MZ_PLAYLIST_ARTIST TABLE 182,426,240 9.02 CONTENT_OW DATA_TS MZ_PRODUCT TABLE 108,597,376 5.37
owner : 資料段的所有者
Tablespace Name: 資料段所在表空間名
Object Name : 物件名
Subobject Name:子物件名,例如一個分割槽表的某個分割槽
obj Type: 物件型別 一般為TABLE /INDEX 或者分割槽或子分割槽
Logical Reads :該資料段上發生過的邏輯讀 , 單位為 塊數*次數
%Total : 佔總的邏輯讀的百分比 , (當前物件上發生過的邏輯讀/ Total DB 邏輯讀)
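段級統計也可以直接查詢v$segment_statistics(累計值),例如檢視邏輯讀最高的前10個段,示意如下:
select *
  from (select owner, object_name, object_type, value as logical_reads
          from v$segment_statistics
         where statistic_name = 'logical reads'
         order by value desc)
 where rownum <= 10;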
11-2 Segments by Physical Reads
Segments by Physical Reads DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Physical Reads: 56,839,035 -> Captured Segments account for 51.9% of Total Tablespace Subobject Obj. Physical Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW SONG_TS MZ_SONG TABLE 7,311,928 12.86 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 4,896,554 8.61 CONTENT_OW DATA_TS MZ_CONTENT_PROVIDER_ TABLE 3,099,387 5.45 CONTENT_OW DATA_TS MZ_PRODUCT_ATTRIBUTE TABLE 1,529,971 2.69 CONTENT_OW DATA_TS MZ_PUBLICATION TABLE 1,391,735 2.45
Physical Reads: 該資料段上發生過的物理讀 , 單位為 塊數*次數
%Total : 佔總的物理讀的百分比 , (當前物件上發生過的物理讀/ Total DB 物理讀)
11-3 Segments by Physical Read Requests
Segments by Physical Read Requests DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Physical Read Requests: 33,936,360 -> Captured Segments account for 45.5% of Total Tablespace Subobject Obj. Phys Read Owner Name Object Name Name Type Requests %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_CONTENT_PROVIDER_ TABLE 3,099,346 9.13 CONTENT_OW DATA_TS MZ_PRODUCT_ATTRIBUTE TABLE 1,529,950 4.51 CONTENT_OW DATA_TS MZ_PRODUCT TABLE 1,306,756 3.85 CONTENT_OW DATA_TS MZ_AUDIO_FILE TABLE 910,537 2.68 CONTENT_OW INDEX_TS MZ_PRODUCT_ATTRIBUTE INDEX 820,459 2.42
Phys Read Requests : 物理讀的申請次數
%Total : (該段上發生的物理讀的申請次數/ physical read IO requests)
11-4 Segments by UnOptimized Reads
Segments by UnOptimized Reads DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total UnOptimized Read Requests: 811,466 -> Captured Segments account for 58.5% of Total Tablespace Subobject Obj. UnOptimized Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_CONTENT_PROVIDER_ TABLE 103,580 12.76 CONTENT_OW SONG_TS MZ_SONG TABLE 56,946 7.02 CONTENT_OW DATA_TS MZ_IMAGE TABLE 47,017 5.79 CONTENT_OW DATA_TS MZ_PRODUCT_ATTRIBUTE TABLE 40,950 5.05 CONTENT_OW DATA_TS MZ_PRODUCT TABLE 30,406 3.75
UnOptimized Reads : UnOptimized Read Reqs = Physical Read Reqs – Optimized Read Reqs
Optimized Read Requests 是指由Exadata Smart Flash Cache滿足的物理讀申請次數(注意Exadata V2中的'Smart Flash Cache'與Database Smart Flash Cache雖然同名,但概念和用途並不相同)。能夠從smart flash cache得到滿足的讀取申請被認為是optimized,因為這些讀取要比普通從磁碟讀取快得多。
此外通過smart scan利用storage index的讀取也被認為是'optimized read requests',因為可以避免讀取不相關的資料。
當使用者沒有使用Exadata時,UnOptimized Read Reqs總是等於Physical Read Reqs
%Total : (該段上發生的物理讀的UnOptimized Read Reqs / ( physical read IO requests – physical read requests optimized ))
11-5 Segments by Optimized Reads
Segments by Optimized Reads DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Optimized Read Requests: 33,124,894 -> Captured Segments account for 45.2% of Total Tablespace Subobject Obj. Optimized Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_CONTENT_PROVIDER_ TABLE 2,995,766 9.04 CONTENT_OW DATA_TS MZ_PRODUCT_ATTRIBUTE TABLE 1,489,000 4.50 CONTENT_OW DATA_TS MZ_PRODUCT TABLE 1,276,350 3.85 CONTENT_OW DATA_TS MZ_AUDIO_FILE TABLE 890,775 2.69 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3 INDEX 816,067 2.46
關於optimized read上面已經解釋過了,這裡的單位是request次數
%Total : (該段上發生的物理讀的 Optimized Read Reqs/ physical read requests optimized )
11-6 Segments by Direct Physical Reads
Segments by Direct Physical Reads DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Direct Physical Reads: 14,118,552 -> Captured Segments account for 94.2% of Total Tablespace Subobject Obj. Direct Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW SONG_TS MZ_SONG TABLE 7,084,416 50.18 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 4,839,984 34.28 CONTENT_OW DATA_TS MZ_PUBLICATION TABLE 1,361,133 9.64 CONTENT_OW DATA_TS SYS_LOB0000203660C00 LOB 5,904 .04 CONTENT_OW DATA_TS SYS_LOB0000203733C00 LOB 1,656 .01
Direct reads 直接路徑物理讀,單位為 塊數*次數
%Total (該段上發生的direct path reads /Total physical reads direct )
11-7 Segments by Physical Writes
Segments by Physical Writes DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Physical Writes: 590,563 -> Captured Segments account for 38.3% of Total Tablespace Subobject Obj. Physical Owner Name Object Name Name Type Writes %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 23,595 4.00 CONTENT_OW DATA_TS MZ_PODCAST TABLE 19,834 3.36 CONTENT_OW INDEX_TS MZ_IMAGE_IX2 INDEX 16,345 2.77 SYS SYSAUX WRH$_ACTIVE_SESSION_ 1367_70520 TABLE 14,173 2.40 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3 INDEX 9,645 1.63
Physical Writes ,物理寫 單位為 塊數*次數
Total % (該段上發生的物理寫 /Total physical writes )
11-9 Segments by Physical Write Requests
Segments by Physical Write Requests DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Physical Write Requestss: 436,789 -> Captured Segments account for 43.1% of Total Tablespace Subobject Obj. Phys Write Owner Name Object Name Name Type Requests %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 22,581 5.17 CONTENT_OW DATA_TS MZ_PODCAST TABLE 19,797 4.53 CONTENT_OW INDEX_TS MZ_IMAGE_IX2 INDEX 14,529 3.33 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3 INDEX 9,434 2.16 CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 8,618 1.97
Phys Write Requests 物理寫的請求次數 ,單位為次數
%Total (該段上發生的物理寫請求次數 /physical write IO requests )
11-10 Segments by Direct Physical Writes
Segments by Direct Physical Writes DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Direct Physical Writes: 29,660 -> Captured Segments account for 18.3% of Total Tablespace Subobject Obj. Direct Owner Name Object Name Name Type Writes %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- SYS SYSAUX WRH$_ACTIVE_SESSION_ 1367_70520 TABLE 4,601 15.51 CONTENT_OW DATA_TS SYS_LOB0000203733C00 LOB 620 2.09 CONTENT_OW DATA_TS SYS_LOB0000203660C00 LOB 134 .45 CONTENT_OW DATA_TS SYS_LOB0000203779C00 LOB 46 .16 CONTENT_OW DATA_TS SYS_LOB0000203796C00 LOB 41 .14
Direct Writes 直接路徑寫, 單位為塊數*次數
%Total 為(該段上發生的直接路徑寫 /physical writes direct )
11-11 Segments by Table Scans
Segments by Table Scans DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Table Scans: 10,713 -> Captured Segments account for 1.0% of Total Tablespace Subobject Obj. Table Owner Name Object Name Name Type Scans %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_PUBLICATION TABLE 92 .86 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 14 .13 CONTENT_OW SONG_TS MZ_SONG TABLE 3 .03 CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 1 .01
Table Scans 來源為dba_hist_seg_stat.table_scans_delta 不過這個指標並不十分精確
11-12 Segments by DB Blocks Changes
Segments by DB Blocks Changes DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> % of Capture shows % of DB Block Changes for each top segment compared -> with total DB Block Changes for all segments captured by the Snapshot Tablespace Subobject Obj. DB Block % of Owner Name Object Name Name Type Changes Capture ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX8 INDEX 347,856 10.21 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3A INDEX 269,504 7.91 CONTENT_OW INDEX_TS MZ_AM_REQUEST_PK INDEX 251,904 7.39 CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 201,056 5.90 CONTENT_OW INDEX_TS MZ_PRODUCT_ATTRIBUTE INDEX 199,888 5.86
DB Block Changes ,單位為塊數*次數
%Total : (該段上發生block changes / db block changes )
11-13 Segments by Row Lock Waits
Segments by Row Lock Waits DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> % of Capture shows % of row lock waits for each top segment compared -> with total row lock waits for all segments captured by the Snapshot Row Tablespace Subobject Obj. Lock % of Owner Name Object Name Name Type Waits Capture ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW LOB_8K_TS MZ_ASSET_WORK_EVENT_ INDEX 72,005 43.86 CONTENT_OW LOB_8K_TS MZ_CS_WORK_NOTE_RE_I _2013_1_36 INDEX 13,795 8.40 CONTENT_OW LOB_8K_TS MZ_CS_WORK_INFO_PART _2013_5_35 INDEX 12,383 7.54 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3A INDEX 8,937 5.44 CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 8,531 5.20
Row Lock Waits 是指行鎖的等待次數 資料來源於 dba_hist_seg_stat.ROW_LOCK_WAITS_DELTA
11-14 Segments by ITL WAITS
Segments by ITL Waits DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> % of Capture shows % of ITL waits for each top segment compared -> with total ITL waits for all segments captured by the Snapshot Tablespace Subobject Obj. ITL % of Owner Name Object Name Name Type Waits Capture ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW LOB_8K_TS MZ_ASSET_WORK_EVENT_ INDEX 95 30.16 CONTENT_OW LOB_8K_TS MZ_CS_WORK_NOTE_RE_I _2013_1_36 INDEX 48 15.24 CONTENT_OW LOB_8K_TS MZ_CS_WORK_INFO_PART _2013_5_35 INDEX 21 6.67 CONTENT_OW INDEX_TS MZ_SALABLE_FIRST_AVA INDEX 21 6.67 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 20 6.35
關於 ITL的介紹詳見: http://www.askmaclean.com/archives/enqueue-tx-row-lock-index-itl-wait-event.html
ITL Waits 等待 ITL 的次數,資料來源為 dba_hist_seg_stat.itl_waits_delta
11-15 Segments by Buffer Busy Waits
Segments by Buffer Busy Waits DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> % of Capture shows % of Buffer Busy Waits for each top segment compared -> with total Buffer Busy Waits for all segments captured by the Snapshot Buffer Tablespace Subobject Obj. Busy % of Owner Name Object Name Name Type Waits Capture ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW LOB_8K_TS MZ_ASSET_WORK_EVENT_ INDEX 251,073 57.07 CONTENT_OW LOB_8K_TS MZ_CS_WORK_NOTE_RE_I _2013_1_36 INDEX 36,186 8.23 CONTENT_OW LOB_8K_TS MZ_CS_WORK_INFO_PART _2013_5_35 INDEX 31,786 7.23 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3A INDEX 15,663 3.56 CONTENT_OW INDEX_TS MZ_CS_WORK_PENDING_R INDEX 11,087 2.52
Buffer Busy Waits 該資料段上發生 buffer busy wait的次數 資料來源 dba_hist_seg_stat.buffer_busy_waits_delta
11-16 Segments by Global Cache Buffer Busy
Segments by Global Cache Buffer BusyDB/Inst: MAC/MAC2 Snaps: 70719-7072 -> % of Capture shows % of GC Buffer Busy for each top segment compared -> with GC Buffer Busy for all segments captured by the Snapshot GC Tablespace Subobject Obj. Buffer % of Owner Name Object Name Name Type Busy Capture ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3 INDEX 2,135,528 50.07 CONTENT_OW DATA_TS MZ_CONTENT_PROVIDER_ TABLE 652,900 15.31 CONTENT_OW LOB_8K_TS MZ_ASSET_WORK_EVENT_ INDEX 552,161 12.95 CONTENT_OW LOB_8K_TS MZ_CS_WORK_NOTE_RE_I _2013_1_36 INDEX 113,042 2.65 CONTENT_OW LOB_8K_TS MZ_CS_WORK_INFO_PART _2013_5_35 INDEX 98,134 2.30
GC Buffer Busy 資料段上發生gc buffer busy的次數, 資料來源 dba_hist_seg_stat.gc_buffer_busy_delta
11-17 Segments by CR Blocks Received
Segments by CR Blocks Received DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total CR Blocks Received: 763,037 -> Captured Segments account for 40.9% of Total CR Tablespace Subobject Obj. Blocks Owner Name Object Name Name Type Received %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 69,100 9.06 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 44,491 5.83 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3A INDEX 36,830 4.83 CONTENT_OW DATA_TS MZ_PODCAST TABLE 36,632 4.80 CONTENT_OW INDEX_TS MZ_AM_REQUEST_PK INDEX 19,646 2.57
CR Blocks Received :是指RAC中本地節點接收到global cache CR blocks 的數量; 資料來源為 dba_hist_seg_stat.gc_cr_blocks_received_delta
%Total : (該段上在本節點接收的Global CR blocks / gc cr blocks received )
11-18 Segments by Current Blocks Received
Segments by Current Blocks ReceivedDB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Total Current Blocks Received: 704,731 -> Captured Segments account for 61.8% of Total Current Tablespace Subobject Obj. Blocks Owner Name Object Name Name Type Received %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3 INDEX 56,287 7.99 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX3A INDEX 45,139 6.41 CONTENT_OW DATA_TS MZ_AM_REQUEST TABLE 40,350 5.73 CONTENT_OW DATA_TS MZ_CS_WORK_PENDING_R TABLE 22,808 3.24 CONTENT_OW INDEX_TS MZ_AM_REQUEST_IX8 INDEX 13,343 1.89
Current Blocks Received :是指RAC中本地節點接收到global cache Current blocks 的數量 ,資料來源DBA_HIST_SEG_STAT.gc_cu_blocks_received_delta
%Total : (該段上在本節點接收的 global cache current blocks / gc current blocks received)
12 Dictionary Cache Stats
Dictionary Cache Stats DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> "Pct Misses" should be very low (< 2% in most cases) -> "Final Usage" is the number of cache entries being used Get Pct Scan Pct Mod Final Cache Requests Miss Reqs Miss Reqs Usage ------------------------- ------------ ------ ------- ----- -------- ---------- dc_awr_control 87 2.3 0 N/A 6 1 dc_global_oids 1,134 7.8 0 N/A 0 13 dc_histogram_data 6,119,027 0.9 0 N/A 0 11,784 dc_histogram_defs 1,898,714 2.3 0 N/A 0 5,462 dc_object_grants 175 26.9 0 N/A 0 4 dc_objects 10,254,514 0.2 0 N/A 0 3,807 dc_profiles 8,452 0.0 0 N/A 0 2 dc_rollback_segments 3,031,044 0.0 0 N/A 0 1,947 dc_segments 1,812,243 1.4 0 N/A 10 3,595 dc_sequences 15,783 69.6 0 N/A 15,782 20 dc_table_scns 70 2.9 0 N/A 0 1 dc_tablespaces 1,628,112 0.0 0 N/A 0 37 dc_users 2,037,138 0.0 0 N/A 0 52 global database name 7,698 0.0 0 N/A 0 1 outstanding_alerts 264 99.6 0 N/A 8 1 sch_lj_oids 51 7.8 0 N/A 0 1
Dictionary Cache 字典快取也叫row cache
資料來源為dba_hist_rowcache_summary
Cache 字典快取類名kqrstcid <=> kqrsttxt cid=3(dc_rollback_segments)
Get Requests 申請獲取該資料字典快取物件的次數 gets
Miss : GETMISSES 申請獲取該資料字典快取物件但 miss的次數
Pct Miss : GETMISSES /Gets , Miss的比例 ,這個pct miss應當非常低 小於2%,否則有出現大量row cache lock的可能
Scan Reqs:掃描申請的次數 ,kqrssc 、kqrpScan 、kqrpsiv時發生scan 會導致掃描數增加 kqrstsrq++(scan requests) ,例如migrate tablespace 時呼叫 kttm2b函式 為了安全刪除uet$中的記錄會callback kqrpsiv (used extent cache),實際很少見
Pct Miss:SCANMISSES/SCANS
Mod Reqs: 申請修改字典快取物件的次數,從上面的資料可以看到dc_sequences的mod reqs很高,這是因為sequence是變化較多的字典物件
Final Usage :包含有效資料的字典快取記錄的總數,也就是正在被使用的row cache記錄 USAGE Number of cache entries that contain valid data
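對應的即時資料可以從v$rowcache觀察,示意查詢如下:
select parameter, gets, getmisses,
       round(getmisses / nullif(gets, 0) * 100, 2) as pct_miss,
       modifications, usage
  from v$rowcache
 where gets > 0
 order by gets desc;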
Dictionary Cache Stats (RAC) DB/Inst: MAC/MAC2 Snaps: 70719-70723 GES GES GES Cache Requests Conflicts Releases ------------------------- ------------ ------------ ------------ dc_awr_control 14 2 0 dc_global_oids 88 0 102 dc_histogram_defs 43,518 0 43,521 dc_objects 21,608 17 21,176 dc_profiles 1 0 1 dc_segments 24,974 14 24,428 dc_sequences 25,178 10,644 347 dc_table_scns 2 0 2 dc_tablespaces 165 0 166 dc_users 119 0 119 outstanding_alerts 478 8 250 sch_lj_oids 4 0 4
GES Request kqrstilr total instance lock requests ,通過全域性佇列服務GES 來申請instance lock的次數
GES request 申請的原因可能是 dump cache object、kqrbfr LCK程式要background free some parent objects釋放一些parent objects 等
GES Conflicts kqrstifr instance lock forced-releases , LCK程式以AST方式 釋放鎖的次數 ,僅出現在kqrbrl中
GES Releases kqrstisr instance lock self-releases ,LCK程式要background free some parent objects釋放一些parent objects 時可能自增
上述資料中可以看到僅有dc_sequences 對應的GES Conflicts較多, 對於sequence 使用ordered和non-cache選項會導致RAC中的一個邊際效應,即”row cache lock”等待源於DC_SEQUENCES ROW CACHE。 DC_SEQUENCES 上的GETS request、modifications 、GES requests和GES conflict 與引發生成一個新的 sequence number的特定SQL執行頻率相關。
在Oracle 10g中,ORDERED Sequence還可能在高併發下造成大量DFS lock Handle 等待,由於bug 5209859
13 Library Cache Activity
Library Cache Activity DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> "Pct Misses" should be very low Get Pct Pin Pct Invali- Namespace Requests Miss Requests Miss Reloads dations --------------- ------------ ------ -------------- ------ ---------- -------- ACCOUNT_STATUS 8,436 0.3 0 N/A 0 0 BODY 8,697 0.7 15,537 0.7 49 0 CLUSTER 317 4.7 321 4.7 0 0 DBLINK 9,212 0.1 0 N/A 0 0 EDITION 4,431 0.0 8,660 0.0 0 0 HINTSET OBJECT 1,027 9.5 1,027 14.4 0 0 INDEX 792 18.2 792 18.2 0 0 QUEUE 10 0.0 1,733 0.0 0 0 RULESET 0 N/A 8 87.5 7 0 SCHEMA 8,169 0.0 0 N/A 0 0 SQL AREA 533,409 4.8 -4,246,727,944 101.1 44,864 576 SQL AREA BUILD 71,500 65.5 0 N/A 0 0 SQL AREA STATS 41,008 90.3 41,008 90.3 1 0 TABLE/PROCEDURE 320,310 0.6 1,033,991 3.6 25,378 0 TRIGGER 847 0.0 38,442 0.3 110 0
NameSpace library cache 的名稱空間
GETS Requests 該名稱空間所包含物件的library cache lock被申請的次數
GETHITS 物件的 library cache handle 正好在記憶體中被找到的次數
Pct Misses : ( 1- ( GETHITS /GETS Requests)) *100
Pin Requests 該名稱空間所包含物件上pin被申請的次數
PINHITS 要pin的物件的heap metadata正好在shared pool中的次數
Pct Miss ( 1- ( PINHITS /Pin Requests)) *100
Reloads 指在object handle建立之後,並非首次PIN該物件、且該次PIN需要將物件從磁碟重新載入的次數;Reloads值較高時建議增大shared_pool_size
INVALIDATIONS 由於依賴的物件被修改,導致該名稱空間所包含物件被標記為無效的次數
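對應的即時資料在v$librarycache中,示意查詢如下:
select namespace, gets, gethitratio, pins, pinhitratio,
       reloads, invalidations
  from v$librarycache
 order by pins desc;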
Library Cache Activity (RAC) DB/Inst: MAC/MAC2 Snaps: 70719-70723 GES Lock GES Pin GES Pin GES Inval GES Invali- Namespace Requests Requests Releases Requests dations --------------- ------------ ------------ ------------ ----------- ----------- ACCOUNT_STATUS 8,436 0 0 0 0 BODY 0 15,497 15,497 0 0 CLUSTER 321 321 321 0 0 DBLINK 9,212 0 0 0 0 EDITION 4,431 4,431 4,431 0 0 HINTSET OBJECT 1,027 1,027 1,027 0 0 INDEX 792 792 792 0 0 QUEUE 8 1,733 1,733 0 0 RULESET 0 8 8 0 0 SCHEMA 4,226 0 0 0 0 TABLE/PROCEDURE 373,163 704,816 704,816 0 0 TRIGGER 0 38,430 38,430 0 0
GES Lock Request: dlm_lock_requests Lock instance-lock ReQuests 申請獲得lock instance lock的次數
GES PIN request : DLM_PIN_REQUESTS Pin instance-lock ReQuests 申請獲得pin instance lock的次數
GES Pin Releases DLM_PIN_RELEASES release the pin instance lock 釋放pin instance lock的次數
GES Inval Requests DLM_INVALIDATION_REQUESTS get the invalidation instance lock 申請獲得invalidation instance lock的次數
GES Invalidations DLM_INVALIDATIONS 接收到其他節點的invalidation pings次數
14 Process Memory Summary
Process Memory Summary DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> B: Begin Snap E: End Snap -> All rows below contain absolute values (i.e. not diffed over the interval) -> Max Alloc is Maximum PGA Allocation size at snapshot time -> Hist Max Alloc is the Historical Max Allocation for still-connected processes -> ordered by Begin/End snapshot, Alloc (MB) desc Hist Avg Std Dev Max Max Alloc Used Alloc Alloc Alloc Alloc Num Num Category (MB) (MB) (MB) (MB) (MB) (MB) Proc Alloc - -------- --------- --------- -------- -------- ------- ------- ------ ------ B Other 16,062.7 N/A 6.1 66.6 3,370 3,370 2,612 2,612 SQL 5,412.2 4,462.9 2.2 89.5 4,483 4,483 2,508 2,498 Freeable 2,116.4 .0 .9 6.3 298 N/A 2,266 2,266 PL/SQL 94.0 69.8 .0 .0 1 1 2,610 2,609 E Other 15,977.3 N/A 6.1 66.9 3,387 3,387 2,616 2,616 SQL 5,447.9 4,519.0 2.2 89.8 4,505 4,505 2,514 2,503 Freeable 2,119.9 .0 .9 6.3 297 N/A 2,273 2,273 PL/SQL 93.2 69.2 .0 .0 1 1 2,614 2,613
資料來源為dba_hist_process_mem_summary, 這裡是對PGA 使用的一個小結,幫助我們瞭解到底誰用掉了PGA
B: 開始快照 E: 結束快照
該環節列出 PGA中各分類的使用量
Category 分類名,包括"SQL"、"PL/SQL"、"OLAP"和"JAVA",特殊分類是"Freeable"和"Other"。 Freeable是指OS已經分配給程式、但尚未歸屬到任何具名分類的記憶體;"Other"則是已分配、但不屬於上述具名分類的記憶體
Alloc (MB) allocated_total 該分類被分配的總記憶體
Used (MB) used_total 該分類已使用的記憶體
Avg Alloc (MB) allocated_avg 平均每個程式中該分類分配的記憶體量
Std Dev Alloc (MB) :該分類分配的記憶體在每個程式之間的標準差
Max Alloc (MB) ALLOCATED_MAX :在快照時間內單個程式該分類最大分配過的記憶體量:Max Alloc is Maximum PGA Allocation size at snapshot time
Hist Max Alloc (MB) MAX_ALLOCATED_MAX: 目前仍連結著的程式該分類最大分配過的記憶體量:Hist Max Alloc is the Historical Max Allocation for still-connected processes
Num Proc num_processes 程式數目
Num Alloc NON_ZERO_ALLOCS 分配了該型別 記憶體的程式數目
15 SGA資訊
15-1 SGA Memory Summary
SGA Memory Summary DB/Inst: MAC/MAC2 Snaps: 70719-70723 End Size (Bytes) SGA regions Begin Size (Bytes) (if different) ------------------------------ ------------------- ------------------- Database Buffers 20,669,530,112 Fixed Size 2,241,880 Redo Buffers 125,669,376 Variable Size 10,536,094,376 ------------------- sum 31,333,535,744
粗粒度的SGA區域記憶體使用資訊, End Size僅在與Begin Size不同時列印
15-2 SGA breakdown difference
SGA breakdown difference DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> ordered by Pool, Name -> N/A value for Begin MB or End MB indicates the size of that Pool/Name was insignificant, or zero in that snapshot Pool Name Begin MB End MB % Diff ------ ------------------------------ -------------- -------------- ------- java free memory 64.0 64.0 0.00 large PX msg pool 7.8 7.8 0.00 large free memory 247.8 247.8 0.00 shared Checkpoint queue 140.6 140.6 0.00 shared FileOpenBlock 2,459.2 2,459.2 0.00 shared KGH: NO ACCESS 1,629.6 1,629.6 0.00 shared KGLH0 997.7 990.5 -0.71 shared KKSSP 312.2 308.9 -1.06 shared SQLA 376.6 370.6 -1.61 shared db_block_hash_buckets 178.0 178.0 0.00 shared dbktb: trace buffer 156.3 156.3 0.00 shared event statistics per sess 187.1 187.1 0.00 shared free memory 1,208.9 1,220.6 0.97 shared gcs resources 435.0 435.0 0.00 shared gcs shadows 320.6 320.6 0.00 shared ges enqueues 228.9 228.9 0.00 shared ges resource 118.3 118.3 0.00 shared init_heap_kfsg 1,063.6 1,068.1 0.43 shared kglsim object batch 124.3 124.3 0.00 shared ksunfy : SSO free list 174.7 174.7 0.00 stream free memory 128.0 128.0 0.00 buffer_cache 19,712.0 19,712.0 0.00 fixed_sga 2.1 2.1 0.00 log_buffer 119.8 119.8 0.00 -------------------------------------------------------------
Pool 記憶體池的名字
Name 記憶體池中細分元件的名字,例如KGLH0存放KGL Heap 0、SQLA存放SQL執行計劃等
Begin MB 快照開始時該元件的記憶體大小
End MB 快照結束時該元件的記憶體大小
% Diff 差異百分比
特別注意:由於AMM/ASMM引起的shared pool收縮一般在SGA breakdown中可以體現,例如SQLA、KQR等元件大幅縮小,可能導致一系列的解析等待,如cursor: pin S wait on X、row cache lock等
此處的free memory資訊也值得我們關注, 一般推薦shared pool應當有300~400 MB 的free memory為宜
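可以用下面的示意查詢檢視當前各記憶體池的free memory:
select pool, name, round(bytes / 1024 / 1024, 1) as mb
  from v$sgastat
 where name = 'free memory'
 order by bytes desc;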
16 Streams統計
Streams CPU/IO Usage DB/Inst: ORCL/orcl1 Snaps: 556-559 -> Streams processes ordered by CPU usage -> CPU and I/O Time in micro seconds Session Type CPU Time User I/O Time Sys I/O Time ------------------------- -------------- -------------- -------------- QMON Coordinator 101,698 0 0 QMON Slaves 63,856 0 0 ------------------------------------------------------------- Streams Capture DB/Inst: CATGT/catgt Snaps: 911-912 -> Lag Change should be small or negative (in seconds) Captured Enqueued Pct Pct Pct Pct Per Per Lag RuleEval Enqueue RedoWait Pause Capture Name Second Second Change Time Time Time Time ------------ -------- -------- -------- -------- -------- -------- -------- CAPTURE_CAT 650 391 93 0 23 0 71 ------------------------------------------------------------- Streams Apply DB/Inst: CATGT/catgt Snaps: 911-912 -> Pct DB is the percentage of all DB transactions that this apply handled -> WDEP is the wait for dependency -> WCMT is the wait for commit -> RBK is rollbacks -> MPS is messages per second -> TPM is time per message in milli-seconds -> Lag Change should be small or negative (in seconds) Applied Pct Pct Pct Pct Applied Dequeue Apply Lag Apply Name TPS DB WDEP WCMT RBK MPS TPM TPM Change ------------ -------- ---- ---- ---- --- -------- -------- -------- -------- APPLY_CAT 0 0 0 0 0 0 0 0 0 -------------------------------------------------------------
Capture Name : Streams捕獲程式名
Captured Per Second :每秒挖掘出來的message 條數
Enqueued Per Second: 每秒入隊的message條數
lag change: 指日誌生成時間到挖掘出該日誌對應message的時間延遲的變化量,該值應當較小或為負,單位為秒
Pct Enqueue Time: 入隊時間的比例
Pct redoWait Time : 等待redo的時間比例
Pct Pause Time : Pause 時間的比例
Apply Name Streams 應用Apply程式的名字
Applied TPS : 每秒應用的事務數
Pct DB: 所有的DB事務中 apply處理的比例
Pct WDEP: 由於等待依賴的資料而耗費的時間比例
Pct WCMT: 由於等待commit而耗費的時間比例
Pct RBK: 事務rollback 回滾的比例
Applied MPS: 每秒應用的message 數
Dequeue TPM: 平均每條message出隊的耗時,單位為毫秒(TPM = time per message in milli-seconds)
Lag Change:指最新message生成的時間到其被Apply收到的延遲
17 Resource Limit
Resource Limit Stats DB/Inst: MAC/MAC2 Snap: 70723 -> only rows with Current or Maximum Utilization > 80% of Limit are shown -> ordered by resource name Current Maximum Initial Resource Name Utilization Utilization Allocation Limit ------------------------------ ------------ ------------ ---------- ---------- ges_procs 2,612 8,007 10003 10003 processes 2,615 8,011 10000 10000
資料來源於dba_hist_resource_limit
注意這裡僅列出當前使用或最大使用量>80% *最大限制的資源名,如果沒有列在這裡則說明 資源使用量安全
Current Utilization 當前對該資源(包括Enqueue Resource、Lock和processes)的使用量
Maximum Utilization 從最近一次例項啟動到現在該資源的最大使用量
Initial Allocation 初始分配值,一般等於引數檔案中指定的值
Limit 實際上限值
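即時的資源使用情況可以查詢v$resource_limit,示意如下:
select resource_name, current_utilization, max_utilization,
       initial_allocation, limit_value
  from v$resource_limit
 order by resource_name;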
18 init.ora Parameters
init.ora Parameters DB/Inst: MAC/MAC2 Snaps: 70719-70723 End value Parameter Name Begin value (if different) ----------------------------- --------------------------------- -------------- _compression_compatibility 11.2.0 _kghdsidx_count 4 _ksmg_granule_size 67108864 _shared_pool_reserved_min_all 4100 archive_lag_target 900 audit_file_dest /u01/app/oracle/admin/MAC/adum audit_trail OS cluster_database TRUE compatible 11.2.0.2.0 control_files +DATA/MAC/control01.ctl, +RECO db_16k_cache_size 268435456 db_block_size 8192 db_cache_size 19327352832 db_create_file_dest +DATA
Parameter Name 引數名
Begin value 開始快照時的引數值
End value 結束快照時的引數值 (僅在發生變化時列印)
19 Global Messaging Statistics
Global Messaging Statistics DB/Inst: MAC/MAC2 Snaps: 70719-70723 Statistic Total per Second per Trans --------------------------------- ---------------- ------------ ------------ acks for commit broadcast(actual) 53,705 14.9 0.2 acks for commit broadcast(logical 311,182 86.1 1.3 broadcast msgs on commit(actual) 317,082 87.7 1.3 broadcast msgs on commit(logical) 317,082 87.7 1.3 broadcast msgs on commit(wasted) 263,332 72.9 1.1 dynamically allocated gcs resourc 0 0.0 0.0 dynamically allocated gcs shadows 0 0.0 0.0 flow control messages received 267 0.1 0.0 flow control messages sent 127 0.0 0.0 gcs apply delta 0 0.0 0.0 gcs assume cvt 55,541 15.4 0.2
全域性通訊統計資訊,資料來源WRH$_DLM_MISC;
20 Global CR Served Stats
Global CR Served Stats DB/Inst: MAC/MAC2 Snaps: 70719-70723 Statistic Total ------------------------------ ------------------ CR Block Requests 403,703 CURRENT Block Requests 444,896 Data Block Requests 403,705 Undo Block Requests 94,336 TX Block Requests 307,896 Current Results 652,746 Private results 21,057 Zero Results 104,720 Disk Read Results 69,418 Fail Results 508 Fairness Down Converts 102,844 Fairness Clears 15,207 Free GC Elements 0 Flushes 105,052 Flushes Queued 0 Flush Queue Full 0 Flush Max Time (us) 0 Light Works 71,793 Errors 117
LMS傳輸CR BLOCK的統計資訊,資料來源WRH$_CR_BLOCK_SERVER
21 Global CURRENT Served Stats
Global CURRENT Served Stats DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Pins = CURRENT Block Pin Operations -> Flushes = Redo Flush before CURRENT Block Served Operations -> Writes = CURRENT Block Fusion Write Operations Statistic Total % <1ms % <10ms % <100ms % <1s % <10s ---------- ------------ -------- -------- -------- -------- -------- Pins 73,018 12.27 75.96 8.49 2.21 1.08 Flushes 79,336 5.98 50.17 14.45 19.45 9.95 Writes 102,189 3.14 35.23 19.34 33.26 9.03
資料來源dba_hist_current_block_server
Time to process current block request = (pin time + flush time + send time)
Pins CURRENT Block Pin Operations , PIN的內涵是處理一個BAST 不包含對global current block的flush和實際傳輸
The pin time represents how much time is required to process a BAST. It does not include the flush time and the send time. The average pin time per block served should be very low because the processing consists mainly of code path and should never be blocked.
Flush 指 髒塊被LMS程式傳輸出去之前,其相關的redo必須由LGWR已經flush 到磁碟上
Write 指fusion write(number of writes which were mediated)的次數,即由GCS在節點間協調完成的髒塊寫入;相關統計包括 KJBL.KJBLREQWRITE、gcs write request msgs、gcs writes refused
% <1ms % <10ms % <100ms % <1s % <10s 分別對應為pin、flush、write行為耗時的比例
例如在上例中flush和 write 在1s 到10s之間的有9%,在100ms 和1s之間的有19%和33%,因為flush和write都是IO操作 所以這裡可以預見IO存在問題,延遲較高
22 Global Cache Transfer Stats
Global Cache Transfer Stats DB/Inst: MAC/MAC2 Snaps: 70719-70723 -> Immediate (Immed) - Block Transfer NOT impacted by Remote Processing Delays -> Busy (Busy) - Block Transfer impacted by Remote Contention -> Congested (Congst) - Block Transfer impacted by Remote System Load -> ordered by CR + Current Blocks Received desc CR Current ----------------------------- ----------------------------- Inst Block Blocks % % % Blocks % % % No Class Received Immed Busy Congst Received Immed Busy Congst ---- ----------- -------- ------ ------ ------ -------- ------ ------ ------ 1 data block 133,187 76.3 22.6 1.1 233,138 75.2 23.0 1.7 4 data block 143,165 74.1 24.9 1.0 213,204 76.6 21.8 1.6 3 data block 122,761 75.9 23.0 1.1 220,023 77.7 21.0 1.3 1 undo header 104,219 95.7 3.2 1.1 941 93.4 5.8 .7 4 undo header 95,823 95.2 3.7 1.1 809 93.4 5.3 1.2 3 undo header 95,592 95.6 3.3 1.1 912 94.6 4.5 .9 1 undo block 25,002 95.8 3.4 .9 0 N/A N/A N/A 4 undo block 23,303 96.0 3.1 .9 0 N/A N/A N/A 3 undo block 21,672 95.4 3.7 .9 0 N/A N/A N/A 1 Others 1,909 92.0 6.8 1.2 6,057 89.6 8.9 1.5 4 Others 1,736 92.4 6.1 1.5 5,841 88.8 9.9 1.3 3 Others 1,500 92.4 5.9 1.7 4,405 87.7 10.8 1.6
The data come from DBA_HIST_INST_CACHE_TRANSFER.
Inst No: instance number of the remote instance the blocks came from.
Block Class: class of the block (data block, undo header, undo block, others).
CR Blocks Received: number of CR blocks of this class received from that instance.
CR Immed %: percentage of CR block requests that were served immediately.
CR Busy %: percentage of CR block requests delayed by contention on the remote instance.
CR Congst %: percentage of CR block requests delayed because the remote instance was overloaded.
Current Blocks Received: number of current blocks of this class received from that instance.
Current Immed %: percentage of current block requests that were served immediately.
Current Busy %: percentage of current block requests delayed by contention on the remote instance.
Current Congst %: percentage of current block requests delayed because the remote instance was overloaded.
Congst % should be very low, no more than about 2%. Busy % is largely driven by I/O; once it exceeds 10% you will usually also see severe gc buffer busy acquire/release waits.
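When Busy % is high, the same pressure normally shows up as gc buffer busy waits. A quick cumulative cross-check (using the 11g event names; in 10g there is a single gc buffer busy event):

-- Average wait time of the gc buffer busy events per instance (cumulative since startup).
SELECT inst_id, event, total_waits,
       ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 2) AS avg_wait_ms
  FROM gv$system_event
 WHERE event IN ('gc buffer busy acquire', 'gc buffer busy release')
 ORDER BY inst_id, event;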
Supplement: the RAC-related metrics below were compiled by tong.wang@parnassusdata.com
RAC-related metrics
Global Cache Load Profile
Per Second | Per Transaction | |
Global Cache blocks received: | 12.06 | 2.23 |
Global Cache blocks served: | 8.18 | 1.51 |
GCS/GES messages received: | 391.19 | 72.37 |
GCS/GES messages sent: | 368.76 | 68.22 |
DBWR Fusion writes: | 0.10 | 0.02 |
Estd Interconnect traffic (KB) | 310.31 |
Metric | Description |
Global Cache blocks received | Number of data blocks received from remote instances over the interconnect. This happens when a process needs a consistent read of a block that is not in the local cache: Oracle sends a request to another instance, and the counter is incremented once the buffer arrives. It is the sum of two statistics: Global Cache blocks received = gc current blocks received + gc cr blocks received |
Global Cache blocks served | Number of data blocks sent to remote instances over the interconnect. It is the sum of two statistics: Global Cache blocks served = gc current blocks served + gc cr blocks served |
GCS/GES messages received | Number of Global Cache Service and Global Enqueue Service messages received from remote instances over the interconnect. It largely represents the overhead of running RAC. It is the sum of two statistics: GCS/GES messages received = gcs msgs received + ges msgs received |
GCS/GES messages sent | Number of GCS and GES messages sent to remote instances over the interconnect. It largely represents the overhead of running RAC. It is the sum of two statistics: GCS/GES messages sent = gcs messages sent + ges messages sent |
DBWR Fusion writes | Number of fusion writes. In RAC, as in single instance, blocks are written to disk only because of aging, buffer replacement or checkpoints. When a block has to be replaced or checkpointed but was changed by another instance that has not yet written it to disk, the Global Cache Service asks that instance to write the block; such writes are fusion writes. They do not require an extra disk write, they are simply a subset of the instance's physical writes. A consistently large number of fusion writes indicates a problem; the ratio of fusion write requests to total write requests is used in performance analysis, and a high ratio suggests an undersized buffer cache or inefficient checkpointing. |
Estd Interconnect traffic (KB) | Estimated interconnect traffic in KB per second, calculated as: Estd Interconnect traffic (KB) = ((('gc cr blocks received' + 'gc current blocks received' + 'gc cr blocks served' + 'gc current blocks served') * Block size) + (('gcs messages sent' + 'ges messages sent' + 'gcs msgs received' + 'ges msgs received') * 200)) / 1024 / Elapsed Time (see the worked example right after this table). |
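Plugging this report's own per-second figures into the formula (assuming an 8 KB block size; because per-second rates are used here, the division by Elapsed Time is already built in) reproduces the 310.31 KB shown above:

-- (blocks received + served) * block size  +  (messages received + sent) * ~200 bytes
SELECT ROUND(((12.06 + 8.18) * 8192 +
              (391.19 + 368.76) * 200) / 1024, 2) AS estd_interconnect_kb_per_sec
  FROM dual;                                       -- ~310.35 KB/s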
Global Cache Efficiency Percentages (Target local+remote 100%)
Buffer access – local cache %: | 91.05 |
Buffer access – remote cache %: | 0.03 |
Buffer access – disk %: | 8.92 |
Metric | Description |
Buffer access – local cache % | Percentage of block requests satisfied from the local buffer cache. In OLTP workloads this should be kept as high as possible, because it is the cheapest and fastest way to obtain a block. Formula: Local Cache Buffer Access Ratio = 1 - (physical reads cache + Global Cache blocks received) / Logical Reads |
Buffer access – remote cache % | Percentage of block requests satisfied from a remote instance's buffer cache. In OLTP workloads the sum of this ratio and Buffer access – local cache % should be as high as possible, because these two paths are the fastest and cheapest ways of getting a block. Formula: Remote Cache Buffer Access Ratio = Global Cache blocks received / Logical Reads |
Buffer access – disk % | Percentage of block requests that had to be read from disk into the cache. In OLTP workloads this should be kept low, because a physical read is the slowest way of getting a block. Formula: Disk Buffer Access Ratio = physical reads cache / Logical Reads (a query sketch follows this table). |
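A minimal sketch of the three ratios computed over cumulative GV$SYSSTAT values, taking 'session logical reads' as the Logical Reads figure (AWR uses the delta between two snapshots, but the formulas are the same):

SELECT inst_id,
       ROUND(100 * (1 - (phy_rd_cache + gc_recv) / NULLIF(log_reads, 0)), 2) AS local_cache_pct,
       ROUND(100 * gc_recv / NULLIF(log_reads, 0), 2)                        AS remote_cache_pct,
       ROUND(100 * phy_rd_cache / NULLIF(log_reads, 0), 2)                   AS disk_pct
  FROM (SELECT inst_id,
               SUM(CASE WHEN name = 'physical reads cache' THEN value END)          AS phy_rd_cache,
               SUM(CASE WHEN name IN ('gc cr blocks received',
                                      'gc current blocks received') THEN value END) AS gc_recv,
               SUM(CASE WHEN name = 'session logical reads' THEN value END)         AS log_reads
          FROM gv$sysstat
         WHERE name IN ('physical reads cache',
                        'gc cr blocks received', 'gc current blocks received',
                        'session logical reads')
         GROUP BY inst_id);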
Global Cache and Enqueue Services – Workload Characteristics
Avg global enqueue get time (ms): | 0.0 |
Avg global cache cr block receive time (ms): | 0.3 |
Avg global cache current block receive time (ms): | 0.2 |
Avg global cache cr block build time (ms): | 0.0 |
Avg global cache cr block send time (ms): | 0.0 |
Global cache log flushes for cr blocks served %: | 1.2 |
Avg global cache cr block flush time (ms): | 1.8 |
Avg global cache current block pin time (ms): | 1,021.7 |
Avg global cache current block send time (ms): | 0.0 |
Global cache log flushes for current blocks served %: | 6.9 |
Avg global cache current block flush time (ms): | 0.9 |
Metric | Description |
Avg global enqueue get time (ms) | Time spent sending messages over the interconnect to open a new global enqueue on a contended resource, or to convert the access mode of an enqueue that is already open. If it is above 20 ms the system may be heading for timeouts. |
Avg global cache cr block receive time (ms) | Time from the requesting instance sending its message to the mastering instance (2-way get), and in some cases on to the holding instance (3-way get), until the CR block is received. It includes the time the holding instance needs to build the consistent-read image. CR block receive time should not exceed roughly 15 ms. |
Avg global cache current block receive time (ms) | Time from the requesting instance sending its message to the mastering instance (2-way get), and in some cases on to the holding instance (3-way get), until the current block is received. It includes the log flush time on the holding instance. Current block receive time should not exceed roughly 30 ms (a query sketch for both receive-time averages follows this table). |
Avg global cache cr block build time (ms) | Time spent building CR blocks. |
Avg global cache cr block send time (ms) | Time spent sending CR blocks. |
Global cache log flushes for cr blocks served % | Percentage of served CR blocks that required a redo log flush. |
Avg global cache cr block flush time (ms) | Time spent flushing redo for CR blocks served. |
Avg global cache current block pin time (ms) | Time spent pinning current blocks. |
Avg global cache current block send time (ms) | Time spent sending current blocks. |
Global cache log flushes for current blocks served % | Percentage of served current blocks that required a redo log flush. |
Avg global cache current block flush time (ms) | Time spent flushing redo for current blocks served. |
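The two receive-time averages can be checked against the thresholds above with a small GV$SYSSTAT sketch; the *receive time statistics are recorded in centiseconds, hence the factor of 10 to convert to milliseconds:

SELECT inst_id,
       ROUND(10 * MAX(CASE WHEN name = 'gc cr block receive time'      THEN value END) /
             NULLIF(MAX(CASE WHEN name = 'gc cr blocks received'       THEN value END), 0), 2)
                 AS avg_cr_block_receive_ms,
       ROUND(10 * MAX(CASE WHEN name = 'gc current block receive time' THEN value END) /
             NULLIF(MAX(CASE WHEN name = 'gc current blocks received'  THEN value END), 0), 2)
                 AS avg_current_block_receive_ms
  FROM gv$sysstat
 WHERE name IN ('gc cr block receive time',      'gc cr blocks received',
                'gc current block receive time', 'gc current blocks received')
 GROUP BY inst_id;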
Global Cache and Enqueue Services – Messaging Statistics
Avg message sent queue time (ms): | 2,367.6 |
Avg message sent queue time on ksxp (ms): | 0.1 |
Avg message received queue time (ms): | 0.3 |
Avg GCS message process time (ms): | 0.0 |
Avg GES message process time (ms): | 0.0 |
% of direct sent messages: | 54.00 |
% of indirect sent messages: | 44.96 |
% of flow controlled messages: | 1.03 |
Metric | Description |
Avg message sent queue time (ms) | Time from when a message is queued until it is sent. |
Avg message sent queue time on ksxp (ms) | Time until the remote side receives the message and returns an ACK. This is an important metric because it directly reflects network latency; it is normally below 1 ms. |
Avg message received queue time (ms) | Time from when a message is queued until it is picked up. |
Avg GCS message process time (ms) | Average time needed to process a GCS message. |
Avg GES message process time (ms) | Average time needed to process a GES message. |
% of direct sent messages | Percentage of messages that were sent directly. |
% of indirect sent messages | Percentage of messages that were sent indirectly, typically larger messages; flow control can also force indirect sends. |
% of flow controlled messages | Percentage of messages subject to flow control. Flow control is most often caused by a poor network; % of flow controlled messages should stay below 1% (see the query sketch below). |
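The raw counters behind these three percentages live in the GES statistics. Since the exact statistic names vary a little between versions, a pattern match is the safest way to pull them out (a sketch, not the report's own SQL):

-- Direct / indirect / flow-controlled message counters, per instance.
SELECT inst_id, name, value
  FROM gv$ges_statistics
 WHERE LOWER(name) LIKE '%flow control%'
    OR LOWER(name) LIKE '%sent directly%'
    OR LOWER(name) LIKE '%sent indirectly%'
 ORDER BY inst_id, name;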
Wait Event Histogram
Event | Total Waits | % of Waits: <1ms | <2ms | <4ms | <8ms | <16ms | <32ms | <=1s | >1s |
ADR block file read | 208 | 38.0 | 3.4 | 44.7 | 13.9 | ||||
ADR block file write | 40 | 100.0 | |||||||
ADR file lock | 48 | 100.0 | |||||||
ARCH wait for archivelog lock | 3 | 100.0 | |||||||
ASM file metadata operation | 12.8K | 99.7 | .1 | .0 | .0 | .0 | .2 | .0 | |
Backup: MML write backup piece | 310.5K | 7.6 | .1 | .1 | 1.3 | 10.4 | 30.2 | 50.2 | .0 |
CGS wait for IPC msg | 141.7K | 100.0 | |||||||
CSS initialization | 34 | 50.0 | 47.1 | 2.9 | |||||
CSS operation: action | 110 | 48.2 | 20.9 | 28.2 | 2.7 | ||||
CSS operation: query | 102 | 88.2 | 3.9 | 7.8 | |||||
DFS lock handle | 6607 | 93.9 | .5 | .2 | .0 | .0 | 5.3 | .0 | |
Disk file operations I/O | 1474 | 100.0 | |||||||
IPC send completion sync | 21.9K | 99.5 | .1 | .1 | .1 | .0 | .2 | ||
KJC: Wait for msg sends to complete | 13 | 100.0 | |||||||
LGWR wait for redo copy | 16.3K | 100.0 | .0 | ||||||
Log archive I/O | 3 | 33.3 | 66.7 | ||||||
PX Deq: Signal ACK EXT | 2256 | 99.8 | .1 | .1 | |||||
PX Deq: Signal ACK RSG | 2124 | 99.9 | .1 | .0 | |||||
PX Deq: Slave Session Stats | 7997 | 94.6 | .9 | .9 | 2.5 | .8 | .4 | ||
PX Deq: Table Q qref | 2355 | 99.9 | .1 | ||||||
PX Deq: reap credit | 1215.7K | 100.0 | .0 | .0 | |||||
PX qref latch | 1366 | 100.0 | |||||||
Parameter File I/O | 194 | 94.8 | 1.0 | 1.0 | 1.0 | 1.5 | .5 |
Wait Event Histogram: distribution of wait times per event.
Event: name of the wait event.
Total Waits: number of times the event was waited on during the snapshot interval.
% of Waits <1ms: percentage of those waits that completed in under 1 ms.
% of Waits <2ms: percentage that took between 1 ms and 2 ms.
% of Waits <4ms: percentage that took between 2 ms and 4 ms.
% of Waits <8ms: percentage that took between 4 ms and 8 ms.
% of Waits <16ms: percentage that took between 8 ms and 16 ms.
% of Waits <32ms: percentage that took between 16 ms and 32 ms.
% of Waits <=1s: percentage that took between 32 ms and 1 s.
% of Waits >1s: percentage that took longer than 1 s.
Each column label is the upper bound of its bucket, so the percentages across a row add up to roughly 100%.
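The same histogram can be pulled live for a single event from V$EVENT_HISTOGRAM (cumulative since startup; the AWR section shows the delta between the two snapshots). A small sketch, using an arbitrarily chosen event:

-- wait_time_milli is the upper bound of each bucket: 1 = <1ms, 2 = 1-2ms, and so on.
SELECT event, wait_time_milli, wait_count,
       ROUND(100 * RATIO_TO_REPORT(wait_count) OVER (), 2) AS pct_of_waits
  FROM v$event_histogram
 WHERE event = 'gc current block 2-way'   -- any event of interest
 ORDER BY wait_time_milli;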
Parent Latch Statistics
- only latches with sleeps are shown
- ordered by name
Latch Name | Get Requests | Misses | Sleeps | Spin & Sleeps 1->3+ |
Real-time plan statistics latch | 77,840 | 136 | 20 | 116/0/0/0 |
active checkpoint queue latch | 321,023 | 20,528 | 77 | 20451/0/0/0 |
active service list | 339,641 | 546 | 132 | 424/0/0/0 |
call allocation | 328,283 | 550 | 148 | 440/0/0/0 |
enqueues | 1,503,525 | 217 | 14 | 203/0/0/0 |
ksuosstats global area | 2,605 | 1 | 1 | 0/0/0/0 |
messages | 2,608,863 | 141,380 | 29 | 141351/0/0/0 |
name-service request queue | 155,047 | 43 | 15 | 28/0/0/0 |
qmn task queue latch | 2,368 | 90 | 78 | 12/0/0/0 |
query server process | 268 | 30 | 30 | 0/0/0/0 |
redo writing | 910,703 | 11,623 | 50 | 11573/0/0/0 |
resmgr:free threads list | 14,454 | 190 | 4 | 186/0/0/0 |
space background task latch | 11,209 | 15 | 7 | 8/0/0/0 |
Latch Name: name of the parent latch.
Get Requests: number of times the parent latch was requested.
Misses: number of requests that did not obtain the latch on the first attempt.
Sleeps: number of times a process had to sleep while waiting for the latch.
Spin & Sleeps 1->3+: distribution of how many spin/sleep rounds were needed before the latch was obtained.
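The same information is available live from V$LATCH_PARENT; a minimal sketch mirroring the AWR filter (only latches that actually slept):

-- Cumulative parent-latch statistics; AWR shows the delta between two snapshots.
SELECT name, gets, misses, sleeps, spin_gets
  FROM v$latch_parent
 WHERE sleeps > 0
 ORDER BY name;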
Child Latch Statistics
- only latches with sleeps/gets > 1/100000 are shown
- ordered by name, gets desc
Latch Name | Child Num | Get Requests | Misses | Sleeps | Spin & Sleeps 1->3+ |
KJC message pool free list | 1 | 96,136 | 82 | 20 | 62/0/0/0 |
Lsod array latch | 10 | 2,222 | 153 | 118 | 58/0/0/0 |
Lsod array latch | 13 | 2,151 | 43 | 14 | 29/0/0/0 |
Lsod array latch | 4 | 2,066 | 154 | 124 | 59/0/0/0 |
Lsod array latch | 5 | 1,988 | 105 | 44 | 63/0/0/0 |
Lsod array latch | 9 | 1,734 | 95 | 32 | 64/0/0/0 |
Lsod array latch | 2 | 1,707 | 88 | 38 | 55/0/0/0 |
Lsod array latch | 11 | 1,695 | 88 | 32 | 57/0/0/0 |
Lsod array latch | 6 | 1,680 | 158 | 126 | 64/0/0/0 |
Lsod array latch | 12 | 1,657 | 155 | 111 | 65/0/0/0 |
Lsod array latch | 7 | 1,640 | 90 | 34 | 59/0/0/0 |
Lsod array latch | 1 | 1,627 | 169 | 153 | 46/0/0/0 |
Lsod array latch | 3 | 1,555 | 87 | 36 | 54/0/0/0 |
Lsod array latch | 8 | 1,487 | 127 | 88 | 57/0/0/0 |
cache buffers chains | 47418 | 354,313 | 391 | 4 | 387/0/0/0 |
cache buffers chains | 8031 | 337,135 | 250 | 8 | 242/0/0/0 |
cache buffers chains | 78358 | 305,022 | 528 | 9 | 519/0/0/0 |
cache buffers chains | 6927 | 241,808 | 129 | 4 | 125/0/0/0 |
Latch Name: name of the latch.
Child Num: number identifying the child latch within its latch family.
Get Requests: number of times the child latch was requested.
Misses: number of requests that did not obtain the latch on the first attempt.
Sleeps: number of times a process had to sleep while waiting for the child latch.
Spin & Sleeps 1->3+: distribution of how many spin/sleep rounds were needed before the latch was obtained.
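To see which child latches are hot right now (for cache buffers chains this usually points at one or a few hot blocks), V$LATCH_CHILDREN can be queried directly; a small sketch:

-- Five busiest cache buffers chains child latches, by sleeps then gets (cumulative).
SELECT *
  FROM (SELECT child#, gets, misses, sleeps
          FROM v$latch_children
         WHERE name = 'cache buffers chains'
         ORDER BY sleeps DESC, gets DESC)
 WHERE ROWNUM <= 5;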
Dictionary Cache Stats (RAC)
Cache | GES Requests | GES Conflicts | GES Releases |
dc_awr_control | 11 | 5 | 0 |
dc_global_oids | 5 | 0 | 0 |
dc_histogram_defs | 215 | 1 | 707 |
dc_objects | 90 | 9 | 0 |
dc_segments | 79 | 10 | 73 |
dc_sequences | 35,738 | 37 | 0 |
dc_table_scns | 6 | 0 | 0 |
dc_tablespace_quotas | 907 | 77 | 0 |
dc_users | 10 | 0 | 0 |
outstanding_alerts | 576 | 288 | 0 |
Cache: name of the dictionary cache.
GES Requests: number of GES lock requests issued for objects in this dictionary cache.
GES Conflicts: number of those requests that conflicted with locks held by other instances.
GES Releases: number of GES locks released for this dictionary cache.
Library Cache Activity (RAC)
Namespace | GES Lock Requests | GES Pin Requests | GES Pin Releases | GES Inval Requests | GES Invalidations |
ACCOUNT_STATUS | 242 | 0 | 0 | 0 | 0 |
BODY | 0 | 1,530,013 | 1,530,013 | 0 | 0 |
CLUSTER | 74 | 74 | 74 | 0 | 0 |
DBLINK | 246 | 0 | 0 | 0 | 0 |
EDITION | 311 | 311 | 311 | 0 | 0 |
HINTSET OBJECT | 186 | 186 | 186 | 0 | 0 |
INDEX | 152,360 | 152,360 | 152,360 | 0 | 0 |
QUEUE | 223 | 9,717 | 9,717 | 0 | 0 |
SCHEMA | 255 | 0 | 0 | 0 | 0 |
SUBSCRIPTION | 0 | 26 | 26 | 0 | 0 |
TABLE/PROCEDURE | 275,215 | 3,023,083 | 3,023,083 | 0 | 0 |
TRIGGER | 0 | 384,493 | 384,493 | 0 | 0 |
Namespace: library cache namespace.
GES Lock Requests: number of GES lock requests for objects in this namespace.
GES Pin Requests: number of GES pin requests for objects in this namespace.
GES Pin Releases: number of GES pin releases for this namespace.
GES Inval Requests: number of GES invalidation requests for this namespace.
GES Invalidations: number of invalidations received for this namespace.
Interconnect Ping Latency Stats
- Ping latency of the roundtrip of a message from this instance to
- target instances.
- The target instance is identified by an instance number.
- Average and standard deviation of ping latency is given in milliseconds
- for message sizes of 500 bytes and 8K.
- Note that latency of a message from the instance to itself is used as
- control, since message latency can include wait for CPU
Target Instance | 500B Ping Count | Avg Latency 500B msg | Stddev 500B msg | 8K Ping Count | Avg Latency 8K msg | Stddev 8K msg |
1 | 1,138 | 0.20 | 0.03 | 1,138 | 0.20 | 0.03 |
2 | 1,138 | 0.17 | 0.04 | 1,138 | 0.20 | 0.05 |
3 | 1,138 | 0.19 | 0.22 | 1,138 | 0.23 | 0.22 |
4 | 1,138 | 0.18 | 0.04 | 1,138 | 0.21 | 0.04 |
Target Instance: instance number of the ping target.
500B Ping Count: number of 500-byte ping messages sent to the target instance during the interval.
Avg Latency 500B msg: average round-trip latency, in milliseconds, of the 500-byte messages.
Stddev 500B msg: standard deviation of the 500-byte message latency.
8K Ping Count: number of 8 KB ping messages sent to the target instance during the interval.
Avg Latency 8K msg: average round-trip latency, in milliseconds, of the 8 KB messages.
Stddev 8K msg: standard deviation of the 8 KB message latency.
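The history behind this section is kept in DBA_HIST_INTERCONNECT_PINGS on 11g and later (an assumption worth verifying on your release, since the column layout differs slightly across versions); the sketch below simply dumps the rows for this report's snapshot range:

-- Ping-latency history for the snapshots covered by this report.
SELECT *
  FROM dba_hist_interconnect_pings
 WHERE snap_id BETWEEN 70719 AND 70723
 ORDER BY snap_id, instance_number;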
Interconnect Throughput by Client
- Throughput of interconnect usage by major consumers
- All throughput numbers are megabytes per second
Used By | Send Mbytes/sec | Receive Mbytes/sec |
Global Cache | 0.10 | 0.20 |
Parallel Query | 0.02 | 0.06 |
DB Locks | 0.09 | 0.09 |
DB Streams | 0.00 | 0.00 |
Other | 0.02 | 0.01 |
Used By: the interconnect consumer (Global Cache, Parallel Query, DB Locks, DB Streams, Other).
Send Mbytes/sec: megabytes sent per second.
Receive Mbytes/sec: megabytes received per second.
Interconnect Device Statistics
- Throughput and errors of interconnect devices (at OS level)
- All throughput numbers are megabytes per second
Device Name | IP Address | Public | Source | Send Mbytes/sec | Send Errors | Send Dropped | Send Buffer Overrun | Send Carrier Lost | Receive Mbytes/sec | Receive Errors | Receive Dropped | Receive Buffer Overrun | Receive Frame Errors |
bondib0 | 192.168.10.8 | NO | cluster_interconnects parameter | 0.00 | 0 | 0 | 0 | 0 | 0.00 | 0 | 0 | 0 |
Device Name: name of the interconnect device.
IP Address: IP address bound to the device.
Public: whether the interface is a public network interface.
Source: where the interface definition came from (here the cluster_interconnects parameter).
Send Mbytes/sec: megabytes sent per second.
Send Errors: number of send errors.
Send Dropped: number of outbound packets dropped.
Send Buffer Overrun: number of send buffer overruns.
Send Carrier Lost: number of carrier-lost errors while sending.
Receive Mbytes/sec: megabytes received per second.
Receive Errors: number of receive errors.
Receive Dropped: number of inbound packets dropped.
Receive Buffer Overrun: number of receive buffer overruns.
Receive Frame Errors: number of framing errors while receiving.
Dynamic Remastering Stats
- times are in seconds
- Affinity objects – objects mastered due to affinity at begin/end snap
Name | Total | per Remaster Op | Begin Snap | End Snap |
remaster ops | 29 | 1.00 | ||
remastered objects | 40 | 1.38 | ||
replayed locks received | 1,990 | 68.62 | ||
replayed locks sent | 877 | 30.24 | ||
resources cleaned | 0 | 0.00 | ||
remaster time (s) | 5.0 | 0.17 | ||
quiesce time (s) | 1.7 | 0.06 | ||
freeze time (s) | 0.6 | 0.02 | ||
cleanup time (s) | 0.7 | 0.02 | ||
replay time (s) | 0.2 | 0.01 | ||
fixwrite time (s) | 1.3 | 0.04 | ||
sync time (s) | 0.5 | 0.02 | ||
affinity objects | 365 | 367 |
Name: name of the dynamic remastering statistic.
Total: total value over the report interval.
per Remaster Op: average value per remaster operation.
Begin Snap: value at the begin snapshot (shown only for affinity objects).
End Snap: value at the end snapshot (shown only for affinity objects).
References
Statistics Descriptions http://docs.oracle.com/cd/B19306_01/server.102/b14237/stats002.htm
Memory Configuration and Use http://docs.oracle.com/cd/B19306_01/server.102/b14211/memory.htm
Library Cache Hit (%) http://docs.oracle.com/cd/B16240_01/doc/doc.102/e16282/oracle_database_help/oracle_database_instance_efficiency_libcache_hit_pct.html
Oracle Database Performance Tuning Guide 12c Release 1 (12.1)
How to Interpret the “SQL ordered by Physical Reads (UnOptimized)” Section in AWR Reports (11.2 onwards) [ID 1466035.1]