[20170324]cpu 100%,latch free等待分析

lfree發表於2017-03-24

[20170324]cpu 100%,latch free等待分析.txt

--//這幾天在ITPUB的討論,連結如下:http://www.itpub.net/thread-2085574-7-1.html
--//很明顯程式存在大量的sql最佳化問題.

--//lz上傳了20160926,20160927,20170322,20170323,20170324的AWR報表.

--//20160926
Top 10 Foreground Events by Total Wait Time

Event                  Waits    Total Wait Time (sec)    Wait Avg(ms)    % DB time    Wait Class
DB CPU                                         71.1K                     101.3    
direct path read       120,510                 2518.5              21    3.6          User I/O

--//第1個主要是DB CPU(佔了100%),而第2個direct path read.很明顯sql語句存在大量的全表掃描.

--//20160927:
Top 10 Foreground Events by Total Wait Time

Event                   Waits    Total Wait Time (sec)    Wait Avg(ms)   % DB time    Wait Class
DB CPU                                          4010                     108.2    
read by other session   35,597                  72.3                2    2.0           User I/O
db file sequential read 36,028                  62.9                2    1.7           User I/O
db file scattered read  29,435                  60.8                2    1.6           User I/O
direct path read        191,418                 44.3                0    1.2           User I/O

--//可以看出在記憶體充足的情況下,即使直接路徑讀,IO還是很快的,可以參考我的連結:
http://blog.itpub.net/267265/viewspace-2134041/=> [20170221]資料檔案與檔案系統快取.txt

--//也就是檔案系統的快取掩蓋了sql語句執行計劃的缺陷,當OS記憶體緊張時,問題就暴露無遺了.

--//20170322:
--//這是出現cpu 100%,latch free等待的情況:
Top 10 Foreground Events by Total Wait Time

Event                        Waits    Total Wait Time (sec)    Wait Avg(ms)   % DB time    Wait Class
DB CPU                                         124.7K                            66.6    
latch free                    4,969            822.1                     165       .4      Other
latch: cache buffers chains  22,886            138.1                       6       .1      Concurrency
db file sequential read      56,230            111                         2       .1      User I/O
log file sync                35,309             93                         3       .0      Commit

--//這個時候direct path read並沒有出現,估計進入資料快取,這樣CBC latch就出現,雖然僅僅138.1秒.但是平均等待已經165ms,
--//但是CPU已經忙不過來.實際上你看看IO 相關等待時間很小.
--//從另外一個側面說明CPU 100%,有時候比IO 繁忙更可怕.

--//20170323:
--//20170322重啟後的情況,情況緩解,你仔細看:15:00:44-18:00:53情況

IOStat by Function summary

'Data' columns suffixed with M,G,T,P are in multiples of 1024 other columns suffixed with K,M,G,T,P are in multiples of 1000
ordered by (Data Read + Write) desc

Function Name      Reads: Data  Reqs per sec  Data per sec  Writes: Data Reqs per sec Data per sec Waits: Count Avg Tm(ms)
Direct Reads       17.7G        1.70          1.673M        0M           0.00         0M           18.3K        0.09
Others             290M         1.21          .027M         390M         1.05         .036M        24.4K        0.02
Buffer Cache Reads 319M         3.50          .03M          0M           0.00         0M           34.6K        0.80
DBWR                 0M         0.00          0M            235M         1.86         .022M        20.1K        0.03
LGWR                 0M         0.00          0M            124M         5.03         .011M        108.7K       0.01
Direct Writes        0M         0.00          0M            110M         1.09         .01M         11.8K        0.00
TOTAL:            18.3G         6.40          1.729M        859M         9.04         .079M        217.9K       0.14

--//可以發現直接路徑讀總量達到17.7G.實際上你注意看Avg Tm(ms)才0.09ms非常塊.說明檔案系統快取掩蓋了不良sql語句的執行計劃.

--//20170324,已經建立2個索引.
Top 10 Foreground Events by Total Wait Time

Event                        Waits    Total Wait Time (sec)    Wait Avg(ms)   % DB time    Wait Class
DB CPU                                                6010                    86.3    
db file sequential read     320,152                  790.3          2         11.3         User I/O
log file sync                29,891                  111.5          4         1.6          Commit
db file parallel read         1,175                   60.2         51         .9           User I/O
read by other session        28,873                   53.7          2         .8           User I/O
direct path read             14,374                   15.2          1         .2           User I/O

--//DB CPU佔百分比已經下降.看看IO的情況:
IOStat by Function summary

'Data' columns suffixed with M,G,T,P are in multiples of 1024 other columns suffixed with K,M,G,T,P are in multiples of 1000
ordered by (Data Read + Write) desc

Function Name      Reads: Data  Reqs per sec  Data per sec  Writes: Data Reqs per sec Data per sec Waits: Count Avg Tm(ms)
Direct Reads       13.9G        2.00          1.977M        0M           0.00         0M           14.4K        0.75
Buffer Cache Reads  8.5G        50.96          1.209M       0M           0.00         0M          333.5K        2.38
Others              395M        1.35          .055M         656M         1.30         .091M        19.1K        0.14
DBWR                  0M        0.00          0M            375M         3.21         .052M        23.1K        0.12
LGWR                  1M        0.01          0M            227M         5.47         .032M        78.5K        1.16
Direct Writes         0M        0.00          0M             69M         1.10         .01M         7907         0.00
TOTAL:             22.8G        54.31         3.241M        1.3G         11.08        .184M       476.6K        1.89

--//可以發現Direct Reads依舊很高.說明給繼續最佳化,只要OS記憶體,問題就會掩蓋住,一旦不足問題就暴露.
--//sql最佳化才是解決問題的王道..

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2136024/,如需轉載,請註明出處,否則將追究法律責任。

相關文章