記一次資料庫高CPU佔用率處理過程

abstractcyj發表於2016-08-21
接到同事電話之後,反饋說資料庫CPU佔用率居高不下, 上午9點以後基本在90%以上。零售前端操作較卡。
第一反應獲取了一份AWR報告。

報告顯示,DB Time與平時相比,達到了2,3倍之多。


從中可以看到有嚴重的I.O等待事件。
read by other session事件的相關資訊:
轉自:http://blog.csdn.net/tianlesoftware/article/details/7734551

關於Read by other session 的說明如下:
When informationis requested from the database, Oracle will first read the data from disk intothe database buffer cache. If two or more sessions request the sameinformation, the first session will read the data into the buffer cache whileother sessions wait. In previous versions this wait was classified under the"buffer busy waits" event. However, in Oracle 10.1 and higher thiswait time is now broken out into the "read by other session" waitevent. Excessive waits for this event are typically due to several processesrepeatedly reading the same blocks, e.g. many sessions scanning the same indexor performing full table scans on the same table. Tuning this issue is a matterof finding and eliminating this contention.
--當我們請求一個資料時,Oracle 第一次會從磁碟將資料讀入buffer cache。如果有兩個或者多個session 請求相同的資訊,那麼第一個session 會將這個資訊讀入buffer cache,其他的session 就會出現等待。 在10g之前,該等待事件還是在bufferbusy waits 之下,在Oracle 10g之後,單獨將該事件拿出並命令為read by other session。
一般來說出現這種等待事件是因為多個程式重複的讀取相同的blocks,比如一些session 掃描相同的index或者在相同的block上執行full table scan。解決這個等待事件最好是找到並優化相關的SQL語句。
 
       Readby other session 等待的出現也說明資料庫存在讀的競爭,所以該等待事件通常和db file sequential read或db file scattered read 同時出現。

但是從AWR報告當中未能定位到相關的Top SQL。

於是嘗試查詢檢視:
v$session_wait, 從中也能夠定位到大量的read by other session的相關等待。

緊接著,使用從v$session_wait中定位的sid, serial#資訊定位到了幾個SQL語句。
如:
SELECT (t_memcard_cash.cashno) AS cashno,
       (t_memcard_cash.memcardno) AS memcardno,
       (t_memcard_cash.busno) AS busno,
       (t_memcard_cash.cardlevel) AS cardlevel,
       (t_memcard_cash.integrala) AS integrala,
       (t_memcard_cash.integral) AS integral,
       (t_memcard_cash.status) AS status,
       (t_memcard_cash.execdate) AS execdate,
       (t_memcard_cash.createuser) AS createuser,
       (t_memcard_cash.createtime) AS createtime,
       (t_memcard_cash.notes) AS notes,
       (t_memcard_cash.maxsaleno) AS maxsaleno,
       (t_memcard_cash.pstid) AS pstid,
       (t_memcard_cash.pstname) AS pstname,
       (t_memcard_cash.integral_pst) AS integral_pst,
       (t_memcard_cash.pstqty) AS pstqty,
       (t_memcard_cash.lastmodify) AS lastmodify,
       (t_memcard_cash.lasttime) AS lasttime,
       (t_memcard_cash.pstcode) AS pstcode,
       (t_memcard_cash.makeno) AS makeno,
       (t_memcard_cash.stallno) AS stallno,
       (t_memcard_cash.pstbatno) AS pstbatno,
       (t.cardholder) AS cardholder,
       (t.mobile) AS mobile,
       (t.tel) AS tel,
       (t.cardaddress) AS cardaddress,
       t_ware.lastpurprice AS lastpurprice,
       t_memcard_cash.compid
  FROM t_memcard_cash t_memcard_cash
 INNER JOIN t_memcard_reg t
    ON t_memcard_cash.memcardno = t.memcardno
  LEFT JOIN t_ware t_ware
    ON t_ware.compid = t_memcard_cash.compid
   AND t_ware.wareid = t_memcard_cash.pstid
 WHERE t_memcard_cash.memcardno = '000001034236';

SELECT t_memcard_reg.compid as compid,
  t_memcard_reg.memcardno as memcardno,
  t_memcard_reg.busno as busno,
  t_memcard_reg.introducer as introducer,
  t_memcard_reg.cardtype as cardtype,
  t_memcard_reg.cardlevel as cardlevel,
  t_memcard_reg.password as password,
  t_memcard_reg.cardstatus as cardstatus,
  t_memcard_reg.saleamount as saleamount,
  t_memcard_reg.realamount as realamount,
  t_memcard_reg.puramount as puramount,
  t_memcard_reg.integral as integral,
  t_memcard_reg.integrala as integrala,
  t_memcard_reg.integralflag as integralflag,
  t_memcard_reg.cardholder as cardholder,
  t_memcard_reg.cardaddress as cardaddress,
  t_memcard_reg.sex as sex,
  t_memcard_reg.tel as tel,
  t_memcard_reg.mobile as mobile,
  t_memcard_reg.fax as fax,
  t_memcard_reg.createuser as createuser,
  t_memcard_reg.createtime as createtime,
  t_memcard_reg.notes as notes,
  t_memcard_reg.idcard as idcard,
  t_memcard_reg.birthday as birthday,
  t_memcard_reg.allowintegral as allowintegral,
  t_memcard_reg.apptype as apptype,
  t_memcard_reg.applytime as applytime,
  t_memcard_reg.invalidate as invalidate,
  t_memcard_reg.lastdate as lastdate,
  t_memcard_reg.changeamt as changeamt,
  t_memcard_reg.lastmodify as lastmodify,
  t_memcard_reg.lasttime as lasttime,
  t_memcard_reg.invalidate as invalidate FROM h2.t_memcard_reg t_memcard_reg 
 WHERE ((memcardno='13220839717' or idcard = '13220839717' or tel = '13220839717'  or mobile = '13220839717' ));


 delete from temp_printtxt where saleno ='1608211018028696'

這三個SQL均是零售前端零售時需要呼叫的SQL。看到SQL之後,突然恍然大悟,因為系統前端的SQL大量採用的是硬編碼,而並未採用繫結變數,
故而以上這幾個SQL, 儘管相關表上都缺乏需要的索引,但是在AWR報告上不能反映出來。

解決方式其實很簡單,針對這幾個SQL建立合適的索引即可
create index h2.idx_t_memcard_reg_idcard on h2.t_memcard_reg(idcard) tablespace h2_busi_indx;
create index h2.idx_t_memcard_reg_tel on h2.t_memcard_reg(tel) tablespace h2_busi_indx;
create index h2.idx_t_memcard_reg_mobile on h2.t_memcard_reg(mobile) tablespace h2_busi_indx;
create index h2.idx_temp_printtxt on h2.temp_printtxt(saleno) tablespace h2_busi_indx online;
create index idx_t_memcard_cash_memcardno on t_memcard_cash(memcardno) tablespace h2_busi_indx; 

建立完索引之後,系統CPU佔用率回到20-30%左右。

這個例子進一步引導我從AWR的SQL部分轉到讀取的segments部分, 截圖如下:


這裡也可以說明,這幾個表的相關查詢,對系統造成的壓力最大。

優化之後:




可以看到優化之後DB TIME下降很明顯。


從這個例子得到的教訓是:
1. 對於硬編碼的SQL, AWR並不一定能反映到TOP SQL中,cursor_sharing = extract的緣故, SQL儘管相似,卻是不同的SQL
2. 對與I/O相關的等待, 這裡是read by other session,AWR報告可以先看segments statistics中,哪些段被訪問得最多




來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/8520577/viewspace-2123752/,如需轉載,請註明出處,否則將追究法律責任。

相關文章