【故障解決】enq: PS - contention

lhrbest發表於2016-02-01

【故障解決】enq: PS - contention

一.1  BLOG文件結構圖

 

wps5DEC.tmp 

 

一.2  前言部分

 

一.2.1  導讀和注意事項

各位技術愛好者,看完本文後,你可以掌握如下的技能,也可以學到一些其它你所不知道的知識,~O(∩_∩)O~:

等待事件 enq: PS - contention的解決辦法

② 一般等待事件的解決辦法

 

  Tips:

       ① 若文章程式碼格式有錯亂,推薦使用QQ360瀏覽器,也可以下載pdf格式的文件來檢視,pdf文件下載地址:http://yunpan.cn/cdEQedhCs2kFz(提取碼:ed9b 

       ② 本篇BLOG中程式碼部分需要特別關注的地方我都用黃色背景和紅色字型來表示,比如下邊的例子中,thread 1的最大歸檔日誌號為33thread 2的最大歸檔日誌號為43是需要特別關注的地方,命令一般使用粉紅顏色標註,註釋一般採用藍色字型表示。

 

  List of Archived Logs in backup set 11

  Thrd Seq     Low SCN    Low Time            Next SCN   Next Time

  ---- ------- ---------- ------------------- ---------- ---------

  1    32      1621589    2015-05-29 11:09:52 1625242    2015-05-29 11:15:48

  1    33      1625242    2015-05-29 11:15:48 1625293    2015-05-29 11:15:58

  2    42      1613951    2015-05-29 10:41:18 1625245    2015-05-29 11:15:49

  2    43      1625245    2015-05-29 11:15:49 1625253    2015-05-29 11:15:53

 

 

 

[ZFXDESKDB1:root]:/>lsvg -o

T_XDESK_APP1_vg

rootvg

[ZFXDESKDB1:root]:/>

[ZFXDESKDB1:root]:/>lsvg rootvg

 

 

====》2097152*512/1024/1024/1024=1G 

 

 

 

 

 

本文如有錯誤或不完善的地方請大家多多指正,ITPUB留言或QQ皆可,您的批評指正是我寫作的最大動力。

 

 

 

一.2.2  相關參考文章連結

 

 

一.2.3  本文簡介

 

 

 

 

一.3  相關知識點掃盲

 

 

 

 

 

 

一.4  故障分析及解決過程

 

一.4.1  故障環境介紹

 

 專案

source db

db 型別

RAC

db version

10.2.0.4.0

db 儲存

RAW

ORACLE_SID

XXX

db_name

XXX

主機IP地址:

XXX

OS版本及kernel版本

AIX 5.3.0.0

OS hostname

XXX

 

 

 

 

一.4.2  故障發生現象及報錯資訊

開發人員反饋資料庫很慢,讓幫忙查查原因,那首當其衝的就是看主機的情況了,主機是AIX系統,採用TOPAS檢視主機的情況,如下圖,從圖中可以看出的確有一個oracle的程式非常佔用CPU資源:

 wps5DFD.tmp

 

 

 

 

 

一.4.3  故障分析及解決過程

根據os的程式號到資料庫中檢視相關的會話:

 

 SELECT a.INST_ID, a.SQL_ID, a.EVENT, a.PREV_SQL_ID, a.STATUS,a.USERNAME,a.OSUSER

   FROM gv$session a, gv$process b

  WHERE a.PADDR = b.ADDR

    and b.SPID = 3109012;

 

 

wps5DFE.tmp 

可以看到該會話的等待事件是enq: PS - contention,並且有相關的SQLOSUSER,可以聯絡到當時的開發人員,據說已經跑了1個小時了,我們先來看看具體的sql內容:

 

 

 SELECT *

   FROM gv$sqlarea a

  WHERE a.SQL_ID = 'cg7q9tn7u5vyx'

    and a.INST_ID = 1;

 

wps5DFF.tmp 

SQL文字copy出來:

SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser

  FROM (SELECT b.INST_ID,

               c.USERNAME,

               a.event,

               to_char(a.cnt) AS seconds,

               a.sql_id,

               dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext

          FROM (SELECT rownum rn, t.*

                  FROM (SELECT s.INST_ID,

                               decode(s.session_state,

                                      'WAITING',

                                      s.event,

                                      'Cpu + Wait For Cpu') Event,

                               s.sql_id,

                               s.user_id,

                               COUNT(*) CNT

                          FROM gv$active_session_history s

                         WHERE sample_time > SYSDATE - 30 / 1440

                         GROUP BY INST_ID,

                                  s.user_id,

                                  decode(s.session_state,

                                         'WAITING',

                                         s.event,

                                         'Cpu + Wait For Cpu'),

                                  s.sql_id

                         ORDER BY CNT DESC) t

                 WHERE rownum < 20) a,

               gv$sqlarea b,

               dba_users c

         WHERE a.sql_id = b.sql_id

           AND a.user_id = c.user_id

           AND a.INST_ID = b.INST_ID

         ORDER BY CNT DESC) t,

       gv$session s

 WHERE t.sql_id = s.sql_id(+)

   AND t.INST_ID = s.INST_ID(+)

 ORDER BY t.INST_ID

 

從文字中可以看出該sql查詢的是資料字典,估計是從網上copy過來的,以哥多年的開發經驗瞅了一眼就發現一個特殊的地方dbms_lob.substr(b.sql_fulltext, 100, 1)

這類clob型別的都比較耗費資源,因為比較忙就不深入的分析了,簡單看了下把該句修改為b.SQL_TEXT,滿足要求即可,沒有必要去查詢clob

 

簡單修改後:

 

SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser

  FROM (SELECT b.INST_ID,

               c.USERNAME,

               a.event,

               to_char(a.cnt) AS seconds,

               a.sql_id,

               --dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext ,

               b.SQL_TEXT

          FROM (SELECT rownum rn, t.*

                  FROM (SELECT s.INST_ID,

                               decode(s.session_state,

                                      'WAITING',

                                      s.event,

                                      'Cpu + Wait For Cpu') Event,

                               s.sql_id,

                               s.user_id,

                               COUNT(*) CNT

                          FROM gv$active_session_history s

                         WHERE sample_time > SYSDATE - 30 / 1440

                         GROUP BY INST_ID,

                                  s.user_id,

                                  decode(s.session_state,

                                         'WAITING',

                                         s.event,

                                         'Cpu + Wait For Cpu'),

                                  s.sql_id

                         ORDER BY CNT DESC) t

                 WHERE rownum < 20) a,

               gv$sqlarea b,

               dba_users c

         WHERE a.sql_id = b.sql_id

           AND a.user_id = c.user_id

           AND a.INST_ID = b.INST_ID

         ORDER BY CNT DESC) t,

       gv$session s

 WHERE t.sql_id = s.sql_id(+)

   AND t.INST_ID = s.INST_ID(+)

 ORDER BY t.INST_ID;

 

 

 

執行一下:

wps5E0F.tmp 

效率還是可以的,從之前的1個小時沒有跑出來到現在的6秒,還是很不錯的,主要是需要找出SQL中的瓶頸部分,這個就需要經驗和多讀書、多看報。少吃零食多睡覺了。^_^

 

下來問了下開發人員說可以停掉的,那我就kill掉了,kill掉後主機的情況如下:

 

wps5E10.tmp 

wps5E21.tmp 

一.4.3.1  metalink解釋

參考:MetalinkBug 5476091
Description
If a session is waiting on a mutex wait (eg: 'cursor: pin X')
then interrupts to the session are ignored.
eg: Ctrl-C does not have any effect.

This issue can show up as a deadlock in a Parallel Query
between the QC (Query coordinator) and one of its slaves
with the QC waiting on "enq: PS - contention" deadlocked
against the slave holding the requested PS enqueue.
   

Bug 5476091 - Ctrl-C ignored for sessions waiting for mutexes / Deadlock with "enq: PS" - superceded (文件 ID 5476091.8)

 

Bug 5476091  Ctrl-C ignored for sessions waiting for mutexes / Deadlock with "enq: PS" - superceded

This note gives a brief overview of bug 5476091. 
The content was last updated on: 21-JUL-2015
Click here for details of each of the sections below.

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions BELOW 11.2

Versions confirmed as being affected

· 10.2.0.4

· 10.2.0.3

Platforms affected

Generic (all / most platforms affected)

Note that this fix has been superseded by the fix in Bug:10214450 

Fixed:

This fix has been superseded - please see the fixed version information for Bug:10214450 . The box below only shows versions where the code change/s for 5476091 are first included - those versions may not contain the later improved fix.

The fix for 5476091 is first included in

· 11.2.0.1 (Base Release)

· 11.1.0.7 (Server Patch Set)

· 10.2.0.5 (Server Patch Set)

 

Symptoms:

Related To:

· Deadlock

· Hang (Process Hang)

· Mutex Contention

· Waits for "cursor: pin X"

· Waits for "enq: PS - contention"

· Parallel Query (PQO)

Description

If a session is waiting on a mutex wait (eg: 'cursor: pin X')

then interrupts to the session are ignored.

eg: Ctrl-C does not have any effect.

 

This issue can show up as a deadlock in a Parallel Query

between the QC (Query coordinator) and one of its slaves

with the QC waiting on "enq: PS - contention" deadlocked

against the slave holding the requested PS enqueue.

 Note:

  This fix is superceded by the fix in bug 10214450

 

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References

Bug:5476091 (This link will only work for PUBLISHED bugs)
Note:245840.1
 Information on the sections in this article

 

 

一.5  故障處理總結

 

到此所有的處理算是基本完畢,過程很簡單,但是不同的場景處理方式有很多種,我們應該學會靈活變通。

 

 

 

一.6  About Me

 

...........................................................................................................................................................................................

本文作者:小麥苗,只專注於資料庫的技術,更注重技術的運用

ITPUB BLOG:http://blog.itpub.net/26736162

本文地址:http://blog.itpub.net/26736162/viewspace-1985380/

本文pdf版:http://yunpan.cn/cdEQedhCs2kFz (提取碼:ed9b)

QQ:642808185 若加QQ請註明所正在讀的文章標題

2016-01-28 10:00~ 2016-01-28 19:00 在中行完成

<版權所有,文章允許轉載,但須以連結方式註明源地址,否則追究法律責任!>

...........................................................................................................................................................................................

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26736162/viewspace-1985380/,如需轉載,請註明出處,否則將追究法律責任。

相關文章