enq: TM - contention解決之道——外來鍵無索引導致鎖爭用(上)

liupzmin發表於2016-04-28
近日,開發負責人反映某生產環境業務處理緩慢,主要業務操作就是修改會員資訊,登入查詢後發現大量的session正在等待enq: TM - contention,且waiting的語句幾乎都是update

session的即時資訊沒有保留,現在附上ash檢視的一些統計資訊,可以大概瞭解一下當時爭用的場景


  1. SQL> @ash_wait_chains.sql username||':'||program2||event2 session_type='FOREGROUND' sysdate-6/24 sysdate-5/24
  2. -- Display ASH Wait Chain Signatures script v0.2 BETA by Tanel Poder ( http://blog.tanelpoder.com )
  3. %This SECONDS AAS
  4. ------ ---------- ----------
  5. WAIT_CHAIN
  6. ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. 72% 20073 5.6
  8. -> JSCHPROD:(JDBC Thin Client) ON CPU
  9. 8% 2293 .6
  10. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) ON CPU
  11. 8% 2141 .6
  12. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) log file parallel write
  13. 7% 1879 .5
  14. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) LGWR-LNS wait on channel
  15. 2% 654 .2
  16. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) ON CPU
  17. 1% 288 .1
  18. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) ON CPU
  19. 1% 149 0
  20. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) ON CPU
  21. 0% 128 0
  22. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention
  23. 0% 112 0
  24. -> JSCHPROD:(JDBC Thin Client) log file sync
  25. 0% 86 0
  26. -> JSCHPROD:(JDBC Thin Client) db file scattered read
  27. 0% 43 0
  28. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) ON CPU
  29. 0% 37 0
  30. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention
  31. 0% 25 0
  32. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) LGWR wait on LNS
  33. 0% 13 0
  34. -> JSCHPROD:(plsqldev.exe) ON CPU
  35. 0% 11 0
  36. -> SYS:(plsqldev.exe) ON CPU
  37. 0% 10 0
  38. -> JSCHPROD:(JDBC Thin Client) SQL*Net more data from client
  39. 0% 9 0
  40. -> JSCHPROD:(JDBC Thin Client) db file sequential read
  41. 0% 9 0
  42. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) ON
  43. CPU
  44. 0% 6 0
  45. -> JSCHPROD:(JDBC Thin Client) read by other session -> JSCHPROD:(JDBC Thin Client) ON CPU
  46. 0% 4 0
  47. -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention -> JSCHPROD:(JDBC Thin Client) enq: TM - contention
  48. 0% 3 0
  49. -> JSCHPROD:(JDBC Thin Client) SQL*Net more data to client
  50. 0% 3 0
  51. -> JSCHPROD:(JDBC Thin Client) buffer busy waits [data block]
  52. 0% 3 0
  53. -> SYS:(oraagent.bin) Disk file operations I/O
  54. 0% 3 0
  55. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) LGWR wait for redo copy
  56. 0% 2 0
  57. -> JSCHPROD:(JDBC Thin Client) enq: TX - row lock contention
  58. 0% 2 0
  59. -> JSCHPROD:(JDBC Thin Client) log file sync -> SYS:(LGWR) LGWR wait for redo copy -> JSCHPROD:(JDBC Thin Client) ON CPU
  60. 0% 2 0
  61. -> JSCHPROD:(JDBC Thin Client) enq: TX - index contention -> JSCHPROD:(JDBC Thin Client) ON CPU
  62. 0% 1 0
  63. -> JSCHPROD:(plsqldev.exe) log file sync -> SYS:(LGWR) log file parallel write
  64. 0% 1 0
  65. -> SYS:(plsqldev.exe) Disk file operations I/O
  66. 0% 1 0
  67. -> JSCHPROD:(JDBC Thin Client) enq: TX - row lock contention -> JSCHPROD:(JDBC Thin Client) ON CPU
可以看到,TM鎖的爭用很多,再看一份當時awr報告的top10

Top 10 Foreground Events by Total Wait Time


Event Waits Total Wait Time (sec) Wait Avg(ms) % DB time Wait Class
DB CPU   20K   87.6  
log file sync 467,706 2795.1 6 12.2 Commit
enq: TM - contention 2,042 401.7 197 1.8 Application
db file scattered read 593,258 99 0 .4 User I/O
db file sequential read 274,234 10.7 0 .0 User I/O
SQL*Net more data from client 370,989 9.7 0 .0 Network
read by other session 26,041 6.9 0 .0 User I/O
SQL*Net message to client 3,955,830 6.1 0 .0 Network
log file switch completion 19 3.3 176 .0 Configuration
SQL*Net more data to client 30,490 2 0 .0 Network

雖然佔DBTIME不多,但本來是很快的操作,短時間內給人的感覺就是業務處理緩慢,訊息都堆在redis裡

查一下當時等待事件的p1,p2,p3的值


  1. select ash.SAMPLE_TIME,
  2.        ash.EVENT,
  3.        ash.SESSION_ID,
  4.        ash.BLOCKING_SESSION,
  5.        ash.P1TEXT,
  6.        ash.P1,
  7.        ash.P2TEXT,
  8.        ash.p2,
  9.        ash.p3text,
  10.        ash.p3,
  11.        ash.SESSION_STATE,
  12.        ash.SQL_OPNAME,
  13.        ash.SQL_ID
  14.        --ash.*
  15.   from v$active_session_history ash
  16.  where ash.SAMPLE_TIME >
  17.        to_date('20160425 10:00:00', 'yyyymmdd HH24:MI:SS')
  18.    and ash.SAMPLE_TIME <
  19.        to_date('20160425 12:10:00', 'yyyymmdd HH24:MI:SS')
  20.    and ash.WAIT_CLASS <> 'Idle'
  21.    and ash.EVENT like 'enq: TM - contention'
  22.  order by sample_time desc;
下面是部分結果


  1. enq: TM - contention    391    2457    name|mode    1414332419    object #    110434    table/partition    0    WAITING    INSERT    7w0tma5up32wt
  2. enq: TM - contention    2213    297    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  3. enq: TM - contention    2457    2402    name|mode    1414332420    object #    110434    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  4. enq: TM - contention    2641    297    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  5. enq: TM - contention    194    297    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    9gd6xhd0xyhph
  6. enq: TM - contention    297    2402    name|mode    1414332419    object #    110433    table/partition    0    WAITING    INSERT    7w0tma5up32wt
  7. enq: TM - contention    341    2402    name|mode    1414332419    object #    110433    table/partition    0    WAITING    INSERT    7w0tma5up32wt
  8. enq: TM - contention    391    2213    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    6nmp0bp3uuqtr
  9. enq: TM - contention    2402    2213    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    44nzxnp9wwk3z
  10. enq: TM - contention    2457    2213    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  11. enq: TM - contention    297    2213    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  12. enq: TM - contention    391    2213    name|mode    1414332420    object #    110433    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  13. enq: TM - contention    4    2504    name|mode    1414332421    object #    110415    table/partition    0    WAITING    UPDATE    ak25v8q8p6fzd
  14. enq: TM - contention    148    2504    name|mode    1414332421    object #    110415    table/partition    0    WAITING    UPDATE    9gd6xhd0xyhph
可以看到紅色的p2的值為產生TM爭用的物件id,經過查證,這些object均是session正在更新的表的子表,而且通過v$sql檢視update語句均更改了主表的主鍵,問題到這裡已經很明朗了,由於外來鍵沒加索引,導致了主表在更新主表主鍵或刪除主表記錄時對子表的鎖定,而且這張主表被大量的子表引用,此時子表上也同時進行事務處理,所以造成了更新主表的session 不時hang住。

通過對所有子表的外來鍵加索引,消除了爭用,檢測未加索引的外來鍵語句:

  1. SELECT TABLE_NAME,
  2.        CONSTRAINT_NAME,
  3.        CNAME1 || NVL2(CNAME2, ',' || CNAME2, NULL) ||
  4.        NVL2(CNAME3, ',' || CNAME3, NULL) ||
  5.        NVL2(CNAME4, ',' || CNAME4, NULL) ||
  6.        NVL2(CNAME5, ',' || CNAME5, NULL) ||
  7.        NVL2(CNAME6, ',' || CNAME6, NULL) ||
  8.        NVL2(CNAME7, ',' || CNAME7, NULL) ||
  9.        NVL2(CNAME8, ',' || CNAME8, NULL) COLUMNS
  10.   FROM (SELECT B.TABLE_NAME,
  11.                B.CONSTRAINT_NAME,
  12.                MAX(DECODE(POSITION, 1, COLUMN_NAME, NULL)) CNAME1,
  13.                MAX(DECODE(POSITION, 2, COLUMN_NAME, NULL)) CNAME2,
  14.                MAX(DECODE(POSITION, 3, COLUMN_NAME, NULL)) CNAME3,
  15.                MAX(DECODE(POSITION, 4, COLUMN_NAME, NULL)) CNAME4,
  16.                MAX(DECODE(POSITION, 5, COLUMN_NAME, NULL)) CNAME5,
  17.                MAX(DECODE(POSITION, 6, COLUMN_NAME, NULL)) CNAME6,
  18.                MAX(DECODE(POSITION, 7, COLUMN_NAME, NULL)) CNAME7,
  19.                MAX(DECODE(POSITION, 8, COLUMN_NAME, NULL)) CNAME8,
  20.                COUNT(*) COL_CNT
  21.           FROM (SELECT SUBSTR(TABLE_NAME, 1, 30) TABLE_NAME,
  22.                        SUBSTR(CONSTRAINT_NAME, 1, 30) CONSTRAINT_NAME,
  23.                        SUBSTR(COLUMN_NAME, 1, 30) COLUMN_NAME,
  24.                        POSITION
  25.                   FROM USER_CONS_COLUMNS) A,
  26.                USER_CONSTRAINTS B
  27.          WHERE A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
  28.            AND B.CONSTRAINT_TYPE = 'R'
  29.          GROUP BY B.TABLE_NAME, B.CONSTRAINT_NAME) CONS
  30.  WHERE COL_CNT > ALL
  31.  (SELECT COUNT(*)
  32.           FROM USER_IND_COLUMNS I
  33.          WHERE I.TABLE_NAME = CONS.TABLE_NAME
  34.            AND I.COLUMN_NAME IN (CNAME1, CNAME2, CNAME3, CNAME4, CNAME5,
  35.                 CNAME6, CNAME7, CNAME8)
  36.            AND I.COLUMN_POSITION <= CONS.COL_CNT
  37.          GROUP BY I.INDEX_NAME);

這是摘自TOM大師的語句,外來鍵不加索引也是導致死鎖的常見原因之一,因此對於主表經常進行更新刪除操作的情況,外來鍵一定要加索引。
至於外來鍵未加索引是如何導致鎖定的,以及為何加了索引後爭用就消失了?

請關注enq: TM - contention解決之道——外來鍵無索引導致鎖爭用 (下)

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26838672/viewspace-2090105/,如需轉載,請註明出處,否則將追究法律責任。

相關文章