系統cpu佔用超高故障分析一例
下班的時候突然接到電話,通知一臺主機資源佔用超過,同時手機簡訊也不停的報session數,趕快連上伺服器檢視具體原因:
1、 通過top可以發現系統CPU資源佔用100%。
查當前程式數,發現比平時多了100多
Select * from v$license;
SESSIONS_MAX SESSIONS_WARNING SESSIONS_CURRENT SESSIONS_HIGHWATER USERS_MAX
------------ ---------------- ---------------- ------------------ ----------
0 0 234 246 0
查session等待事件,大部分session都在做latch free等待
SQL> select sid,event,P1TEXT,state from v$session_wait;
SID EVENT P1TEXT STATE
------ ------------------------------ ------------------------------ -------------------
32 latch free address WAITED KNOWN TIME
38 latch free address WAITED KNOWN TIME
41 latch free address WAITING
57 latch free address WAITING
89 latch free address WAITED KNOWN TIME
93 latch free address WAITED KNOWN TIME
111 latch free address WAITED KNOWN TIME
118 latch free address WAITING
137 latch free address WAITED KNOWN TIME
201 latch free address WAITING
200 latch free address WAITED KNOWN TIME
194 latch free address WAITED KNOWN TIME
186 latch free address WAITED KNOWN TIME
182 latch free address WAITED KNOWN TIME
177 latch free address WAITING
163 latch free address WAITING
147 latch free address WAITED KNOWN TIME
146 latch free address WAITED KNOWN TIME
238 latch free address WAITED KNOWN TIME
236 latch free address WAITED KNOWN TIME
233 latch free address WAITED KNOWN TIME
225 latch free address WAITED KNOWN TIME
221 latch free address WAITED KNOWN TIME
211 latch free address WAITING
209 latch free address WAITED KNOWN TIME
207 latch free address WAITED KNOWN TIME
204 latch free address WAITING
261 latch free address WAITED KNOWN TIME
257 latch free address WAITED KNOWN TIME
255 latch free address WAITED KNOWN TIME
253 latch free address WAITED KNOWN TIME
251 latch free address WAITED KNOWN TIME
241 latch free address WAITING
239 latch free address WAITED KNOWN TIME
119 latch free address WAITED KNOWN TIME
113 latch free address WAITED KNOWN TIME
97 latch free address WAITING
92 latch free address WAITED KNOWN TIME
88 latch free address WAITED KNOWN TIME
87 latch free address WAITED KNOWN TIME
68 latch free address WAITED KNOWN TIME
2、 查詢佔用cpu的程式情況,大量程式佔用都很高,如3720.
SQL> SELECT /*+ ordered */ p.spid, s.sid, s.serial#, s.username, s.program,s.status,TO_CHAR(s.logon_time, 'mm-dd-yyyy hh24:mi') logon_time, s.last_call_et, st.value, s.sql_hash_value, s.sql_address, sq.child_number ,sq.sql_text
2 FROM v$statname sn, v$sesstat st, v$process p, v$session s, v$sql sq
3 WHERE s.paddr=p.addr
4 AND s.sql_hash_value = sq.hash_value and s.sql_Address = sq.address
5 AND s.sid = st.sid
6 AND st.STATISTIC# = sn.statistic#
7 AND sn.NAME = 'CPU used by this session'
8 AND p.spid = &osPID -- parameter to restrict for a specific PID
9 -- AND s.status = 'ACTIVE'
10 ORDER BY st.value desc;
Enter value for ospid: 3720
SPID SID SERIAL# USERNAME PROGRAM STATUS LOGON_TIME LAST_CALL_ET VALUE SQL_HASH_VALUE SQL_ADDRESS CHILD_NUMBER
------------ ------ ------- ---------- -------------------------------- -------- ---------------- ------------ ---------- -------------- ---------------- ------------
SQL_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
3720 97 17856 CCATSUPT JDBC Thin Client ACTIVE 02-21-2011 15:28 3413 16088 1113672989 C000000354738950 0
SELECT /*+ INDEX (C, I_SVR_PUB_DA_MAINQUEUE_HIS_FRT) */ COUNT(1) FROM Svr_pub_da_MainQueue_his c,pub_Specialty k WHERE c.Business in ( 'D46C2BC08404D1211DFA6F7BA8DCB9DB', '293C7B04CD7B0FFD56B255CC2585E16F') AND c.specialty = k.specialtyid AND c.FirstReceptTime BETWEEN TO_DATE('2011-01-21 00:00:00', 'yyyy-MM-dd HH24:MI:SS') AND TO_DATE('2011-02-22 00:00:00', 'yyyy-MM-dd HH24:MI:SS') AND ( k.TreeCode LIKE '0109%') AND EXISTS (SELECT 1 FROM org_unit ou WHERE c.sourcedept = ou.unitid AND ou.TreeCode LIKE '0001000301970003%')
3、檢視該sql執行計劃,存在開銷極大的nested loop,檢查各個表的資料量後,懷疑是統計資訊出問題,走了不該走的索引。
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 34498 | | |
| 1 | SORT AGGREGATE | | 1 | 203 | | | |
| * 2 | TABLE ACCESS BY GLOBAL INDEX ROWID | SVR_PUB_DA_MAINQUEUE_HIS | 1 | 101 | 34493 | ROW L | ROW L |
| 3 | NESTED LOOPS | | 1 | 203 | 34498 | | |
| 4 | MERGE JOIN CARTESIAN | | 1 | 102 | 5 | | |
| 5 | TABLE ACCESS BY INDEX ROWID | PUB_SPECIALTY | 1 | 46 | 3 | | |
| * 6 | INDEX RANGE SCAN | I_PUB_SPECIALTY_TR | 1 | | 2 | | |
| 7 | BUFFER SORT | | 1 | 56 | 2 | | |
| 8 | SORT UNIQUE | | | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID | ORG_UNIT | 1 | 56 | 2 | | |
| * 10 | INDEX RANGE SCAN | ORG_UNIT_I01 | 1 | | 1 | | |
| * 11 | INDEX RANGE SCAN | I_SVR_PUB_DA_MAINQUEUE_HIS_FRT | 40241 | | 119 | | |
-------------------------------------------------------------------------------------------------------------------------
4、更新統計資訊
analyze table PUB_SPECIALTY compute statistics for table for all indexed columns for all indexes;
analyze table org_unit compute statistics for table for all indexed columns for all indexes;
5、此時再次執行sql,檢視執行計劃,發現已經已無nested loop
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=39980 Card=1 Bytes=2
00)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (Cost=39980 Card=1811 Bytes=362200)
3 2 TABLE ACCESS (FULL) OF 'PUB_SPECIALTY' (Cost=4 Card=17
0 Bytes=7480)
4 2 HASH JOIN (Cost=39975 Card=2059 Bytes=321204)
5 4 SORT (UNIQUE)
6 5 TABLE ACCESS (FULL) OF 'ORG_UNIT' (Cost=10 Card=21
Bytes=1155)
7 4 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'SVR_PUB_DA_
MAINQUEUE_HIS' (Cost=39951 Card=16097 Bytes=1625797)
8 7 INDEX (RANGE SCAN) OF 'I_SVR_PUB_DA_MAINQUEUE_HIS_
FRT' (NON-UNIQUE) (Cost=158 Card=40241)
6、查詢等待事件,latch free的session仍然存在,由於走錯了執行計劃,所以決定殺掉這些session,殺掉session後系統恢復正常
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/11088128/viewspace-687675/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 故障分析 | MySQL鎖等待超時一例分析MySql
- win10 2004系統cpu佔用高怎麼辦_win10 2004系統佔用cpu過高解決教程Win10
- win10系統perl.exe佔用cpu怎麼解決_win10系統perl.exe佔用cpu如何處理Win10
- hive 故障一例Hive
- 故障分析 | MySQL 耗盡主機記憶體一例分析MySql記憶體
- win10系統explorer佔用cpu很高的解決方法Win10
- windows10系統下backgroundTaskHost佔用大量CPU如何解決Windows
- 執行sed命令卡死CPU消耗100%一例分析
- win10系統cpu溫度高怎麼辦 w10系統cpu溫度超高如何處理Win10
- 故障分析 | MySQL 備份檔案靜默損壞一例分析MySql
- 在Linux中,如何查詢系統中佔用CPU最高的程序?Linux
- win10系統玩流放之路cpu佔用100%如何解決Win10
- win10系統BackgroundTransferHost.exe佔用CPU高怎麼辦Win10
- Node.js 應用高 CPU 佔用率的分析方法Node.js
- win10系統下sppsvc.exe佔用cpu高如何關閉Win10
- CPU飆升?教你1分鐘抓取佔用系統資源的程式
- win10系統windows defender佔用CPU100怎麼解決Win10Windows
- 系統空閒程式佔用大量CPU是什麼原因_系統空閒程式CPU使用率高怎麼解決
- 麒麟系統修改網路卡名步驟和網路卡佔用故障處理
- Win10系統下Microsoft Compatibility Telemetry程式佔用cpu高怎麼禁用Win10ROS
- Win10系統Runtimebroker.exe程式佔用cpu非常高如何解決Win10
- win10系統Antimalware Service Executable程式佔用cpu過高如何解決Win10
- win10系統執行dnf時cpu佔用100怎麼解決Win10
- win10系統下wmiproviderhost程式佔用大量cpu過高如何解決Win10IDE
- Dubbo Hession反序列化導致CPU佔用飆高用例分析
- win10系統下Conhost.exe程式佔用cpu率很高如何解決Win10
- 故障解決:埠已被佔用 1080
- win10系統中WMI程式佔用cpu使用率高怎麼解決Win10
- Win10系統下火狐瀏覽器佔用CPU過高的解決方法Win10瀏覽器
- windows10系統下Windows event log佔用CPU使用率高怎麼辦Windows
- 在Docker中,如何控制容器佔用系統資源(CPU,記憶體)的份額?Docker記憶體
- win10系統keyshot佔用率高cpu怎麼辦_win10 keyshot佔用率100%cpu處理方法Win10
- java程式佔用cpu異常升高Java
- enq: TX - index contention故障修復一例ENQIndex
- Win10系統cpu佔有率很高的解決方法Win10
- win10系統中.NET Runtime Optimization Service佔用CPU使用率很高怎麼辦Win10
- win10系統中“來自microsoft download、upload host”程式佔用cpu高如何關閉Win10ROS
- MacPro系統佔用空間太大Mac
- java應用CPU佔用率過高排查Java