系統cpu佔用超高故障分析一例
下班的時候突然接到電話,通知一臺主機資源佔用超過,同時手機簡訊也不停的報session數,趕快連上伺服器檢視具體原因:
1、 通過top可以發現系統CPU資源佔用100%。
查當前程式數,發現比平時多了100多
Select * from v$license;
SESSIONS_MAX SESSIONS_WARNING SESSIONS_CURRENT SESSIONS_HIGHWATER USERS_MAX
------------ ---------------- ---------------- ------------------ ----------
0 0 234 246 0
查session等待事件,大部分session都在做latch free等待
SQL> select sid,event,P1TEXT,state from v$session_wait;
SID EVENT P1TEXT STATE
------ ------------------------------ ------------------------------ -------------------
32 latch free address WAITED KNOWN TIME
38 latch free address WAITED KNOWN TIME
41 latch free address WAITING
57 latch free address WAITING
89 latch free address WAITED KNOWN TIME
93 latch free address WAITED KNOWN TIME
111 latch free address WAITED KNOWN TIME
118 latch free address WAITING
137 latch free address WAITED KNOWN TIME
201 latch free address WAITING
200 latch free address WAITED KNOWN TIME
194 latch free address WAITED KNOWN TIME
186 latch free address WAITED KNOWN TIME
182 latch free address WAITED KNOWN TIME
177 latch free address WAITING
163 latch free address WAITING
147 latch free address WAITED KNOWN TIME
146 latch free address WAITED KNOWN TIME
238 latch free address WAITED KNOWN TIME
236 latch free address WAITED KNOWN TIME
233 latch free address WAITED KNOWN TIME
225 latch free address WAITED KNOWN TIME
221 latch free address WAITED KNOWN TIME
211 latch free address WAITING
209 latch free address WAITED KNOWN TIME
207 latch free address WAITED KNOWN TIME
204 latch free address WAITING
261 latch free address WAITED KNOWN TIME
257 latch free address WAITED KNOWN TIME
255 latch free address WAITED KNOWN TIME
253 latch free address WAITED KNOWN TIME
251 latch free address WAITED KNOWN TIME
241 latch free address WAITING
239 latch free address WAITED KNOWN TIME
119 latch free address WAITED KNOWN TIME
113 latch free address WAITED KNOWN TIME
97 latch free address WAITING
92 latch free address WAITED KNOWN TIME
88 latch free address WAITED KNOWN TIME
87 latch free address WAITED KNOWN TIME
68 latch free address WAITED KNOWN TIME
2、 查詢佔用cpu的程式情況,大量程式佔用都很高,如3720.
SQL> SELECT /*+ ordered */ p.spid, s.sid, s.serial#, s.username, s.program,s.status,TO_CHAR(s.logon_time, 'mm-dd-yyyy hh24:mi') logon_time, s.last_call_et, st.value, s.sql_hash_value, s.sql_address, sq.child_number ,sq.sql_text
2 FROM v$statname sn, v$sesstat st, v$process p, v$session s, v$sql sq
3 WHERE s.paddr=p.addr
4 AND s.sql_hash_value = sq.hash_value and s.sql_Address = sq.address
5 AND s.sid = st.sid
6 AND st.STATISTIC# = sn.statistic#
7 AND sn.NAME = 'CPU used by this session'
8 AND p.spid = &osPID -- parameter to restrict for a specific PID
9 -- AND s.status = 'ACTIVE'
10 ORDER BY st.value desc;
Enter value for ospid: 3720
SPID SID SERIAL# USERNAME PROGRAM STATUS LOGON_TIME LAST_CALL_ET VALUE SQL_HASH_VALUE SQL_ADDRESS CHILD_NUMBER
------------ ------ ------- ---------- -------------------------------- -------- ---------------- ------------ ---------- -------------- ---------------- ------------
SQL_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
3720 97 17856 CCATSUPT JDBC Thin Client ACTIVE 02-21-2011 15:28 3413 16088 1113672989 C000000354738950 0
SELECT /*+ INDEX (C, I_SVR_PUB_DA_MAINQUEUE_HIS_FRT) */ COUNT(1) FROM Svr_pub_da_MainQueue_his c,pub_Specialty k WHERE c.Business in ( 'D46C2BC08404D1211DFA6F7BA8DCB9DB', '293C7B04CD7B0FFD56B255CC2585E16F') AND c.specialty = k.specialtyid AND c.FirstReceptTime BETWEEN TO_DATE('2011-01-21 00:00:00', 'yyyy-MM-dd HH24:MI:SS') AND TO_DATE('2011-02-22 00:00:00', 'yyyy-MM-dd HH24:MI:SS') AND ( k.TreeCode LIKE '0109%') AND EXISTS (SELECT 1 FROM org_unit ou WHERE c.sourcedept = ou.unitid AND ou.TreeCode LIKE '0001000301970003%')
3、檢視該sql執行計劃,存在開銷極大的nested loop,檢查各個表的資料量後,懷疑是統計資訊出問題,走了不該走的索引。
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 34498 | | |
| 1 | SORT AGGREGATE | | 1 | 203 | | | |
| * 2 | TABLE ACCESS BY GLOBAL INDEX ROWID | SVR_PUB_DA_MAINQUEUE_HIS | 1 | 101 | 34493 | ROW L | ROW L |
| 3 | NESTED LOOPS | | 1 | 203 | 34498 | | |
| 4 | MERGE JOIN CARTESIAN | | 1 | 102 | 5 | | |
| 5 | TABLE ACCESS BY INDEX ROWID | PUB_SPECIALTY | 1 | 46 | 3 | | |
| * 6 | INDEX RANGE SCAN | I_PUB_SPECIALTY_TR | 1 | | 2 | | |
| 7 | BUFFER SORT | | 1 | 56 | 2 | | |
| 8 | SORT UNIQUE | | | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID | ORG_UNIT | 1 | 56 | 2 | | |
| * 10 | INDEX RANGE SCAN | ORG_UNIT_I01 | 1 | | 1 | | |
| * 11 | INDEX RANGE SCAN | I_SVR_PUB_DA_MAINQUEUE_HIS_FRT | 40241 | | 119 | | |
-------------------------------------------------------------------------------------------------------------------------
4、更新統計資訊
analyze table PUB_SPECIALTY compute statistics for table for all indexed columns for all indexes;
analyze table org_unit compute statistics for table for all indexed columns for all indexes;
5、此時再次執行sql,檢視執行計劃,發現已經已無nested loop
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=39980 Card=1 Bytes=2
00)
1 0 SORT (AGGREGATE)
2 1 HASH JOIN (Cost=39980 Card=1811 Bytes=362200)
3 2 TABLE ACCESS (FULL) OF 'PUB_SPECIALTY' (Cost=4 Card=17
0 Bytes=7480)
4 2 HASH JOIN (Cost=39975 Card=2059 Bytes=321204)
5 4 SORT (UNIQUE)
6 5 TABLE ACCESS (FULL) OF 'ORG_UNIT' (Cost=10 Card=21
Bytes=1155)
7 4 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'SVR_PUB_DA_
MAINQUEUE_HIS' (Cost=39951 Card=16097 Bytes=1625797)
8 7 INDEX (RANGE SCAN) OF 'I_SVR_PUB_DA_MAINQUEUE_HIS_
FRT' (NON-UNIQUE) (Cost=158 Card=40241)
6、查詢等待事件,latch free的session仍然存在,由於走錯了執行計劃,所以決定殺掉這些session,殺掉session後系統恢復正常
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/11088128/viewspace-687675/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- cpu故障現象分析 CPU常見故障案例
- cisco4506 cpu佔用達99%解決一例
- win10系統perl.exe佔用cpu怎麼解決_win10系統perl.exe佔用cpu如何處理Win10
- win10 2004系統cpu佔用高怎麼辦_win10 2004系統佔用cpu過高解決教程Win10
- ORACLE程式佔用CPU情況分析(轉載)Oracle
- 如何解決Windows8系統開機CPU佔用率高硬碟燈長閃故障Windows硬碟
- windows10系統下backgroundTaskHost佔用大量CPU如何解決Windows
- win10系統cpu溫度高怎麼辦 w10系統cpu溫度超高如何處理Win10
- 故障分析 | MySQL鎖等待超時一例分析MySql
- 程式佔用cpu排序排序
- win10系統explorer佔用cpu很高的解決方法Win10
- win10系統BackgroundTransferHost.exe佔用CPU高怎麼辦Win10
- Node.js 應用高 CPU 佔用率的分析方法Node.js
- CPU飆升?教你1分鐘抓取佔用系統資源的程式
- win10系統玩流放之路cpu佔用100%如何解決Win10
- 在Linux中,如何查詢系統中佔用CPU最高的程序?Linux
- 一例資料倉儲執行update引起CPU佔用超過90%問題
- 系統空閒程式佔用大量CPU是什麼原因_系統空閒程式CPU使用率高怎麼解決
- 10g中佔用CPU很高異常oracle程式分析Oracle
- win10系統windows defender佔用CPU100怎麼解決Win10Windows
- win10系統Antimalware Service Executable程式佔用cpu過高如何解決Win10
- win10系統下sppsvc.exe佔用cpu高如何關閉Win10
- Win10系統chrome瀏覽器佔用CPU很高的解決方法Win10Chrome瀏覽器
- 麒麟系統修改網路卡名步驟和網路卡佔用故障處理
- 故障分析 | MySQL 耗盡主機記憶體一例分析MySql記憶體
- Dubbo Hession反序列化導致CPU佔用飆高用例分析
- ORA-00904故障分析與解決一例
- Win10系統Runtimebroker.exe程式佔用cpu非常高如何解決Win10
- win10系統執行dnf時cpu佔用100怎麼解決Win10
- win10系統下wmiproviderhost程式佔用大量cpu過高如何解決Win10IDE
- 故障分析 | MySQL 備份檔案靜默損壞一例分析MySql
- C#獲取CPU佔用率、記憶體佔用、磁碟佔用、程式資訊C#記憶體
- java程式佔用cpu異常升高Java
- db2sysc CPU佔用90%DB2
- 執行sed命令卡死CPU消耗100%一例分析
- MySQL SLAVE故障一例MySql
- 網路故障一例
- win10系統下Conhost.exe程式佔用cpu率很高如何解決Win10