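Using oradebug short_stack and strace -p to check whether an Oracle process is dead or has failed
1, oradebug or strace -p can be used to trace a background or foreground process and see whether it is dead or hung
2, When a process fails, it normally writes fresh information to its trace (TRC) file, which gives important input for further analysis and diagnosis
The trace files are in the directory given by background_dump_dest
3, Use ll -lhrt *lgwr*|tail -10f to list the most recently modified trace files for the process (LGWR in this example)
4, A failure is also usually recorded in the alert log, so checking it is the first and most important troubleshooting step
5, oradebug setospid ospid
oradebug short_stack
This prints the process's call stack. Note: run it several times at intervals. If the stack is identical every time, the process is very likely hung or has failed, although an idle process waiting for work can also show an unchanging stack
6, strace -p ospid can also be used to trace and analyze the process,
---Output when the process is hung or has failed looks like this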
semtimedop(9273344, 0x7fffe66199d0, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable)
---Output when the process is healthy looks like this
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440015944
semtimedop(9273344, 0x7fffe661b1f0, 1, {1, 800000000}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016124
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016124
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016124
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016124
semtimedop(9273344, 0x7fffe661b1f0, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016424
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016424
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016424
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016424
semtimedop(9273344, 0x7fffe661b1f0, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016725
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016725
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016725
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440016725
semtimedop(9273344, 0x7fffe661b1f0, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0
times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440017025
open("/proc/4385/stat", O_RDONLY) = 35
read(35, "4385 (oracle) S 1 4385 4385 0 -1"..., 999) = 225
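As a rough way to tell the two traces above apart, the sketch below tallies which system calls appear in a window of strace output. The names `syscall_counts` and `only_semaphore_timeouts` are made up for illustration, and treating "nothing but semtimedop" as suspicious is a heuristic assumption rather than a documented Oracle rule: an idle but healthy process can also sit on a semaphore for a while.

```python
import re
from collections import Counter

# A strace line starts with the syscall name followed by "(".
SYSCALL = re.compile(r"^\s*(\w+)\(")

def syscall_counts(lines):
    """Count how often each system call appears in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def only_semaphore_timeouts(lines):
    """True if the window contains semtimedop calls and nothing else."""
    counts = syscall_counts(lines)
    return bool(counts) and set(counts) == {"semtimedop"}

# Lines copied from the two traces shown above.
hung = [
    "semtimedop(9273344, 0x7fffe66199d0, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable)",
]
normal = [
    "times({tms_utime=12, tms_stime=13, tms_cutime=0, tms_cstime=0}) = 440015944",
    "semtimedop(9273344, 0x7fffe661b1f0, 1, {1, 800000000}) = -1 EAGAIN (Resource temporarily unavailable)",
    "getrusage(RUSAGE_SELF, {ru_utime={0, 123981}, ru_stime={0, 132979}, ...}) = 0",
]

print(only_semaphore_timeouts(hung))    # True
print(only_semaphore_timeouts(normal))  # False
```

In practice the window should cover several seconds of output, since a single line says little on its own.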
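Put simply, check whether the output keeps changing: if it changes, the process is working normally; if it does not, something is wrong with the process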
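The same changed-or-unchanged check applies to the oradebug short_stack samples from step 5. The sketch below is illustrative only: it assumes each sample is a single line of frames joined by `<-`, and the frame names in the example are invented placeholders, not output captured from this system. `frames` and `stacks_unchanged` are hypothetical helper names.

```python
def frames(sample):
    """Split one short_stack line into its frames (offsets are kept)."""
    return [part.strip() for part in sample.split("<-") if part.strip()]

def stacks_unchanged(samples):
    """True if at least two samples were taken and all show the same frames."""
    if len(samples) < 2:
        return False  # a single sample says nothing about progress
    first = frames(samples[0])
    return all(frames(sample) == first for sample in samples[1:])

# Placeholder samples with hypothetical frame names, taken a few seconds apart.
stuck = ["wait_a()+10<-loop_b()+42<-main()+7"] * 3
moving = [
    "wait_a()+10<-loop_b()+42<-main()+7",
    "write_c()+5<-loop_b()+88<-main()+7",
]

print(stacks_unchanged(stuck))   # True
print(stacks_unchanged(moving))  # False
```

Offsets are kept in the comparison on purpose, so a process that moves within the same function still counts as changed.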
Test
SQL> select * from v$version where rownum=1;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
List the background processes
SQL> select pid,spid,pname,username from v$process order by 1;
PID SPID PNAME USERNAME
---------- ---------- ---------- ------------------------------
1
2 4385 PMON oracle
3 4387 VKTM oracle
4 4391 GEN0 oracle
5 4393 DIAG oracle
6 4395 DBRM oracle
7 4397 PSP0 oracle
8 4399 DIA0 oracle
9 4401 MMAN oracle
10 4403 DBW0 oracle
11 4405 LGWR oracle
PID SPID PNAME USERNAME
---------- ---------- ---------- ------------------------------
12 4407 CKPT oracle
13 4409 SMON oracle
14 4411 RECO oracle
15 4413 MMON oracle
16 4415 MMNL oracle
17 4417 D000 oracle
18 4419 S000 oracle
19 4652 SMCO oracle
20 5266 W000 oracle
21 4936 oracle
27 4468 ARC0 oracle
PID SPID PNAME USERNAME
---------- ---------- ---------- ------------------------------
28 4481 ARC1 oracle
29 4486 ARC2 oracle
30 4489 ARC3 oracle
31 4496 QMNC oracle
32 4549 Q000 oracle
33 4551 Q001 oracle
34 4568 oracle
29 rows selected.
SQL>
---Check the trace file directory
[oracle@seconary trace]$ ll -lhrt *lgwr*|tail -10f
-rw-r----- 1 oracle oinstall 213 Dec 14 19:05 guowang_lgwr_5297.trm
-rw-r----- 1 oracle oinstall 2.4K Dec 14 19:05 guowang_lgwr_5297.trc
-rw-r----- 1 oracle oinstall 2.3K Dec 15 01:05 guowang_lgwr_22295.trm
-rw-r----- 1 oracle oinstall 27K Dec 15 01:05 guowang_lgwr_22295.trc
-rw-r----- 1 oracle oinstall 63 Dec 15 02:18 guowang_lgwr_31280.trm
-rw-r----- 1 oracle oinstall 903 Dec 15 02:18 guowang_lgwr_31280.trc
-rw-r----- 1 oracle oinstall 63 Dec 15 02:44 guowang_lgwr_32077.trm
-rw-r----- 1 oracle oinstall 906 Dec 15 02:44 guowang_lgwr_32077.trc
-rw-r----- 1 oracle oinstall 62 Dec 15 03:27 guowang_lgwr_1032.trm
-rw-r----- 1 oracle oinstall 887 Dec 15 03:27 guowang_lgwr_1032.trc
---Hang LGWR (suspend it with oradebug)
SQL> oradebug setospid 4405
Oracle pid: 11, Unix process pid: 4405, image: oracle@seconary (LGWR)
SQL> oradebug suspend
Statement processed.
--The alert log records the event at the same time
Tue Dec 15 04:46:15 2015
Unix process pid: 4405, image: oracle@seconary (LGWR) flash frozen [ command #1 ]
---The trace directory shows a new trace file for the same event
[oracle@seconary trace]$ ll -lhrt *lgwr*|tail -10f
-rw-r----- 1 oracle oinstall 2.3K Dec 15 01:05 guowang_lgwr_22295.trm
-rw-r----- 1 oracle oinstall 27K Dec 15 01:05 guowang_lgwr_22295.trc
-rw-r----- 1 oracle oinstall 63 Dec 15 02:18 guowang_lgwr_31280.trm
-rw-r----- 1 oracle oinstall 903 Dec 15 02:18 guowang_lgwr_31280.trc
-rw-r----- 1 oracle oinstall 63 Dec 15 02:44 guowang_lgwr_32077.trm
-rw-r----- 1 oracle oinstall 906 Dec 15 02:44 guowang_lgwr_32077.trc
-rw-r----- 1 oracle oinstall 62 Dec 15 03:27 guowang_lgwr_1032.trm
-rw-r----- 1 oracle oinstall 887 Dec 15 03:27 guowang_lgwr_1032.trc
-rw-r----- 1 oracle oinstall 63 Dec 15 04:46 guowang_lgwr_4405.trm
-rw-r----- 1 oracle oinstall 896 Dec 15 04:46 guowang_lgwr_4405.trc
[oracle@seconary trace]$
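The directory check above can also be scripted. A minimal sketch, assuming the trace files sit in one directory and match a glob such as `*lgwr*`; `newest_trace_files` is a hypothetical helper name, and the `"."` in the usage lines is a placeholder for the value of background_dump_dest.

```python
from pathlib import Path

def newest_trace_files(directory, pattern="*lgwr*", count=10):
    """Return up to `count` matching files, oldest first (like ls -rt | tail)."""
    if count <= 0:
        return []
    files = [p for p in Path(directory).glob(pattern) if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime)  # sort by modification time
    return files[-count:]

if __name__ == "__main__":
    # Placeholder directory: substitute the value of background_dump_dest.
    for path in newest_trace_files("."):
        print(path.name)
```

The last entry returned is the most recently written file, which is the one to open first after a suspected hang.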
From the ITPUB blog, link: http://blog.itpub.net/31383567/viewspace-2144755/. If you repost this article, please credit the source; otherwise legal action may be taken.