Oracle診斷事件例項(一)

xz43發表於2010-12-10
1、設定SQL跟蹤事件
使用dbmsutil.sql指令碼來建立dbms_system包。
sid:目標session的sid;
serial#:目標session的序列號(serial number);
以上sidserial#資訊,可以透過v$session檢視獲得。
event:Oracle資料庫中內部定義的事件程式碼;
level:dump資訊的級別;當做一個塊的dump時,這個level被使用指定該data block address;
action:該跟蹤事件的Action;
例:跟蹤一個SQL語句(使用10046的12級)的事件,若該語句對應的Session的sid為8,serial#為219,則可以在SQL*Plus中執行如下指令碼:
SQL>execute sys.dbms_system.set_ev(8,219,10046,12,'');
 
2、使用oradebug跟蹤事件
首先,需要定位你想oradebug的後臺程式:
SQL>select spid from v$process where username = 'oracle';
然後,確認該程式spid為3514在作業系統級別存在:
ps -ef | grep 3514  
最後,使用oradebug來dump該程式的狀態資訊:
SQL>oradebug setospid 3514
SQL>oradebug unlimit
SQL>oradebug dump processstate 10
注意:
當我們dump程式狀態,系統狀態等等的時候,使用的語法如下:
oradebug dump
是不帶level關鍵字的,例如
SQL>oradebug dump processstate 1
 
oradebug與procedure set_ev對應關係如下:
oradebug dump
sys.dbms_system.set_ev(sid,serial#,65535,,);
 
3、immediate dump控制檔案control file
SQL>oradebug setospid 4081
SQL>oradebug dump controlf 10
或者
SQL>sys.dbms_system.set_ev(7,10,65535,10,'controlf');
備註:除了上面的controlf,其他的immediate dump包括 file_hdrs, redohdr, processstate, 和 systemstate
 
4、Error Stack的dump levels
這裡假設作業系統程式Id為4018,以下為設定當前process的事件,它對共享伺服器程式和並行查都是有用的。
SQL> oradebug setospid 4018
SQL> oradebug unlimit
SQL> oradebug dump errorstack 1
 
5、改變行為的事件
改變行為的事件通常被設定進引數檔案。使用的語法和跟蹤事件的完全一致。
例,設定該event事件來阻止SMON程式coalescing空閒空間:
event = "10269 trace name context forever, level 10"
 
6、10046事件
10046事件等同於sql_trace=true的作用。
使用該事件的好處是透過執行該事件的level,能夠把更詳細的內部資訊輸出到跟蹤檔案中。
SQL> alter session set events '10046 trace name context forever, level 12';
或者
SQL> alter system set events '10046 trace name context forever, level 12';
作用:
Level 1 提供標準的sql_trace功能;
Level 4 除了1外,還增加了繫結變數跟蹤功能;
Level 8 除了1外,還能跟蹤wait事件,透過這個來發現全表掃描和索引掃描。
Level12 除了1外,還有繫結變數跟蹤與wait事件跟蹤。
 
7、Events與Spfile的關係
ALTER SYSTEM命令一直是作為靜態的改變;不允許指定scope=momery。disable或者alter列表中的單個event,則整個列表會被重新加入。
Disable所有的events,語法如下:
SQL> alter system reset event scope=SPFILE sid='*';
上面的“*”可以是RAC環境中的一個例項名。
例:
SQL> alter system set event ='10325 trace name context forever, level 10 :10015 trace name context forever, level 1' comment = 'Debug tracing of control and rollback' scope = SPFILE;
 
8、hanganalyze event
這是一個資料庫被hang住,或者發生死鎖時的典型事件,它能夠基於ORA-60錯誤被設定到trigger中。hanganalyze試圖找出誰在等待被誰建立的等待鏈,然後根據不同程式的level來dump它們的錯誤堆。這就好比手工去做但更快速。
注意:
hanganalyze不是特意為了診斷應用級死鎖而提供的。
這個hanganalyze事件能夠透過以下三種方法來發起:
a。SQL> ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level 4';
b。EVENT="60 trace name HANGANALYZE level 5" (instance level, triggered by ORA-60)
c。SQL> ORADEBUG hanganalyze 10
這裡的level決定了哪些程式被dump到errorstack中,主要的Level有:
10 Dump all processes (voluminous data output, not a good idea)
5 Dump all processes involved in wait chains (can still produce a lot of output)
4 Dump leaf nodes in wait chains
3 Dump only processes thought to be in a hang situation
2 Minimal output
1 Very minimal output
 
9、執行immediate系統狀態dump
SQL> alter session set events 'immediate trace name systemstate level 10';
或者,使用以下方式用oradebug來處理當前process:
SQL> oradebug setospid
SQL> oradebug dump systemstate 10
 
10、閱讀System state的dump檔案
A typical system state dump consists of three sections; the trace file header, system global information and process information. Of the three, the process information section is the most important for root cause investigations. The heading in the file for this section is “System State.” The first process state objects listed under this heading are the Oracle background processes. User processes (client) generally follow as do the other types of state objects (session, call, enqueue, etc.). The next few slides show a few of the different type of state objects as they would appear in a dump.

 
                  Process State  Object
 
SYSTEM STATE
------------
SO: 40002c530, type: 1, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=6, calls cur/top: 40005c698/40005c698, flag: (a) SYSTEM
  int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 51
              last post received-location: ktmchg
              last process to post me: 40002ccf0 2 0
              last post sent: 79 0 4
              last post sent-location: kslpsr
              last process posted by me: 40002ccf0 2 0
    (latch info) wait_event=0 bits=0
 O/S info: user: user1, term: ?, ospid: 32494
 OSD pid info:Unix process pid: 32494,image:ora_smon_u01
解讀如下:
Line 3: The state object ID, type (1=process), and owner of the process.
Line 4: The Oracle process id (6) from v$process.pid, address of the current call (cur) from the state object and the top level call's address (top). Some possible flag values are:
0x01: Process dead and will be cleaned up
0x02: SYSTEM process
0x04: The process is PMON
0x08: The process is SMON
0x10: The process is the PSEUDO process
0x20: Oracle Shared server
0x40: Dispatcher process
Line 5: The error code (ora-nnnn) to be raised on the next interrupt (int error), the error code to return on user call (call error) and the Oracle error returned on session creation (sess error.)
Lines 6-11: The inter-process posting information.
Line 12: Latch info dumped by ksldmp, includes events waiting for (wait event) and the bit array of any latches owned.
Lines 13-14: O/S data (user, process ID and Unix process name.)
 
11、閱讀Session State 的dump檔案
                Session State Object
 
SO:400039930,type:3, owner:40002c530,flag: INIT/-/-/0x00
  (session) trans: 0, flag: (51) USR/- BSY/-/-/-/-/-
DID: 0001-0006-00000001, short-term DID: 0000-0000-00000000
 txn branch: 0
 oct: 0, prv: 0, user: 0/SYS
waiting for 'smon timer' seq=6639 wait_time=0  
sleep time=12c, failed=0,
解讀如下:
Line 1: The state object id and owner are stated here.
Line 2: Address of transaction SO (trans), session type (flag), some possible values are:
0x0001: A user session rather than a recursive session
0x0002: A recursive session (always internal)
0x0008: Disable commit/rollback from plsql
0x0010: User session created by system processes
0x0020: Set when the UGA is allocated in the SGA
0x0040: User session logs on to ORACLE
0x0080: User session created by Oracle Shared Server
Some descriptive letter codes follow. Possible values are:
USR: User session
BSY: Session is busy. It's in a call.
DED: Session marked dead by user process.
DEL: Session being deleted (through alter system kill session).
KIL: Session marked for kill (through alter system kill session).
Lines 3-4: Resource id (DID) information and transaction relationships are shown here.
Line 5: Oracle command type (oct), user privileges (priv), and Oracle user are shown.
Lines 6-7: Resource waited for, wait and sleep times are displayed here.
 
12、診斷hang或者loop是非常有用的檢視及資料字典如下
V$SESSION_WAIT, V$SESSION_EVENT
V$LATCH, V$LATCHHOLDER, V$LATCHNAME
V$SYSSTAT, V$LOCK
$PROCESS, V$SESSION, V$TRANSACTION
X$KCBFWAIT (buffer waits), X$KSQST (enqueues)
 
 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/9399028/viewspace-681507/,如需轉載,請註明出處,否則將追究法律責任。

相關文章