How to Dump Oracle System State with a Debugger

Posted by hd_system on 2017-05-02

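If an Oracle database hangs, taking a systemstate dump or running hang analyze is an effective way to investigate and resolve the problem; at the very least it gives you more useful information to include when filing an SR. If you can still connect to the database and run commands, oradebug is the simple and quick way to do it.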
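For the case where a connection is still possible, the oradebug route can be sketched as a small shell helper. This is only an illustration: the function name `oradebug_cmds` is hypothetical, and the usage line assumes OS authentication as SYSDBA is available on the host.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): print the oradebug
# commands for a systemstate dump.
# $1 = dump level (defaults to 10, the level used later in this article).
oradebug_cmds() {
    level="${1:-10}"
    printf 'oradebug setmypid\n'                      # attach oradebug to our own session
    printf 'oradebug unlimit\n'                       # lift the trace file size limit
    printf 'oradebug dump systemstate %s\n' "$level"  # take the dump
    printf 'oradebug tracefile_name\n'                # show where the dump was written
    printf 'exit\n'
}

# Usage sketch (not executed here; assumes OS authentication as SYSDBA):
# oradebug_cmds 10 | sqlplus -S "/ as sysdba"
```

Generating the commands separately from running them makes it easy to review exactly what will be sent to sqlplus before touching a troubled instance.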

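Sometimes, though, the database is hung so badly that sqlplus cannot connect (on 10g you can try connecting with sqlplus -prelim). In that case you can use an operating-system debugger to dump the Oracle system state. In an earlier article, dbx was used to take a systemstate dump, which revealed the root cause and eventually led to the fix. Here is the dbx session from that case: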

# dbx -a 446910
Waiting to attach to process 446910 ...
Successfully attached to oracle.
Type 'help' for help.
reading symbolic information ...
stopped in iosl.select at 0x9000000000c94d8 ($t2)
0x9000000000c94d8 (select+0xfffffffffff06318) e8410028 ld r2,0x28(r1)
(dbx) print ksudss(10)

Segmentation fault in slrac at 0x100083aa0 ($t2)
0x100083aa0 (slrac+0xe4) 88030000 lbz r0,0x0(r3)
(dbx) detach

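As the session above shows, the procedure for taking a dump with dbx is:

  • Find the process ID of the misbehaving process, for example one with very high CPU usage or one that is hung. For a system-wide systemstate dump, another Oracle process will also do.
  • dbx -a <process ID>
  • print ksudss(10) -- this directly calls the ksudss function inside the Oracle process with dump level 10, which is equivalent to running oradebug dump systemstate 10 in sqlplus
  • detach
  • quit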

On Linux you can use gdb. Here is an example:

[oracle@xty ~]$ ps -ef | grep LOCAL
oracle 3765 3764 1 05:55 ? 00:00:00 oraclexty (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 3767 3668 0 05:55 pts/2 00:00:00 grep LOCAL
[oracle@xty ~]$ gdb $ORACLE_HOME/bin/oracle 3765
GNU gdb Red Hat Linux (6.1post-1.20040607.62rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".

Attaching to program: /u01/app/oracle/product/10.1.0/db_1/bin/oracle, process 3765
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libskgxp10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libhasgen10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libhasgen10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libskgxn2.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libskgxn2.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocr10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocr10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocrb10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocrb10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocrutl10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocrutl10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libjox10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libjox10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libclsra10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libclsra10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libdbcfg10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libdbcfg10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libnnz10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libnnz10.so
Reading symbols from /usr/lib/libaio.so.1...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread -1219938624 (LWP 3765)]
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
0x006967a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) print ksudss(10)
[Switching to Thread -1219938624 (LWP 3765)]
$1 = 213658428
(gdb) detach
Detaching from program: /u01/app/oracle/product/10.1.0/db_1/bin/oracle, process 3765
(gdb) quit
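
The interactive gdb session above could also be driven from a command file. The following is a minimal sketch under assumptions: `build_dump_cmds` and the `/tmp/ssdump.gdb` path are hypothetical names, while `-batch` and `-x` are standard gdb options. It has not been tested against an Oracle process, and the same caution applies as for the interactive session.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): emit the debugger
# commands for a systemstate dump.
# $1 = dump level (defaults to 10, the level used in the session above).
build_dump_cmds() {
    level="${1:-10}"
    printf 'print ksudss(%s)\n' "$level"  # call the dump function in the attached process
    printf 'detach\n'                     # release the process so it keeps running
    printf 'quit\n'
}

# Usage sketch (not executed here): attach to process 3765 in batch mode.
# build_dump_cmds 10 > /tmp/ssdump.gdb
# gdb -batch -x /tmp/ssdump.gdb "$ORACLE_HOME/bin/oracle" 3765
```

Keeping the command list in a file keeps the time the process spends stopped under the debugger short, since nothing has to be typed by hand.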

We can then find the trace file containing the dump output:

[oracle@xty ~]$ cd $ORACLE_BASE/admin/xty/udump
[oracle@xty udump]$ ls -lrt | grep 3765
-rw-r----- 1 oracle oinstall 599705 Nov 21 05:56 xty_ora_3765.trc

Depending on which process the debugger attached to, the trace file is normally written to either the user_dump_dest or the background_dump_dest directory.
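
Since the trace file name ends with the process ID, locating it can be scripted. This is a small sketch, assuming the `<sid>_<name>_<pid>.trc` naming seen in the listing above; `find_trace` is a hypothetical helper name, and the directory is whatever user_dump_dest or background_dump_dest points to on your system.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): print the newest
# trace file for a given OS process ID.
# $1 = dump directory, $2 = process ID.
find_trace() {
    dir="$1"
    pid="$2"
    # List matching files newest first and keep only the first one.
    # Prints nothing if no file matches.
    ls -t "$dir"/*_"$pid".trc 2>/dev/null | head -n 1
}

# Usage sketch:
# find_trace "$ORACLE_BASE/admin/xty/udump" 3765
```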

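So it is gdb on Linux and dbx on AIX; what about HP-UX? There you can use HP's wdb (details on HP WDB and the latest version are available for download from HP). On Solaris, dbx or gdb are also available (each platform has several debuggers; others include adb and mdb). Interested readers may want to try them.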

Besides the systemstate dump shown above, can other dumps be taken this way? Yes. Here are some dump-related functions:

print ksdhng(3,1,0) equivalent to oradebug hanganalyze 3
print ksudps(10) equivalent to oradebug dump processstate 10
print curdmp() equivalent to oradebug call curdmp (that is, oradebug dump cursordump)
print ksdtrc(4) equivalent to oradebug dump events 4 (the argument is the level: 1 = session, 2 = process, 4 = system)
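
The correspondences above can be captured in a lookup helper. This is a sketch only: `debugger_call` is a hypothetical name, and treating the first argument of ksdhng as the hanganalyze level generalizes from the single example listed, which is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): translate an
# oradebug-style dump name into the debugger expression listed above.
# $1 = dump name, $2 = level (ignored for cursordump).
debugger_call() {
    name="$1"
    level="$2"
    case "$name" in
        systemstate)  printf 'print ksudss(%s)\n' "$level" ;;
        # Assumption: the first ksdhng argument is the hanganalyze level.
        hanganalyze)  printf 'print ksdhng(%s,1,0)\n' "$level" ;;
        processstate) printf 'print ksudps(%s)\n' "$level" ;;
        cursordump)   printf 'print curdmp()\n' ;;
        events)       printf 'print ksdtrc(%s)\n' "$level" ;;
        *)            printf 'unknown dump: %s\n' "$name" >&2; return 1 ;;
    esac
}

# Usage sketch:
# debugger_call hanganalyze 3
```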

Not everything listed above is necessarily useful for diagnosing a hang; I just found these calls interesting. ^_^ Some other interesting ones:

print ksdsel(10046,12) -- equivalent to setting event 10046 at level 12 for the attached process
print skdxipc() -- equivalent to oradebug ipc
print skdxprst() -- equivalent to oradebug procstat

Note: do not run or test any of this on a production system that is running normally.

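Of course, if oradebug is available, use oradebug: it is far more convenient and also safer. This article only extends the options for taking dumps with a debugger, as a reference for readers who want to explore further.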

Addendum: taking a system dump with gdb on HP-UX can be more involved; see Oracle Metalink Doc 273324.1.

Source: ITPUB Blog, link: http://blog.itpub.net/29209863/viewspace-2138301/. Please credit the source when reposting; otherwise legal action may be taken.
