使用Linux Strace跟蹤除錯Oracle程式程式
所謂作業系統,是應用程式與伺服器硬體進行溝通的中間層。應用程式的所有操作,都是和作業系統進行溝通互動。作業系統負責將所有互動轉化為裝置語言,進行硬體互動。
我們在進行Oracle故障除錯和核心原理工作的時候,經常需要了解後臺執行的動作和細節。一些故障場景,如程式程式hang住、無法登陸等問題,就需要作業系統級別監控,檢查定位問題。
Oracle自身已經提供了很多這型別的工具,如oradebug、各種等待事件和跟蹤方式。此外,各型別的作業系統提供出很多系統級別工具,幫助我們進行監控。本篇主要介紹Linux環境中的strace工具典型使用方法,留待需要朋友待查。
1、環境介紹
筆者使用紅帽Red Hat 6.5進行測試。
[root@XXX-Standby-Asm ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
[root@XXX-Standby-Asm ~]# uname -a
Linux XXX-Standby-Asm 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
對應使用Oracle版本為11.2.0.4。
[oracle@XXX-Standby-Asm ~]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Tue Jun 16 16:39:21 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 – Production
2、Strace命令資訊
使用-h引數,可以呼叫出strace命令的語法格式和各個使用引數。
[oracle@XXX-Standby-Asm ~]$ strace -h
usage: strace [-dDffhiqrtttTvVxx] [-a column] [-e expr] ... [-o file]
[-p pid] ... [-s strsize] [-u username] [-E var=val] ...
[command [arg ...]]
or: strace -c [-D] [-e expr] ... [-O overhead] [-S sortby] [-E var=val] ...
[command [arg ...]]
-c -- count time, calls, and errors for each syscall and report summary
-f -- follow forks, -ff -- with output into separate files
-F -- attempt to follow vforks, -h -- print help message
-i -- print instruction pointer at time of syscall
-q -- suppress messages about attaching, detaching, etc.
-r -- print relative timestamp, -t -- absolute timestamp, -tt -- with usecs
-T -- print time spent in each syscall, -V -- print version
-v -- verbose mode: print unabbreviated argv, stat, termio[s], etc. args
-x -- print non-ascii strings in hex, -xx -- print all strings in hex
-a column -- alignment COLUMN for printing syscall results (default 40)
-e expr -- a qualifying expression: option=[!]all or option=[!]val1[,val2]...
options: trace, abbrev, verbose, raw, signal, read, or write
-o file -- send trace output to FILE instead of stderr
-O overhead -- set overhead for tracing syscalls to OVERHEAD usecs
-p pid -- trace process with process id PID, may be repeated
-D -- run tracer process as a detached grandchild, not as parent
-s strsize -- limit length of print strings to STRSIZE chars (default 32)
-S sortby -- sort syscall counts by: time, calls, name, nothing (default time)
-u username -- run command as username handling setuid and/or setgid
-E var=val -- put var=val in the environment for command
-E var -- remove var from the environment for command
在諸多引數中,比較常用的有如下幾個:
ü -t引數:可以顯示呼叫作業系統過程的時間。這個在診斷中很重要,如果發現某一個執行過程時間消耗大,就可以定位到是什麼問題;
ü -p引數:strace除了可以從程式啟動時候開始監控之外,還可以根據作業系統程式編號,進入一個執行中的程式進行debug;
ü -o引數:輸出結果如果需要以檔案方式儲存,就透過-o引數指定檔名稱;
ü -e引數:監控表示式。提供trace、signal、read和write等內容,反映監控方面專案。常用的引數是trace=file,集中在作業系統檔案訪問層面。
下面,透過兩個典型應用場景,來演示一下strace應用。
3、啟動程式監控
strace監控研究的一個重要範例是sqlplus,很多前輩使用這個工具來測試跟蹤sqlplus啟動過程訪問的檔案形式。啟動strace監控sqlplus,就是在strace命令後直接輸入sqlplus命令即可。
--啟動命令
[oracle@localhost ~]$ strace -t -e trace=file sqlplus /nolog
14:21:14 execve("/u01/app/oracle/bin/sqlplus", ["sqlplus", "/nolog"], [/* 31 vars */]) = 0 –啟動命令
14:21:14 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
14:21:14 open("/u01/app/oracle/lib/tls/x86_64/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
14:21:14 stat("/u01/app/oracle/lib/tls/x86_64", 0x7fff3076da50) = -1 ENOENT (No such file or directory)
14:21:14 open("/u01/app/oracle/lib/tls/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
14:21:14 stat("/u01/app/oracle/lib/tls", 0x7fff3076da50) = -1 ENOENT (No such file or directory)
14:21:14 open("/u01/app/oracle/lib/x86_64/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
14:21:14 stat("/u01/app/oracle/lib/x86_64", 0x7fff3076da50) = -1 ENOENT (No such file or directory)
14:21:14 open("/u01/app/oracle/lib/libsqlplus.so", O_RDONLY) = 3
14:21:14 open("/u01/app/oracle/lib/libclntsh.so.11.1", O_RDONLY) = 3
14:21:14 open("/u01/app/oracle/lib/libnnz11.so", O_RDONLY) = 3
14:21:14 open("/u01/app/oracle/lib/libdl.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
(篇幅原因,有省略……)
14:21:14 access("/etc/tnsnav.ora", F_OK) = -1 ENOENT (No such file or directory)
14:21:14 access("/u01/app/oracle/network/admin/tnsnav.ora", F_OK) = -1 ENOENT (No such file or directory)
14:21:14 open("/u01/app/oracle/oracore/zoneinfo", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
14:21:14 open("/u01/app/oracle/oracore/zoneinfo/timezlrg_1.dat", O_RDONLY) = 7
14:21:14 open("/u01/app/oracle/oracore/zoneinfo/timezlrg_2.dat", O_RDONLY) = 7
(讀取各種時區資料……)
14:21:14 open("/u01/app/oracle/oracore/zoneinfo/timezone_14.dat", O_RDONLY) = 7
14:21:14 open("/u01/app/oracle/oracore/zoneinfo/timezlrg_14.dat", O_RDONLY) = 6
14:21:14 open("/usr/lib64/libnuma.so", O_RDONLY) = -1 ENOENT (No such file or directory)
14:21:14 open("/usr/lib64/libnuma.so.1", O_RDONLY) = 6
14:21:14 open("/proc/self/status", O_RDONLY) = 6
14:21:14 open("/sys/devices/system/node", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
14:21:14 open("/sys/devices/system/node/node0/meminfo", O_RDONLY) = 7
14:21:14 open("/sys/devices/system/cpu", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
14:21:14 open("/proc/self/status", O_RDONLY) = 6
14:21:14 open("/proc/sys/kernel/shmmax", O_RDONLY) = 6
14:21:14 open("/proc/meminfo", O_RDONLY) = 6
14:21:14 open("/usr/lib64/libnuma.so", O_RDONLY) = -1 ENOENT (No such file or directory)
14:21:14 open("/etc/localtime", O_RDONLY) = 6
14:21:14 open("/u01/app/oracle/rdbms/mesg/ocius.msb", O_RDONLY) = 6
SQL*Plus: Release 11.2.0.4.0 Production on Fri Jun 12 14:21:14 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
--啟動之後,讀取的檔案。
14:21:14 stat("login.sql", 0x7fff3076c740) = -1 ENOENT (No such file or directory)
14:21:14 stat("/u01/app/oracle/sqlplus/admin/glogin.sql", {st_mode=S_IFREG|0644, st_size=368, ...}) = 0
14:21:14 access("/u01/app/oracle/sqlplus/admin/glogin.sql", F_OK) = 0
14:21:14 statfs("/u01/app/oracle/sqlplus/admin/glogin.sql", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=174170552, f_bfree=147038320, f_bavail=138190973, f_files=44236800, f_ffree=44019067, f_fsid={1821877354, 763674030}, f_namelen=255, f_frsize=4096}) = 0
14:21:14 open("/u01/app/oracle/sqlplus/admin/glogin.sql", O_RDONLY) = 7
如果在啟動sqlplus過程中,直接進行登入。跟蹤過程中還包括了登入動作。
[oracle@localhost ~]$ strace -t -e trace=file sqlplus / as sysdba
14:29:30 execve("/u01/app/oracle/bin/sqlplus", ["sqlplus", "/", "as", "sysdba"], [/* 31 vars */]) = 0
(相同篇幅內容,略……)
SQL*Plus: Release 11.2.0.4.0 Production on Fri Jun 12 14:29:30 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
--差異部分
14:29:30 open("/u01/app/oracle/dbs/cm_xxxdb.dat", O_RDONLY|O_DSYNC) = -1 ENOENT (No such file or directory)
14:29:30 getcwd("/home/oracle", 256) = 13
14:29:30 open("/proc/self/cmdline", O_RDONLY) = 7
14:29:30 open("/etc/nsswitch.conf", O_RDONLY) = 7
14:29:30 open("/u01/app/oracle/lib/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
14:29:30 open("/etc/ld.so.cache", O_RDONLY) = 7
14:29:30 open("/lib64/libnss_files.so.2", O_RDONLY) = 7
14:29:30 open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7
14:29:30 open("/u01/app/oracle/lib/libnque11.so", O_RDONLY) = 7
14:29:30 getcwd("/home/oracle", 256) = 13
14:29:30 open("/u01/app/oracle/nls/data/lx20369.nlb", O_RDONLY) = 7
14:29:30 open("/u01/app/oracle/nls/data/lx207d0.nlb", O_RDONLY) = 7
14:29:30 readlink("/proc/self/fd/0", "/dev/pts/0", 254) = 10
14:29:30 open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7
14:29:30 open("/etc/host.conf", O_RDONLY) = 7
14:29:30 open("/etc/resolv.conf", O_RDONLY) = 7
14:29:30 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 7
14:29:30 open("/etc/hostid", O_RDONLY) = -1 ENOENT (No such file or directory)
14:29:30 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 7
14:29:30 open("/u01/app/oracle/rdbms/mesg/oraus.msb", O_RDONLY) = 7
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
14:29:30 stat("login.sql", 0x7fff0a822b10) = -1 ENOENT (No such file or directory)
14:29:30 stat("/u01/app/oracle/sqlplus/admin/glogin.sql", {st_mode=S_IFREG|0644, st_size=368, ...}) = 0
14:29:30 access("/u01/app/oracle/sqlplus/admin/glogin.sql", F_OK) = 0
14:29:30 statfs("/u01/app/oracle/sqlplus/admin/glogin.sql", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=174170552, f_bfree=147038317, f_bavail=138190970, f_files=44236800, f_ffree=44019066, f_fsid={1821877354, 763674030}, f_namelen=255, f_frsize=4096}) = 0
14:29:30 open("/u01/app/oracle/sqlplus/admin/glogin.sql", O_RDONLY) = 7
4、跟蹤執行程式方式
除了上面介紹的方法外,strace還可以直接跟蹤監控正在執行的系統程式。下面我們來觀察一下strace對Oracle後臺CKPT程式的監控。
[oracle@XXX-Standby-Asm ~]$ ps -ef | grep ckpt
grid 15950 1 0 Jun05 ? 00:01:02 asm_ckpt_+ASM
oracle 18608 17848 0 17:35 pts/0 00:00:00 grep ckpt
oracle 21395 1 0 Jun08 ? 00:04:27 ora_ckpt_xxxstb
從root角度,跟蹤程式編號21395的後臺程式。
[root@XXX-Standby-Asm ~]# strace -t -p 21395
Process 21395 attached - interrupt to quit
17:36:42 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608271
17:36:42 semtimedop(6062085, {{16, -1, 0}}, 1, {1, 840000000}) = -1 EAGAIN (Resource temporarily unavailable)
17:36:44 getrusage(RUSAGE_SELF, {ru_utime={125, 122978}, ru_stime={142, 730301}, ...}) = 0
17:36:44 getrusage(RUSAGE_SELF, {ru_utime={125, 122978}, ru_stime={142, 730301}, ...}) = 0
17:36:44 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608455
17:36:44 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608455
17:36:44 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:44 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:44 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608455
17:36:44 semtimedop(6062085, {{16, -1, 0}}, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable) –三秒差異
17:36:47 getrusage(RUSAGE_SELF, {ru_utime={125, 123978}, ru_stime={142, 731301}, ...}) = 0
17:36:47 getrusage(RUSAGE_SELF, {ru_utime={125, 123978}, ru_stime={142, 731301}, ...}) = 0
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608755
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608755
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 read(16, "MSA\0\2\0\10\0P\0\0\0\0\0\0\0\20?P\r\0\0\0\0\0\200\330\266\177\177\0\0"..., 80) = 80
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608756
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608756
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608756
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608756
17:36:47 times({tms_utime=12512, tms_stime=14273, tms_cutime=0, tms_cstime=0}) = 645608756
17:36:47 semtimedop(6062085, {{16, -1, 0}}, 1, {3, 0}^C <unfinished ...>
Process 21395 detached
從裡面,我們可以看到ckpt的動作方式。
5、結論
Strace是Linux環境下的一個非常有用的工具。用好這個工具,可以讓我們更加接近核心細節,豐富知識能力。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/31397003/viewspace-2148563/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 除錯跟蹤利器---strace除錯
- 用strace跟蹤分析oracle 10.2.0.1 rac lmd程式系列二Oracle
- 用strace除錯程式(zt)除錯
- PHP 程式碼除錯跟蹤工具 YtracePHP除錯
- Linux命令strace跟蹤程式的系統呼叫-linux學習用什麼書Linux
- oracle 10g 程式跟蹤命令Oracle 10g
- 啟用使用者程式跟蹤
- Linux中使用GDB除錯程式Linux除錯
- 跟蹤oracle特定報錯 errorstackOracleError
- 使用Error Stack跟蹤Oracle錯誤堆疊資訊ErrorOracle
- 視覺化程式碼跟蹤除錯與STM32應用設計視覺化除錯
- JBuilder Debug時不能進入源程式檔案視窗跟蹤除錯?UI除錯
- 使用ErrorStack進行Oracle錯誤跟蹤及診斷ErrorOracle
- 使用 GDB 除錯多程式程式除錯
- Oracle 跟蹤全部使用者Oracle
- oracle 跟蹤其他使用者Oracle
- 利用strace/pstack除錯Nginx除錯Nginx
- 如何收集Oracle程式中的SQL跟蹤資訊KUOracleSQL
- Linux GDB 程式除錯工具使用詳解Linux除錯
- HP下對程式的跟蹤
- jivejdon程式碼跟蹤問題
- 動態跟蹤分析VB程式
- 【Longkin】ASP.NET應用程式跟蹤---(一)跟蹤頁面ASP.NET
- Python 程式碼除錯—使用 pdb 除錯Python除錯
- 使用 Tkprof 分析 ORACLE 跟蹤檔案Oracle
- Oracle 10046跟蹤的使用Oracle
- 使用dtrace跟蹤oracle函式呼叫Oracle函式
- ORACLE 跟蹤工具Oracle
- [zt] oracle跟蹤檔案與跟蹤事件Oracle事件
- oracle跟蹤檔案與跟蹤事件(zt)Oracle事件
- oracle跟蹤檔案和跟蹤事件(zt)Oracle事件
- 【Longkin】ASP.NET應用程式跟蹤 --- (三) 在程式碼裡訪問跟蹤資訊ASP.NET
- linux strace追蹤mysql執行語句LinuxMySql
- 轉:使用 Tkprof 分析 ORACLE 跟蹤檔案Oracle
- vscode 除錯linux程式VSCode除錯Linux
- 【TRACE】Oracle跟蹤事件Oracle事件
- Oracle跟蹤會話Oracle會話
- Oracle 跟蹤事件【轉】Oracle事件