Using Oracle OS Watcher

Posted by ysping on 2009-08-07

OSW (OS Watcher) is a toolkit provided by Oracle for monitoring the operating system. It can be downloaded from Metalink (Metalink Note 301137.1).

Platforms supported by OSW
OSW is certified to run on the following platforms:
AIX
Tru64
Solaris
HP-UX
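Linux

Installing OSW
After downloading the archive from Metalink (and transferring it to the UNIX server by FTP), extract it with tar and it is ready to use. If the file is in compress format, uncompress it first, for example: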
uncompress osw.tar.Z
tar xvf osw.tar
 
[cs1] /oracle> tar xvf osw.tar
x .
x ./osw
x ./osw/Exampleprivate.net, 1731 bytes, 4 media blocks.
x ./osw/OSWatcher.sh, 11784 bytes, 24 media blocks.
x ./osw/OSWatcherFM.sh, 4451 bytes, 9 media blocks.
x ./osw/OSWg.jar, 722088 bytes, 1411 media blocks.
x ./osw/oswnet.sh, 334 bytes, 1 media blocks.
x ./osw/oswsub.sh, 401 bytes, 1 media blocks.
x ./osw/startOSW.sh, 1101 bytes, 3 media blocks.
x ./osw/stopOSW.sh, 560 bytes, 2 media blocks.
x ./osw/tarupfiles.sh, 127 bytes, 1 media blocks.
x ./osw/topaix.sh, 409 bytes, 1 media blocks.
x ./osw/README, 4997 bytes, 10 media blocks.
x ./osw/OSWgREADME, 3426 bytes, 7 media blocks.
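
The two install cases above (a compressed osw.tar.Z or a plain osw.tar) can be folded into one small helper. This is only a sketch: the function name unpack_osw is hypothetical and not part of OSW, and it assumes uncompress is available when a .Z file is passed.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with OSW: unpack an OSW archive into the
# current directory, whether it is compressed (.tar.Z) or a plain tar file.
unpack_osw() {
    archive=$1
    case "$archive" in
        *.tar.Z) uncompress -c "$archive" | tar xf - ;;   # decompress to stdout, untar from stdin
        *.tar)   tar xf "$archive" ;;
        *)       echo "unsupported archive: $archive" >&2; return 1 ;;
    esac
}

# Usage (run from the directory that should contain ./osw):
#   unpack_osw /tmp/osw.tar.Z
```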
Uninstalling OSW

To uninstall OSW, simply delete the osw working directory.
rm -fr ./osw
 
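Configuring OSW

Once OSW is unpacked, the extracted directory already contains the scripts that start and stop it. The first time OSW runs, it automatically creates an archive subdirectory under the osw directory, and then seven subdirectories under archive.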
[cs1] /oracle/osw> cd archive
[cs1] /oracle/osw/archive> ls -ltr
total 0
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswprvtnet
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswvmstat
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswtop
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswps
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswnetstat
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswmpstat
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswiostat
[cs1] /oracle/osw/archive>
 
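To collect private network information, you must manually create an executable file named private.net in the osw directory. Use the Exampleprivate.net file shipped in the osw directory as a template. The file contains the traceroute commands used to check the RAC private networks.
For example: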
traceroute -r -F node1
traceroute -r -F node2
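
As a rough illustration of that step, the sketch below writes one traceroute line per node and marks the file executable. The function name write_private_net is hypothetical, the node names are placeholders, and the traceroute options are simply the ones shown above. The shipped Exampleprivate.net remains the authoritative template and may contain lines this sketch leaves out.

```shell
#!/bin/sh
# Hypothetical helper: generate an executable private.net with one
# traceroute command per RAC node name passed on the command line.
write_private_net() {
    target=$1
    shift
    : > "$target"                                   # create or truncate the file
    for node in "$@"; do
        echo "traceroute -r -F $node" >> "$target"
    done
    chmod +x "$target"                              # OSW expects an executable file
}

# Demo in a scratch directory so nothing in a real osw directory is touched.
demo_dir=$(mktemp -d)
write_private_net "$demo_dir/private.net" node1 node2
cat "$demo_dir/private.net"
```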
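Starting OSW

Starting OSW is simple: just run startOSW.sh. The script takes two arguments. The first sets the sampling interval in seconds, and the second sets how many hours of data to keep. By default (when no arguments are given), OSW samples every 30 seconds and keeps 48 hours of data.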
 
Data older than the retention period is cleaned up automatically by the File Manager, which is scheduled once an hour.
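
To get a feel for what a given pair of arguments means, the arithmetic can be sketched as follows. The function names are made up for illustration, and the sketch assumes one archive file per collector per hour, which is what the hourly file names in the archive suggest. Because the File Manager only runs once an hour, the amount actually on disk can be slightly higher.

```shell
#!/bin/sh
# Rough arithmetic only; these helpers are illustrative and not part of OSW.

# Number of snapshots written into one hourly file for a given interval (seconds).
snapshots_per_hour() {
    interval_seconds=$1
    echo $((3600 / interval_seconds))
}

# Approximate number of snapshots kept per collector for a retention in hours.
retained_snapshots() {
    interval_seconds=$1
    retention_hours=$2
    echo $(( (3600 / interval_seconds) * retention_hours ))
}

snapshots_per_hour 60       # startOSW.sh 60 1   -> 60
retained_snapshots 60 1     # startOSW.sh 60 1   -> 60
retained_snapshots 30 240   # startOSW.sh 30 240 -> 28800
```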
 
Here is the startup as I ran it on cs1:
[cs1] /oracle/osw> startOSW.sh 60 1
Testing for discovery of OS Utilities...

VMSTAT found on your system.
IOSTAT found on your system.
MPSTAT found on your system.
NETSTAT found on your system.

Discovery completed.

Starting OSWatcher V2.0.2 on Fri Jul 13 10:24:43 BEIST 2007
With SnapshotInterval = 60
With ArchiveInterval = 1

OSWatcher - Written by Carl Davis, Center of Expertise, Oracle Corporation

Starting Data Collection...

osw heartbeat:Fri Jul 13 10:24:43 BEIST 2007

[cs1] /oracle/osw>
 
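Stopping OSW

To stop OSW, run stopOSW.sh:

./stopOSW.sh

OSW diagnostic output

After OSW runs, the collected data is stored as ASCII text in the seven matching subdirectories under archive. Files are named as follows:
<node_name>_<OS_utility>_MM.DD.YY.HH24.dat

For example, these are the files collected in my test: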
[cs1] /oracle/osw/archive> ls -ltrR
total 0
drwxr-xr-x 2 oracle dba 256 Jul 13 10:24 oswprvtnet
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswvmstat
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswtop
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswps
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswnetstat
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswmpstat
drwxr-xr-x 2 oracle dba 256 Jul 13 11:00 oswiostat
./oswprvtnet:
total 0

./oswvmstat:
total 48
-rw-r--r-- 1 oracle dba 18199 Jul 13 10:59 cs1_vmstat_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 3024 Jul 13 11:05 cs1_vmstat_07.13.07.1100.dat

./oswtop:
total 1704
-rw-r--r-- 1 oracle dba 741040 Jul 13 10:59 cs1_top_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 122988 Jul 13 11:05 cs1_top_07.13.07.1100.dat

./oswps:
total 1696
-rw-r--r-- 1 oracle dba 739085 Jul 13 10:59 cs1_ps_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 121733 Jul 13 11:05 cs1_ps_07.13.07.1100.dat

./oswnetstat:
total 696
-rw-r--r-- 1 oracle dba 297692 Jul 13 10:59 cs1_netstat_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 49644 Jul 13 11:05 cs1_netstat_07.13.07.1100.dat

./oswmpstat:
total 208
-rw-r--r-- 1 oracle dba 82064 Jul 13 10:59 cs1_mpstat_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 13714 Jul 13 11:05 cs1_mpstat_07.13.07.1100.dat

./oswiostat:
total 920
-rw-r--r-- 1 oracle dba 393502 Jul 13 10:59 cs1_iostat_07.13.07.1000.dat
-rw-r--r-- 1 oracle dba 65568 Jul 13 11:05 cs1_iostat_07.13.07.1100.dat
[cs1] /oracle/osw/archive>
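
The naming pattern visible in the listing above can be taken apart with plain shell parameter expansion. This is a minimal sketch: parse_osw_name is a hypothetical helper, and it assumes the host name itself contains no underscore.

```shell
#!/bin/sh
# Hypothetical helper: split an OSW archive file name such as
# cs1_vmstat_07.13.07.1000.dat into host, utility and timestamp.
# Assumes the host name contains no underscore.
parse_osw_name() {
    base=${1##*/}        # drop any leading directory
    base=${base%.dat}    # drop the .dat extension
    host=${base%%_*}     # text before the first underscore
    rest=${base#*_}      # text after the first underscore
    utility=${rest%%_*}  # collector name, e.g. vmstat
    stamp=${rest#*_}     # date and hour part of the name
    echo "$host $utility $stamp"
}

parse_osw_name ./oswvmstat/cs1_vmstat_07.13.07.1000.dat
# prints: cs1 vmstat 07.13.07.1000
```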
Meaning of the fields in the OSW diagnostic output
 
 
Field     Description

oswiostat

tin       Total number of characters read by the system from all ttys
tout      Total number of characters written by the system to all ttys
kps       Amount of data transferred (read or written) to the drive, in KB per second
tps       Number of transfers per second issued to the physical disk. A transfer is an I/O request to the physical disk; multiple logical requests can be combined into a single I/O request to the disk
serv      Average response time of transactions, in milliseconds
us        Percentage of CPU cycles spent on user processes
sy        Percentage of CPU cycles spent on system processes
wt        Percentage of time the CPU or CPUs were idle while the system had an outstanding disk I/O request
id        Percentage of unused CPU cycles, that is, idle time when the CPU is doing nothing
oswmpstat

cpu       Processor ID
minf      Minor faults
mjf       Major faults
xcal      Processor cross-calls (when one CPU wakes up another by interrupting it)
intr      Interrupts
ithr      Interrupts as threads (except clock)
csw       Context switches
icsw      Involuntary context switches
migr      Thread migrations to another processor
smtx      Number of times a CPU failed to obtain a mutex
srw       Number of times a CPU failed to obtain a read/write lock on the first try
syscl     Number of system calls
usr       Percentage of CPU cycles spent on user processes
sys       Percentage of CPU cycles spent on system processes
wt        Percentage of CPU cycles spent waiting on an event
idl       Percentage of unused CPU cycles, that is, idle time when the CPU is doing nothing
oswnetstat

name      Device name of the interface
Mtu       Maximum transmission unit
Net       Network segment address
address   Network address of the device
ipkts     Input packets
Ierrs     Input errors
opkts     Output packets
Oerrs     Output errors
collis    Collisions
queue     Number in the queue
oswps

f         Flags
s         State of the process
uid       The effective user ID number of the process
pid       The process ID of the process
ppid      The process ID of the parent process
c         Processor utilization for scheduling (obsolete)
pri       The priority of the process
ni        Nice value, used in priority computation
addr      The memory address of the process
sz        The total size of the process in virtual memory, including all mapped files and devices, in pages
wchan     The address of an event for which the process is sleeping (if blank, the process is running)
stime     The starting time of the process, given in hours, minutes, and seconds
tty       The controlling terminal for the process (a ? is printed when there is no controlling terminal)
time      The cumulative execution time for the process
cmd       The name of the command the process is executing
oswtop

PID       Process ID of the process
USERNAME  User name that owns the process
THR       Process threads
PRI       Priority of the process
NICE      Nice value of the process
SIZE      Total size of the process, including code and data plus stack space, in kilobytes
RES       Amount of physical memory used by the process
STATE     Current state of the process: S for sleeping, D for uninterruptible sleep, R for running, T for stopped or traced, Z for zombie
TIME      CPU time the process has used since it started
%CPU      CPU time the process has used since the last update
COMMAND   The task's command name
oswvmstat

PROCS
r         Number of processes that are waiting to run
b         Number of processes that were in sleep mode and were interrupted since the last update
w         Number of processes that have been swapped out by the mm and vm subsystems and have yet to run

MEMORY
swap      Amount of swap space currently available
free      Size of the free list

PAGE
re        Page reclaims
mf        Minor faults
pi        Kilobytes paged in
po        Kilobytes paged out
fr        Kilobytes freed
de        Anticipated short-term memory shortfall, in kilobytes
sr        Pages scanned by the clock algorithm

DISK
Bi        Disk blocks sent to disk devices, in blocks per second

FAULTS
In        Interrupts per second, including the CPU clocks
Sy        System calls
Cs        Context switches per second within the kernel

CPU
Us        Percentage of CPU cycles spent on user processes
Sy        Percentage of CPU cycles spent on system processes
Id        Percentage of unused CPU cycles, that is, idle time when the CPU is doing nothing
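
As an example of putting these fields to use, the sketch below averages the id (idle) column of vmstat-style rows. It is an illustration under stated assumptions rather than a parser for the real archive files: the sample rows are invented, avg_idle is a hypothetical name, and the code assumes a header line that contains the label id followed by purely numeric data rows. The column position is looked up from the header because it differs between platforms.

```shell
#!/bin/sh
# Hypothetical helper: average the "id" column of vmstat-style text on stdin.
avg_idle() {
    awk '
        # Header line: remember which column carries the label "id".
        { for (i = 1; i <= NF; i++) if ($i == "id") { col = i; next } }
        # Data line: use it only if that column holds a plain number.
        col && $col ~ /^[0-9]+$/ { sum += $col; n++ }
        END { if (n) printf "%.1f\n", sum / n }
    '
}

# Invented sample in a Solaris-like layout (two data rows: id = 85 and id = 70).
printf '%s\n' \
  ' kthr      memory            page            disk          faults      cpu' \
  ' r b w   swap  free  re  mf pi po fr de sr s0 s1   in   sy   cs us sy id' \
  ' 0 0 0 5000000 2000000 1 10 0 0 0 0 0 0 0 400 900 300 10 5 85' \
  ' 1 0 0 4990000 1990000 2 12 0 0 0 0 0 1 0 420 950 310 20 10 70' | avg_idle
# prints: 77.5
```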


Graphing the collected data with OSWg

[cs1] /oracle/osw> java -jar OSWg.jar -i /oracle/osw/archive

Starting OSWg V2.0.4
OSWatcher Graph Written by Oracle Center of Expertise
Copyright (c) 2007 by Oracle Corporation

Parsing Data. Please Wait...

Parsing file cs1_vmstat_07.13.07.1000.dat ...
Parsing file cs1_vmstat_07.13.07.1100.dat ...

Parsing Completed.


Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs

Enter 6 to Generate All CPU Gif Files
Enter 7 to Generate All Memory Gif Files

Enter L to Specify Alternate Location of Gif Directory
Enter T to Specify Different Time Scale
Enter D to Return to Default Time Scale
Enter R to Remove Currently Displayed Graphs
Enter Q to Quit Program

Please Select an Option:6
OSWG_RunQueue.gif
OSWG_BlockQueue.gif
OSWG_CpuIdle.gif
OSWG_CpuSystem.gif
OSWG_CpuUser.gif
OSWG_Interrupts.gif
OSWG_CS.gif


Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs

Enter 6 to Generate All CPU Gif Files
Enter 7 to Generate All Memory Gif Files

Enter L to Specify Alternate Location of Gif Directory
Enter T to Specify Different Time Scale
Enter D to Return to Default Time Scale
Enter R to Remove Currently Displayed Graphs
Enter Q to Quit Program

Please Select an Option:q
[cs1] /oracle/osw>

To view the results, simply select the item you want from the menu above.

Notes on using OSWg
 
If there are many files to analyze, Java needs more memory; otherwise you may run into an error like this:
java.lang.OutOfMemoryError
 
In that case the Java heap has to be enlarged. The maximum heap size is set with the -Xmx option, for example:
java -jar -Xmx10M OSWg.jar -i /oracle/osw/archive
 
[cs1] /oracle/osw> java -jar -Xmx10M OSWg.jar -i /oracle/osw/archive

Starting OSWg V2.0.4
OSWatcher Graph Written by Oracle Center of Expertise
Copyright (c) 2007 by Oracle Corporation

Parsing Data. Please Wait...

Parsing file cs1_vmstat_07.13.07.1000.dat ...
Parsing file cs1_vmstat_07.13.07.1100.dat ...

Parsing Completed.


Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs

Enter 6 to Generate All CPU Gif Files
Enter 7 to Generate All Memory Gif Files

Enter L to Specify Alternate Location of Gif Directory
Enter T to Specify Different Time Scale
Enter D to Return to Default Time Scale
Enter R to Remove Currently Displayed Graphs
Enter Q to Quit Program

Please Select an Option:
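
Before picking a value for -Xmx, it can help to see how much data the archive actually holds. The sketch below only counts the .dat files and adds up their sizes. archive_size is a hypothetical helper, and the result is a sizing aid, not a prediction of how much heap OSWg will need.

```shell
#!/bin/sh
# Hypothetical helper: report the number of .dat files under a directory
# and their combined size in bytes.
archive_size() {
    dir=$1
    files=$(find "$dir" -type f -name '*.dat' | wc -l | tr -d ' ')
    bytes=$(find "$dir" -type f -name '*.dat' -exec cat {} + | wc -c | tr -d ' ')
    echo "$files files, $bytes bytes"
}

# Demo on a scratch directory; on a real system pass the archive path instead,
# for example: archive_size /oracle/osw/archive
demo=$(mktemp -d)
mkdir "$demo/oswvmstat"
printf 'sample\n' > "$demo/oswvmstat/cs1_vmstat_07.13.07.1000.dat"
archive_size "$demo"
rm -rf "$demo"
```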


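Tools: the OSW toolkit, Oracle's OS Watcher
Author: eygle

OSW is a toolkit provided by Oracle for monitoring the operating system. It can be downloaded from Metalink (Metalink Note 301137.1).
Once downloaded and unpacked, it is ready to use: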

$ tar -xvf osw.tar
x ., 0 bytes, 0 tape blocks
x ./osw, 0 bytes, 0 tape blocks
x ./osw/Exampleprivate.net, 1731 bytes, 4 tape blocks
x ./osw/OSWatcher.sh, 11784 bytes, 24 tape blocks
x ./osw/OSWatcherFM.sh, 4451 bytes, 9 tape blocks
x ./osw/OSWg.jar, 722088 bytes, 1411 tape blocks
x ./osw/oswnet.sh, 334 bytes, 1 tape blocks
x ./osw/oswsub.sh, 401 bytes, 1 tape blocks
x ./osw/startOSW.sh, 1101 bytes, 3 tape blocks
x ./osw/stopOSW.sh, 560 bytes, 2 tape blocks
x ./osw/tarupfiles.sh, 127 bytes, 1 tape blocks
x ./osw/topaix.sh, 409 bytes, 1 tape blocks
x ./osw/README, 4997 bytes, 10 tape blocks
x ./osw/OSWgREADME, 3426 bytes, 7 tape blocks

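Execution starts from startOSW.sh, which takes two arguments: the first sets the sampling interval and the second sets how long data is kept. By default, if no arguments are given, OSW samples every 30 seconds and keeps 48 hours of data.

The following run uses custom arguments: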

bash-2.03$ ./startOSW.sh 30 240
bash-2.03$
Testing for discovery of OS Utilities...

VMSTAT found on your system.
IOSTAT found on your system.
MPSTAT found on your system.
NETSTAT found on your system.
TOP found on your system.

Discovery completed.

Starting OSWatcher V2.0.2 on Tue Jul 3 14:40:21 CST 2007
With SnapshotInterval = 30
With ArchiveInterval = 240

OSWatcher - Written by Carl Davis, Center of Expertise, Oracle Corporation

Starting Data Collection...

osw heartbeat:Tue Jul 3 14:40:21 CST 2007
osw heartbeat:Tue Jul 3 14:40:51 CST 2007
osw heartbeat:Tue Jul 3 14:41:21 CST 2007

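Analyzing the collected monitoring data produces chart output.
See the OSWgREADME file for the specific settings.

The charts OSW generates are easy to read and can be used to monitor and report on how the server is running.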

-The End-
-----

From the ITPUB blog, link: http://blog.itpub.net/670493/viewspace-1025013/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
