An AIX system failure forces a reboot, after which the Oracle database instance starts automatically

Posted by mengzhaoliang on 2009-04-08

/*
*Date: 2009-04-08  Wednesday
*Environment: AIX 5.3   Oracle 10g 10.2.0.1.0
*Title: An AIX system failure forces a reboot, after which the Oracle database instance starts automatically
*/

1. Checking the database's alert_SID.log showed that the instance had started up again without ever having been shut down, which was odd.
  
Sat Apr  4 00:02:30 2009
Thread 1 advanced to log sequence 6560
  Current log# 1 seq# 6560 mem# 0: /oracle/oms/redolog/redo01.log
  Current log# 1 seq# 6560 mem# 1: /oracle/oms/mirrlog/redo01m.log
Thread 1 advanced to log sequence 6561
  Current log# 4 seq# 6561 mem# 0: /oracle/oms/redolog/redo04.log
  Current log# 4 seq# 6561 mem# 1: /oracle/oms/mirrlog/redo04m.log
Sat Apr  4 00:05:40 2009
Starting control autobackup
Sat Apr  4 00:07:40 2009
Control autobackup written to SBT_TAPE device
 comment 'API Version 2.0,MMS Version 5.3.3.0',
 media '3'
 handle 'c-2813856949-20090404-00'
Sat Apr  4 03:15:07 2009
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
IMODE=BR
ILAT =61
LICENSE_MAX_USERS = 0
SYS auditing is enabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
  processes                = 500
  sessions                 = 555
  __shared_pool_size       = 436207616
  __large_pool_size        = 16777216
  __java_pool_size         = 16777216
  __streams_pool_size      = 0
  sga_target               = 3221225472
  control_files            = /oracle/oms/102_64/dbs/cntrl/control01.ctl, /oracle/oms/oradata/sysdata/cntrl/control02.ctl, /oracle/oms/oradata/undo/cntrl/control03.ctl
  db_block_size            = 8192
  __db_cache_size          = 1660944384
  db_16k_cache_size        = 1073741824
  compatible               = 10.2.0.1.0
  log_archive_dest_1       = LOCATION=/oracle/oms/oraarch
  log_archive_format       = %t_%s_%r.dbf
  db_file_multiblock_read_count= 16
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS1
  remote_login_passwordfile= EXCLUSIVE
  audit_sys_operations     = TRUE
  db_domain                =
  dispatchers              = (PROTOCOL=TCP) (SERVICE=TESTXDB)
  session_cached_cursors   = 100
  utl_file_dir             = /oracle/oms
  job_queue_processes      = 10
  background_dump_dest     = /oracle/oms/admin/TEST/bdump
  user_dump_dest           = /oracle/oms/admin/TEST/udump
  core_dump_dest           = /oracle/oms/admin/TEST/cdump
  audit_file_dest          = /oracle/oms/admin/TEST/adump
  audit_trail              = DB
  db_name                  = TEST
  open_cursors             = 1500
  pga_aggregate_target     = 1073741824
PMON started with pid=2, OS id=495724
PSP0 started with pid=3, OS id=467102
MMAN started with pid=4, OS id=434250
DBW0 started with pid=5, OS id=389216
LGWR started with pid=6, OS id=336070
CKPT started with pid=7, OS id=385138
SMON started with pid=8, OS id=516132
RECO started with pid=9, OS id=524298
CJQ0 started with pid=10, OS id=327826
MMON started with pid=11, OS id=454662
Sat Apr  4 03:15:08 2009
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=12, OS id=512008
Sat Apr  4 03:15:08 2009
starting up 1 shared server(s) ...
Sat Apr  4 03:15:09 2009
ALTER DATABASE   MOUNT
Sat Apr  4 03:15:15 2009
Setting recovery target incarnation to 1
Sat Apr  4 03:15:15 2009
Successful mount of redo thread 1, with mount id 2852742015
Sat Apr  4 03:15:15 2009
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Apr  4 03:15:15 2009
ALTER DATABASE OPEN
Sat Apr  4 03:15:16 2009
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Sat Apr  4 03:15:16 2009
Started redo scan
Sat Apr  4 03:15:16 2009
Completed redo scan
 17845 redo blocks read, 3472 data blocks need recovery

The log records no reason for the instance startup, so the cause may be an error at the operating-system level.
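The check done by eye on the alert log can be sketched in Python: scan the log and flag every `Starting ORACLE instance` entry that has no shutdown message before it. This is only a minimal sketch, not an Oracle tool. The function name `find_unclean_startups` is hypothetical, and the `Shutting down instance` prefix is an assumption about what a clean shutdown writes to the alert log in this version. The first startup in a truncated log is also flagged, because its history is missing.

```python
import re

# Alert-log timestamp lines look like "Sat Apr  4 03:15:07 2009".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
STARTUP = "Starting ORACLE instance"
SHUTDOWN = "Shutting down instance"  # assumed marker of a clean shutdown


def find_unclean_startups(lines):
    """Return the timestamps of startups not preceded by a shutdown message."""
    unclean = []
    last_ts = None
    shutdown_seen = False
    for raw in lines:
        line = raw.rstrip("\n")
        if TS_RE.match(line):
            last_ts = line  # remember the most recent timestamp header
        elif line.startswith(SHUTDOWN):
            shutdown_seen = True
        elif line.startswith(STARTUP):
            if not shutdown_seen:
                unclean.append(last_ts)
            shutdown_seen = False  # reset for the next start/stop cycle
    return unclean


# A few lines taken from the alert log shown above.
sample = """Sat Apr  4 00:07:40 2009
Control autobackup written to SBT_TAPE device
Sat Apr  4 03:15:07 2009
Starting ORACLE instance (normal)
""".splitlines()

print(find_unclean_startups(sample))
```

For a real log, the same function could be fed the lines of alert_SID.log read from `background_dump_dest`.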


2. Checking when the AIX system was most recently rebooted
Command: last reboot
LHXXDBS01:/> last reboot
reboot    ~                                   Apr 04 03:12
reboot    ~                                   Dec 28 09:25
reboot    ~                                   Dec 19 10:56
reboot    ~                                   Nov 25 21:17


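Sure enough, the operating system was the cause: AIX rebooted at 03:12 on April 4, 2009, and at 03:15 the Oracle database instance started automatically along with the system.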
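The correlation between the two timestamps can be made explicit with a small Python sketch that computes the gap between the `last reboot` entry and the first alert-log startup line. The function name `reboot_to_startup_gap` is hypothetical. Because `last reboot` prints no year, the sketch assumes the reboot happened in the same year as the alert-log entry.

```python
from datetime import datetime


def reboot_to_startup_gap(last_reboot_field, alert_timestamp):
    """Return the time between the OS reboot and the Oracle instance start."""
    # Collapse the padded day ("Apr  4") before parsing the alert-log timestamp.
    normalized = " ".join(alert_timestamp.split())
    started = datetime.strptime(normalized, "%a %b %d %H:%M:%S %Y")
    # Assumption: the reboot is in the same year as the alert-log entry.
    rebooted = datetime.strptime(
        f"{started.year} {last_reboot_field}", "%Y %b %d %H:%M"
    )
    return started - rebooted


gap = reboot_to_startup_gap("Apr 04 03:12", "Sat Apr  4 03:15:07 2009")
print(gap)  # 0:03:07
```

A gap of a few minutes is consistent with the instance being started by the system's boot sequence rather than by hand, although the timestamps alone do not prove which mechanism started it.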


3. Errors reported by the AIX system:
LHXXDBS01:/> errpt | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
C69F5C9B   0408085909 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
12081DC6   0405030609 P S harmad         SOFTWARE PROGRAM ERROR
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
AFA89905   0404031309 I O grpsvcs        Group Services daemon started
97419D60   0404031309 I O topsvcs        Topology Services daemon started
A6DF45AA   0404031309 I O RMCdaemon      The daemon is started.
67145A39   0404031209 U S SYSDUMP        SYSTEM DUMP
F48137AC   0404031109 U O minidump       COMPRESSED MINIMAL DUMP
225E3B63   0404031109 T S PANIC          SOFTWARE PROGRAM ABNORMALLY TERMINATED
9DBCFDEE   0404031209 T O errdemon       ERROR LOGGING TURNED ON
3D32B80D   0401032209 P S topsvcs        NIM thread blocked

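TIMESTAMP: MMDDhhmmYY (month, day, hour, minute, year)
T (type): P permanent; T temporary; U unknown; I informational (permanent errors deserve attention)
C (class): H hardware; S software; O operator (informational messages); U undetermined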
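To make the legend concrete, here is a minimal Python sketch that splits one line of the `errpt` summary into its columns and decodes the MMDDhhmmYY timestamp. The names `parse_errpt_line` and `TYPE_CODES` are hypothetical, and the two-digit year is interpreted by Python's `%y` rule, which maps 09 to 2009.

```python
from datetime import datetime

# Meaning of the one-letter T (type) column.
TYPE_CODES = {"P": "permanent", "T": "temporary", "U": "unknown", "I": "informational"}


def parse_errpt_line(line):
    """Split one errpt summary line into named fields."""
    # The description may contain spaces, so split at most five times.
    ident, stamp, etype, eclass, resource, description = line.split(None, 5)
    return {
        "identifier": ident,
        "time": datetime.strptime(stamp, "%m%d%H%M%y"),  # MMDDhhmmYY
        "type": TYPE_CODES.get(etype, etype),
        "class": eclass,
        "resource": resource,
        "description": description.strip(),
    }


record = parse_errpt_line(
    "225E3B63   0404031109 T S PANIC          SOFTWARE PROGRAM ABNORMALLY TERMINATED"
)
print(record["time"], record["type"], record["resource"])
```

Decoded this way, the PANIC entry falls at 03:11 on April 4, 2009, one minute before the reboot reported by `last reboot`.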

 

#errpt -d H  lists all hardware errors
#errpt -d S  lists all software errors
#errpt -aj ERROR_ID  lists the detailed report for the given error identifier
#errpt -aj 0502f666  (example)

LHXXDBS01:/> errpt -d  S
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
C69F5C9B   0408085909 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
12081DC6   0405030609 P S harmad         SOFTWARE PROGRAM ERROR
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
3D32B80D   0405030609 P S topsvcs        NIM thread blocked
67145A39   0404031209 U S SYSDUMP        SYSTEM DUMP
225E3B63   0404031109 T S PANIC          SOFTWARE PROGRAM ABNORMALLY TERMINATED
3D32B80D   0401032209 P S topsvcs        NIM thread blocked
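When only a saved copy of the summary is available (for example, pasted into a ticket), the effect of `errpt -d S` can be approximated offline. The sketch below keeps the lines whose C column matches the requested class letters. The function name `filter_by_class` is hypothetical, and the sketch assumes the six-column summary layout shown above.

```python
def filter_by_class(report_lines, wanted="S"):
    """Keep errpt summary lines whose class column is one of the wanted letters."""
    wanted_letters = set(wanted)
    kept = []
    for line in report_lines:
        fields = line.split(None, 5)
        # Skip blank or short lines and the column header.
        if len(fields) < 6 or fields[0] == "IDENTIFIER":
            continue
        if fields[3] in wanted_letters:  # fields[3] is the C (class) column
            kept.append(line)
    return kept


# A few lines taken from the full errpt listing shown earlier.
saved_report = [
    "IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION",
    "AFA89905   0404031309 I O grpsvcs        Group Services daemon started",
    "67145A39   0404031209 U S SYSDUMP        SYSTEM DUMP",
    "F48137AC   0404031109 U O minidump       COMPRESSED MINIMAL DUMP",
    "225E3B63   0404031109 T S PANIC          SOFTWARE PROGRAM ABNORMALLY TERMINATED",
]

for line in filter_by_class(saved_report, "S"):
    print(line)
```

Passing `"HS"` instead of `"S"` would keep both hardware and software entries.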


LHXXDBS01:/> errpt -aj  67145A39
---------------------------------------------------------------------------
LABEL:          DUMP_STATS
IDENTIFIER:     67145A39

Date/Time:       Sat Apr  4 03:12:14 BEIST 2009
Sequence Number: 2023
Machine Id:      00051BF7D600
Node Id:         localhost
Class:           S
Type:            UNKN
Resource Name:   SYSDUMP

Description
SYSTEM DUMP

Probable Causes
UNEXPECTED SYSTEM HALT

User Causes
SYSTEM DUMP REQUESTED BY USER

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Failure Causes
UNEXPECTED SYSTEM HALT

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
DUMP DEVICE
/dev/lg_dumplv
DUMP SIZE
             310838272
TIME
Sat Apr  4 03:08:04 2009
DUMP TYPE (1 = PRIMARY, 2 = SECONDARY)
           1
DUMP STATUS
           0
ERROR CODE
           0
DUMP INTEGRITY
Compressed dump - Run dmpfmt with -c flag on dump after uncompressing.
FILE NAME

PROCESSOR ID 6


A dump is generally caused by a software error (888-102-207 being the exception), and the machine can usually be rebooted afterwards.

 

Source: ITPUB Blog, link: http://blog.itpub.net/12778571/viewspace-586707/. If you repost this article, please credit the source; otherwise legal action may be taken.
