突然停電,資料庫不能啟動

wmlm發表於2008-07-23

-- XXXX庫恢復一例

-- XXXX庫,說昨天停電了,伺服器啟來後,資料庫不能啟動
-- 紅旗LINUX, ORACLE 9I RAC, OCFS

-- 檢視了一下,告警日誌中提示如下:
Tue Jul 22 14:33:39 2008
ARC1: Archival stopped
Tue Jul 22 14:33:43 2008
ARCH: Archiving is disabled
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Tue Jul 22 14:34:55 2008
Starting ORACLE instance (normal)
Tue Jul 22 14:34:55 2008
Global Enqueue Service Resources = 10946, pool = 4
Tue Jul 22 14:34:55 2008
Global Enqueue Service Enqueues = 16733
Tue Jul 22 14:34:55 2008
WARNING: EINVAL creating segment of size 0x00000000860c5000
fix shm parameters in /etc/system or equivalent
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 1
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.6.0.
System parameters with non-default values:
processes = 500
timed_statistics = TRUE
shared_pool_size = 838860800
sga_max_size = 2232914336
large_pool_size = 16777216
java_pool_size = 0
spfile = /oradata/pdsdb/spfilepdsdb.ora
control_files = /oradata/pdsdb/control01.ctl, /oradata/pdsdb/control02.ctl,

/oradata/pdsdb/control03.ctl
db_block_size = 8192
db_cache_size = 1056964608
compatible = 9.2.0.0.0
log_archive_start = TRUE
log_archive_dest_1 = LOCATION=/arch1
log_archive_format = %t_%s.dbf
db_file_multiblock_read_count= 16
cluster_database = TRUE
cluster_database_instances= 2
thread = 1
fast_start_mttr_target = 300
instance_number = 1
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain =
instance_name = pdsdb1
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = /opt/oracle/admin/pdsdb/bdump
user_dump_dest = /opt/oracle/admin/pdsdb/udump
core_dump_dest = /opt/oracle/admin/pdsdb/cdump
sort_area_size = 16777216
db_name = pdsdb
open_cursors = 300
star_transformation_enabled= FALSE
query_rewrite_enabled = FALSE
pga_aggregate_target = 524288000
aq_tm_processes = 1
Tue Jul 22 14:34:55 2008
cluster interconnect IPC version:Oracle UDP/IP
IPC Vendor 1 proto 2 Version 1.0
PMON started with pid=2
DIAG started with pid=3
LMON started with pid=4
LMD0 started with pid=5
LMS0 started with pid=6
LMS1 started with pid=7
DBW0 started with pid=8
LGWR started with pid=9
CKPT started with pid=10
SMON started with pid=11
RECO started with pid=12
CJQ0 started with pid=13
QMN0 started with pid=14
Tue Jul 22 14:34:59 2008
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=15
ARC0: Archival started
ARC1 started with pid=16
Tue Jul 22 14:34:59 2008
ARCH: STARTING ARCH PROCESSES COMPLETE
Tue Jul 22 14:34:59 2008
ARC1: Archival started
Tue Jul 22 14:34:59 2008
Tue Jul 22 14:34:59 2008
ARC1: Thread not mounted
ARC0: Thread not mounted
Tue Jul 22 14:34:59 2008
ALTER DATABASE MOUNT
Tue Jul 22 14:34:59 2008
lmon registered with NM - instance id 1 (internal mem no 0)
Tue Jul 22 14:34:59 2008
Reconfiguration started (old inc 0, new inc 1)
List of nodes:
0
Global Resource Directory frozen
one node partition
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 0
0 GCS shadows traversed, 0 cancelled, 0 closed
0 GCS resources traversed, 0 cancelled
set master node info
Submitted all remote-enqueue requests
Update rdomain variables
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
0 GCS shadows traversed, 0 replayed, 0 unopened
Submitted all GCS remote-cache requests
0 write requests issued in 0 GCS resources
0 PIs marked suspect, 0 flush PI msgs
Tue Jul 22 14:35:00 2008
Reconfiguration complete
Post SMON to start 1st pass IR
Tue Jul 22 14:35:00 2008
This instance was first to mount
LCK0 started with pid=18
***
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Fractured block found during control file header read
Data in bad block -
type: 21 format: 2 rdba: 0x00000001
last change scn: 0xffff.000048b1 seq: 0x1 flg: 0x04
consistency value in tail: 0x48ae1501
check value in block header: 0x5a3, computed block checksum: 0x1f
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

[@more@]***
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Fractured block found during control file header read
Data in bad block -
type: 21 format: 2 rdba: 0x00000001
last change scn: 0xffff.000048b1 seq: 0x1 flg: 0x04
consistency value in tail: 0x48ae1501
check value in block header: 0x5a3, computed block checksum: 0x1f
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
***
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Fractured block found during control file header read
Data in bad block -
type: 21 format: 2 rdba: 0x00000001
last change scn: 0xffff.000048b1 seq: 0x1 flg: 0x04
consistency value in tail: 0x48ae1501
check value in block header: 0x5a3, computed block checksum: 0x1f
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Tue Jul 22 14:35:02 2008
Errors in file /opt/oracle/admin/pdsdb/udump/pdsdb1_ora_7578.trc:
ORA-00202: controlfile: '/oradata/pdsdb/control01.ctl'
ORA-227 signalled during: ALTER DATABASE MOUNT...
Tue Jul 22 14:41:02 2008
Restarting dead background process QMN0
QMN0 started with pid=14
Tue Jul 22 14:47:05 2008
Restarting dead background process QMN0
QMN0 started with pid=14
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Tue Jul 22 14:52:45 2008
ALTER DATABASE CLOSE NORMAL
ORA-1507 signalled during: ALTER DATABASE CLOSE NORMAL...
ARCH: Archiving is disabled
Shutting down archive processes
Archiving is disabled

-- 什麼原因呢?壞塊?
-- 檢查一下環境ocm沒有啟,難道? 於是乎lsmod ocm start gsdctl stat ...... 結果呢?濤聲依舊

-- 重建控制檔案。有備份嗎?沒有。全文搜尋一下,也沒有。有管理員嗎?連個備份也沒有!
-- 手工重建吧
backup ctl files to ctl_bak directory,
ls -l /oradata/pdsdb/。。。。。。

-- 重建1 使用noresetlogs archivelog,結果OPEN時報錯
-- 重建2 使用resetlogs archivelog,結果同上 . 咋回事? 匯出個PFILE看看,於是
cp initpdsdb1.ora initpdsdb1.ora.bak
create pfile from spfile;
還是沒有發現問題:<
-- 重建3 檢視日誌,如下:
Creating archive destination LOG_ARCHIVE_DEST_1: '/arch1/1_2852.dbf'
Tue Jul 22 15:43:40 2008
Errors in file /opt/oracle/admin/pdsdb/udump/pdsdb1_ora_8010.trc:
ORA-00600: internal error code, arguments: [25012], [1], [2], [], [], [], [], []
Tue Jul 22 15:43:41 2008
Errors in file /opt/oracle/admin/pdsdb/udump/pdsdb1_ora_8010.trc:
ORA-00600: internal error code, arguments: [25012], [1], [2], [], [], [], [], []
Tue Jul 22 15:43:41 2008
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 8010
ORA-1092 signalled during: alter database open...
同事上網G一把,說是找不到UNDO,NND,難道create controlfile 時沒有加UNDO?
看看指令碼檔案,果然沒加,汗......
加上重來,....除了網路狀況差,其它沒有遇到什麼大的障礙,DB1順利啟動.

這麼一來,去DB2上也啟一把.結果說另外一例項是EXCLUSIVE,不能啟動共享伺服器! ?
回頭看看,剛才改了初始化引數,恢復過來,SHUTDOWN DB1 然後重來,OK.

小結:正確判斷,細心處理

下面是一些參考資料
create controlfile reuse database pdsdb noresetlogs archivelog
maxlogfiles 5
maxlogmembers 3
maxloghistory 226
maxdatafiles 100
maxinstances 6
logfile
group 1 ('/oradata/pdsdb/redo01_1.log','/oradata/pdsdb/redolog01_2.log') size 100M,
group 2 ('/oradata/pdsdb/redo02_1.log','/oradata/pdsdb/redolog02_2.log') size 100M,
group 3 ('/oradata/pdsdb/redo03_1.log','/oradata/pdsdb/redolog03_2.log') size 100M,
group 4 ('/oradata/pdsdb/redolog04_1.log','/oradata/pdsdb/redo04_2.log') size 100M
datafile '/oradata/pdsdb/indx01.dbf' size 500m,
'/oradata/pdsdb/system01.dbf' size 2000m,
'/oradata/pdsdb/tools01.dbf' size 500m,
'/oradata/pdsdb/users01.dbf' size 8196m,
'/oradata/pdsdb/users02.dbf' size 8196m,
'/oradata/pdsdb/users03.dbf' size 8196m,
'/oradata/pdsdb/users04.dbf' size 8192m,
'/oradata/pdsdb/users05.dbf' size 8192m,
'/oradata/pdsdb/users06.dbf' size 8192m,
'/oradata/pdsdb/users07.dbf' size 8196m,
'/oradata/pdsdb/users08.dbf' size 8196m,
'/oradata/pdsdb/users09.dbf' size 8196m,
'/oradata/pdsdb/users10.dbf' size 8196m,
'/oradata/pdsdb/users11.dbf' size 8196m,
'/oradata/pdsdb/users12.dbf' size 8196m,
'/oradata/pdsdb/users13.dbf' size 8196m,
'/oradata/pdsdb/users14.dbf' size 8196m,
'/oradata/pdsdb/users15.dbf' size 8196m,
'/oradata/pdsdb/users16.dbf' size 8196m,
'/oradata/pdsdb/users17.dbf' size 8196m,
'/oradata/pdsdb/users18.dbf' size 8196m,
'/oradata/pdsdb/users19.dbf' size 8196m,
'/oradata/pdsdb/users20.dbf' size 8196m,
'/oradata/pdsdb/undotbs01.dbf' ,
'/oradata/pdsdb/undotbs02.dbf'
character set zhs16gbk
;


recover database;
alter system archive log all;
alter database open ;

alter database temp add tepfile '/oradata/pdsdb/temp01.dbf' size reuse autoextend off;

-- 不能open
Creating archive destination LOG_ARCHIVE_DEST_1: '/arch1/1_2852.dbf'
Tue Jul 22 15:43:40 2008
Errors in file /opt/oracle/admin/pdsdb/udump/pdsdb1_ora_8010.trc:
ORA-00600: internal error code, arguments: [25012], [1], [2], [], [], [], [], []
Tue Jul 22 15:43:41 2008
Errors in file /opt/oracle/admin/pdsdb/udump/pdsdb1_ora_8010.trc:
ORA-00600: internal error code, arguments: [25012], [1], [2], [], [], [], [], []
Tue Jul 22 15:43:41 2008
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 8010
ORA-1092 signalled during: alter database open...

-- db2 can't startup
SQL> startup
ORACLE instance started.

Total System Global Area 2232914336 bytes
Fixed Size 745888 bytes
Variable Size 1174405120 bytes
Database Buffers 1056964608 bytes
Redo Buffers 798720 bytes
ORA-01183: cannot mount database in SHARED mode

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/271063/viewspace-1007688/,如需轉載,請註明出處,否則將追究法律責任。

相關文章