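The customer reported that shortly after 4 p.m., access to the ** platform had suddenly become very slow.
Because this system underpins use of the ** platform across the whole province, it normally runs around the clock, and a failure like this could easily bring the application to a standstill. The database also had no backup, so the customer was very worried that if it went down it would never come back up.
So when the customer's checks showed the database had the same problem as the ** platform database the day before, the atmosphere immediately became tense.

This is an Oracle 10.2.0.4 RAC application built on AIX 5.3.10, with the two nodes serving clients in load-balanced mode. Because it is a core application, the platform is deployed on two partitions across two IBM P595 machines, configured with 72 GB of memory and 32 POWER5+ CPUs. The customer told me that even at peak times, CPU and memory utilization only reach about 50%-60%.
After arriving on site, I quickly checked the database error logs.

The log information is as follows: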
zhyw2 :
Tue Aug 17 22:59:46 2010
Errors in file /opt/oracle/admin/bsp/bdump/bsp1922_j000_729190.trc:
ORA-12012: error on auto execute of job 145
ORA-12008: error in materialized view refresh path
ORA-00376: file 43 cannot be read at this time
ORA-01110: data file 43: '/dev/rlv_raw37_16g'
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457
ORA-06512: at "SYS.DBMS_IREFRESH", line 685
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1

Tue Aug 17 21:39:46 2010
KCF: write/open error block=0xb080f online=1
file=54 /dev/rlv_raw48_16g
error=27063 txt: 'IBM AIX RISC System/6000 Error: 47: Write-protected media
Additional information: -1
Additional information: 8192'
Automatic datafile offline due to write error on
Tue Aug 17 21:55:46 2010
Errors in file /opt/oracle/admin/bsp/udump/bsp1922_ora_406246.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01115: IO error reading block from file 35 (block # 923276)
ORA-01110: data file 35: '/dev/rlv_raw29_16g'
ORA-27091: unable to queue I/O
ORA-27072: File I/O error
IBM AIX RISC System/6000 Error: 5: I/O error
Additional information: 7
Additional information: 923275
Additional information: 923275
Additional information: -1
Tue Aug 17 23:47:35 2010
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_639110.trc
Tue Aug 17 23:48:21 2010
Errors in file /opt/oracle/admin/bsp/bdump/bsp1922_m002_430378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00028: your session has been killed
ORA-06512: at "SYS.PRVT_HDM", line 10
ORA-06512: at "SYS.WRI$_ADV_HDM_T", line 16
ORA-06512: at "SYS.PRVT_ADVISOR", line 1535
ORA-06512: at "SYS.PRVT_ADVISOR", line 1618
ORA-06512: at "SYS.PRVT_HDM", line 106
ORA-06512: at line 1

I quickly filtered the log entries on both nodes and picked out the ones we cared about, shown below:
file 54: /dev/rlv_raw48_16g
ORA-01110: data file 35: '/dev/rlv_raw29_16g'
ORA-01110: data file 43: '/dev/rlv_raw37_16g'
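
The filtering command itself is not shown above. As a minimal sketch, assuming the alert log text is available on standard input, the ORA-01110 lines can be reduced to a distinct list of file numbers and device names. The function name `affected_datafiles` is hypothetical, and the sketch only covers ORA-01110 entries (file 54 was reported through the KCF message instead, so it would need a separate pattern).

```shell
#!/bin/sh
# Sketch only: list the distinct datafiles named in ORA-01110 lines.
# affected_datafiles is a hypothetical helper; it reads log text on stdin.
affected_datafiles() {
    # ORA-01110 lines look like:
    #   ORA-01110: data file 35: '/dev/rlv_raw29_16g'
    grep 'ORA-01110' |
        sed -n "s/.*data file \([0-9][0-9]*\): '\(.*\)'.*/\1 \2/p" |
        sort -n -u
}

# Demo on lines copied from the log excerpt in this post.
affected_datafiles <<'EOF'
ORA-00376: file 43 cannot be read at this time
ORA-01110: data file 43: '/dev/rlv_raw37_16g'
ORA-01115: IO error reading block from file 35 (block # 923276)
ORA-01110: data file 35: '/dev/rlv_raw29_16g'
ORA-01110: data file 43: '/dev/rlv_raw37_16g'
EOF
```

On a live system the input would come from the alert log on each node, for example `cat alert*.log | affected_datafiles`.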

So this failure, too, appeared to be caused by errors accessing database files.
I checked the current status of the datafiles involved with the following query:
select name,status from v$datafile;
NAME                 STATUS
-------------------- --------------------
/dev/rlv_system_8g   SYSTEM
/dev/rlv_undot11_8g  ONLINE
/dev/rlv_sysaux_8g   ONLINE
/dev/rlv_user_8g     ONLINE
/dev/rlv_undot12_8g  ONLINE
/dev/rlv_raw29_16g   ONLINE
/dev/rlv_raw37_16g   RECOVER
/dev/rlv_raw48_16g   ONLINE
As shown above, the datafile "/dev/rlv_raw37_16g" is in RECOVER status and needs recovery.
The other datafiles show a normal status.

I then looked at the archived logs on the two RAC nodes:
[oracle@zhyw1]$ls -l
total 135787576
-rw-r-----    1 oracle   oinstall 16350676480 Jul 29 12:16 bsp1921_1_227_713969898.arc
-rw-r-----    1 oracle   oinstall 16350670336 Aug  3 17:46 bsp1921_1_228_713969898.arc
-rw-rw----    1 oracle   oinstall  4119506432 Aug  4 21:15 bsp1921_1_229_713969898.arc
-rw-rw----    1 oracle   oinstall 16350673408 Aug 10 15:35 bsp1921_1_230_713969898.arc
-rw-rw----    1 oracle   oinstall 16350669824 Aug 14 21:45 bsp1921_1_231_713969898.arc
drwxr-xr-x    2 root     system           256 Mar 16 09:15 lost+found
[oracle@zhyw1]$cd /arch2
[oracle@zhyw1]$ls -l
total 281756560
-rw-r-----    1 oracle   oinstall 16350686720 Jul 22 09:47 bsp1922_2_221_713969898.arc
-rw-r-----    1 oracle   oinstall 16350676480 Jul 23 18:56 bsp1922_2_222_713969898.arc
-rw-r-----    1 oracle   oinstall 16350677504 Jul 28 18:11 bsp1922_2_223_713969898.arc
-rw-r-----    1 oracle   oinstall 16350675968 Aug  2 11:23 bsp1922_2_224_713969898.arc
-rw-rw----    1 oracle   oinstall 13451708416 Aug  4 18:57 bsp1922_2_225_713969898.arc
-rw-rw----    1 oracle   oinstall 16350674432 Aug  8 20:05 bsp1922_2_226_713969898.arc
-rw-rw----    1 oracle   oinstall 16350808064 Aug 11 10:49 bsp1922_2_227_713969898.arc
-rw-rw----    1 oracle   oinstall 16350674944 Aug 13 16:46 bsp1922_2_228_713969898.arc
-rw-rw----    1 oracle   oinstall 16350668288 Aug 17 09:46 bsp1922_2_229_713969898.arc
drwxr-xr-x    2 root     system           256 Mar 16 14:20 lost+found
[oracle@zhyw1]$

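I was quietly relieved: the archived logs appeared to have been retained fairly completely, so recovering the datafiles should not be a problem.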
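
Eyeballing the listings works for a handful of files. As a rough sketch, the same check can be scripted by parsing the thread and sequence numbers out of the file names, which follow the `%t_%s_%r` pattern after the instance prefix. The helper name `find_seq_gaps` is hypothetical, and the check only reports holes between the lowest and highest sequence present per thread; it does not prove that the archives reach back far enough for a given recovery.

```shell
#!/bin/sh
# Sketch only: report missing archive log sequences per redo thread.
# find_seq_gaps is a hypothetical helper; it reads file names on stdin,
# expecting names such as bsp1921_1_227_713969898.arc
# (prefix _ thread _ sequence _ resetlogs id).
find_seq_gaps() {
    awk -F_ '/\.arc$/ { print $2, $3 }' |
        sort -k1,1n -k2,2n |
        awk '{
            t = $1 + 0; s = $2 + 0
            # Same thread as the previous line: print every skipped sequence.
            if (NR > 1 && t == thread)
                for (m = prev + 1; m < s; m++)
                    print "thread " t ": missing sequence " m
            thread = t; prev = s
        }'
}

# Demo with the thread 1 file names from the first listing:
# the sequences 227..231 are contiguous, so nothing is printed.
find_seq_gaps <<'EOF'
bsp1921_1_227_713969898.arc
bsp1921_1_228_713969898.arc
bsp1921_1_229_713969898.arc
bsp1921_1_230_713969898.arc
bsp1921_1_231_713969898.arc
EOF
```

Against a real directory the input would simply be `ls /arch2 | find_seq_gaps`.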
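While I was still checking the database, I suddenly noticed that the instance on node 2 was no longer in a normal state.

Checking the cluster status from node 2 returned the following errors:
# crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM
[oracle@zhyw2]$crs_stat -t
IOT/Abort trap

From node 1, however, crs_stat showed that the node 2 instance was already down!
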
# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1
ora....W2.lsnr application    ONLINE    ONLINE    zhyw2
ora.zhyw2.gsd  application    ONLINE    OFFLINE
ora.zhyw2.ons  application    ONLINE    OFFLINE
ora.zhyw2.vip  application    ONLINE    ONLINE    zhyw2
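
The resources that matter in a `crs_stat -t` listing are the ones whose target is ONLINE but whose state is OFFLINE. A small sketch of pulling just those rows out, assuming the five-column layout (Name, Type, Target, State, Host) and a hypothetical helper name `offline_resources`:

```shell
#!/bin/sh
# Sketch only: print resources that should be up but are not.
# offline_resources is a hypothetical helper; it reads `crs_stat -t`
# output on stdin (columns: Name Type Target State Host).
offline_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Demo on rows taken from the crs_stat -t output in this post.
offline_resources <<'EOF'
Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....22.inst application    ONLINE    OFFLINE
ora.zhyw2.gsd  application    ONLINE    OFFLINE
ora.zhyw2.ons  application    ONLINE    OFFLINE
ora.zhyw2.vip  application    ONLINE    ONLINE    zhyw2
EOF
```

On the cluster itself this would be run as `crs_stat -t | offline_resources`.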

On node 2 I tried restarting the CRS processes while watching ocssd.log, which showed the following:

[    CSSD]2010-08-18 00:00:42.061 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A3572513, No space left on device
 dep: 28
[    CSSD]2010-08-18 00:00:42.489 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A1193328, No space left on device
 dep: 28
[    CSSD]2010-08-18 00:00:42.544 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A5267322, No space lef

One of the info lines above caught my attention: why would it report "No space left on device"?
Could a filesystem be full? I immediately ran df to check space usage on node 2:
# df -k
Filesystem          1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4                2097152   1465464   31%     7967     3% /
/dev/hd2                3145728   1196032   62%    42303    14% /usr
/dev/hd9var             1048576    585188   45%     7592     6% /var
/dev/hd3                5242880   3774380   29%      748     1% /tmp
/dev/hd1               20971520   9642316   55%     8164     1% /home
/proc                         -         -    -         -     -  /proc
/dev/hd10opt           31457280         0  100%    78501    92% /opt
/dev/archlv           298188800 157264636   48%       14     1% /arch2
10.142.56.2:/arch1    298188800 230249144   23%        9     1% /arch1
#

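So the /opt filesystem was full! My guess was that, because of the failure, the Oracle database had been writing trace files continuously, and those trace files had filled the filesystem. If so, node 1 probably would not last much longer either. I checked /opt on node 1 and, sure enough, it had already reached 92%.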
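
A check like this is easy to automate so that a filling /opt is caught before CRS falls over. The following is a minimal sketch, assuming the AIX `df -k` column layout shown in this post (`%Used` in the fourth column, mount point last); the helper name `full_filesystems` and the 90% threshold are illustrative choices, not something from the original setup.

```shell
#!/bin/sh
# Sketch only: print mount points whose %Used is at or above a limit.
# full_filesystems is a hypothetical helper; it reads AIX-style `df -k`
# output (including the header line) on stdin.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 && $4 ~ /%$/ {
        used = $4
        sub(/%/, "", used)          # "100%" -> "100"
        if (used + 0 >= limit + 0)
            print $NF, $4           # mount point and its %Used
    }'
}

# Demo on rows from the df -k output of node 2.
full_filesystems 90 <<'EOF'
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          2097152   1465464   31%     7967     3% /
/proc                   -         -    -         -     -  /proc
/dev/hd10opt     31457280         0  100%    78501    92% /opt
/dev/archlv     298188800 157264636   48%       14     1% /arch2
EOF
```

Rows without a percentage in the fourth column, such as /proc, are skipped. On other platforms the `df` columns differ, so the field number would need adjusting.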

To keep later analysis possible, I did not want to delete Oracle's trace files for now, so I used the following command to check whether rootvg had enough free space:
# lsvg rootvg
VOLUME GROUP:       rootvg                   VG IDENTIFIER:  00c450d500004c000000012795dce835
VG STATE:           active                   PP SIZE:        256 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      1092 (279552 megabytes)
MAX LVs:            256                      FREE PPs:       304 (77824 megabytes)
LVs:                10                       USED PPs:       788 (201728 megabytes)
OPEN LVs:           9                        QUORUM:         1 (Disabled)
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         2                        AUTO ON:        yes
MAX PPs per VG:     32512
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable

As shown, 77824 MB (about 76 GB) was free, so I chose to extend the /opt filesystem:
smitty jfs2->
After the extension, the capacity was as follows:
# df -k
Filesystem          1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4                2097152   1465392   31%     7967     3% /
/dev/hd2                3145728   1196032   62%    42303    14% /usr
/dev/hd9var             1048576    585260   45%     7591     6% /var
/dev/hd3                5242880   3774380   29%      748     1% /tmp
/dev/hd1               20971520   9642312   55%     8164     1% /home
/proc                         -         -    -         -     -  /proc
/dev/hd10opt           41943040  10483932   76%    78521     4% /opt
/dev/archlv           298188800 157264636   48%       14     1% /arch2
10.142.56.2:/arch1    298188800 230249144   23%        9     1% /arch1
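
The two `df -k` listings imply how much was added: /opt went from 31457280 to 41943040 1 KB blocks, that is from 30 GB to 40 GB. The arithmetic below is a small sketch of checking that such a growth fits in the free physical partitions reported by `lsvg rootvg` (256 MB per PP, 304 free). The helper name `pps_needed` is hypothetical, and the `chfs` line in the comment is an assumed command-line counterpart of the smitty step, not something the original shows.

```shell
#!/bin/sh
# Sketch only: how many physical partitions does a filesystem growth need?
# pps_needed is a hypothetical helper: pps_needed <growth in MB> <PP size in MB>
pps_needed() {
    _grow=$1
    _pp=$2
    # Round up to whole physical partitions.
    echo $(( (_grow + _pp - 1) / _pp ))
}

old_kb=31457280     # /opt size before, from df -k (1 KB blocks)
new_kb=41943040     # /opt size after, from df -k
grow_mb=$(( (new_kb - old_kb) / 1024 ))

echo "growth: ${grow_mb} MB, PPs needed: $(pps_needed "$grow_mb" 256)"

# Assumed non-interactive equivalent of the smitty step (check the chfs
# man page on your AIX level before relying on the size suffix):
#   chfs -a size=+10G /opt
```

With the values above this prints a growth of 10240 MB and 40 PPs, well within the 304 free PPs in rootvg.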

Then I restarted CRS on node 2:
# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

After the restart, node 2 rejoined the RAC cluster.
The cluster status was as follows:
# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

However, the system still had problems: the gsd and ons processes had not come up.
# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1
ora....W2.lsnr application    ONLINE    ONLINE    zhyw2
ora.zhyw2.gsd  application    ONLINE    OFFLINE
ora.zhyw2.ons  application    ONLINE    OFFLINE
ora.zhyw2.vip  application    ONLINE    ONLINE    zhyw2

I checked the concurrent VGs and confirmed that all the relevant VGs had been varied on successfully:
# lsvg -o
oravg
oravg2
oravg3
oravg4
oravg5
oravg6
oravg7
archvg
rootvg

I then restarted the CRS node applications on node 2 by hand:
[oracle@zhyw2]$srvctl stop nodeapps -n zhyw2
[oracle@zhyw2]$srvctl start nodeapps -n zhyw2

crsd.log showed:
2010-08-18 02:48:04.944: [  CRSRES][12435]32Start of `ora.zhyw2.gsd` on member `zhyw2` succeeded.
2010-08-18 02:48:05.151: [  CRSRES][12438]32startRunnable: setting CLI values
2010-08-18 02:48:05.157: [  CRSRES][12438]32Attempting to start `ora.zhyw2.vip` on member `zhyw2`
2010-08-18 02:48:07.181: [  CRSRES][12438]32Start of `ora.zhyw2.vip` on member `zhyw2` succeeded.
2010-08-18 02:48:07.401: [  CRSRES][12443]32startRunnable: setting CLI values
2010-08-18 02:48:07.410: [  CRSRES][12443]32Attempting to start `ora.zhyw2.ons` on member `zhyw2`
2010-08-18 02:48:08.501: [  CRSRES][12443]32Start of `ora.zhyw2.ons` on member `zhyw2` succeeded.
2010-08-18 02:48:08.509: [ COMMCRS][9523]clsc_receive: (1146c80b0) error 2

2010-08-18 02:48:08.738: [  CRSRES][12446]32startRunnable: setting CLI values
2010-08-18 02:48:08.744: [  CRSRES][12446]32Attempting to start `ora.zhyw2.LISTENER_ZHYW2.lsnr` on member `zhyw2`
2010-08-18 02:48:09.767: [  CRSRES][12446]32Start of `ora.zhyw2.LISTENER_ZHYW2.lsnr` on member `zhyw2` succeeded.
Checking the CRS status again, the gsd and ons processes were now up, but the instance and its associated service were still OFFLINE:

[oracle@zhyw2]$crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1

I checked the current state of the node 2 instance:

SQL> select status from v$instance;

STATUS
------------------------
STARTED

The instance was in the NOMOUNT state, so I tried restarting it, starting the zhyw2 instance manually:

SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 1.7180E+10 bytes
Fixed Size                  2114248 bytes
Variable Size            1.2063E+10 bytes
Database Buffers         5100273664 bytes
Redo Buffers               14659584 bytes

Strangely, the startup hung at the screen above and went no further.
I looked at the log for that time window:

alert*.log:
Wed Aug 18 02:50:55 2010
Starting ORACLE instance (normal)
sskgpgetexecname failed to get name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
WARNING: No cluster interconnect has been specified. Depending on
         the communication driver configured Oracle cluster traffic
         may be directed to the public interface of this machine.
         Oracle recommends that RAC clustered databases be configured
         with a private interconnect for enhanced security and
         performance.
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.4.0.
System parameters with non-default values:
  processes                = 1500
  sessions                 = 1655
  sga_max_size             = 17179869184
  __shared_pool_size       = 11995709440
  __large_pool_size        = 16777216
  __java_pool_size         = 16777216
  __streams_pool_size      = 33554432
  spfile                   = /dev/rlv_spfile_8g
  sga_target               = 17179869184
  control_files            = /dev/rlv_cnt11_512m, /dev/rlv_cnt12_512m, /dev/rlv_cnt13_512m
  db_block_size            = 8192
  __db_cache_size          = 5100273664
  compatible               = 10.2.0.3.0
  log_archive_dest_1       = LOCATION=/arch2
  log_archive_format       = bsp1922_%t_%s_%r.arc
  db_file_multiblock_read_count= 16
  cluster_database         = TRUE
  cluster_database_instances= 2
......
Reconfiguration started (old inc 0, new inc 16)
List of nodes:
 0 1
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
* domain 0 valid = 0 according to instance 0
Wed Aug 18 02:50:58 2010
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out

Wed Aug 18 02:50:58 2010
 LMS 8: 0 GCS shadows traversed, 0 replayed
Wed Aug 18 02:50:58 2010
 Submitted all GCS remote-cache requests
 Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=31, OS id=815828
Wed Aug 18 02:50:59 2010
ALTER DATABASE MOUNT
Wed Aug 18 02:54:33 2010
Wed Aug 18 02:54:33 2010
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc
Wed Aug 18 02:54:59 2010
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc

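The log stopped there.
It looked like a synchronization problem between the RAC nodes, so I decided to reboot database server node 2 to try to clear the fault on that node.
I rebooted node 2 and started the resources manually:
shutdown -Fr -> smitty clstart -> varyonvg -c oravg

The database on node 2 still could not be opened normally. The error messages were: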
ALTER DATABASE MOUNT
Wed Aug 18 03:15:35 2010
alter database mount
Wed Aug 18 03:15:35 2010
ORA-1154 signalled during: alter database mount...
^C[oracle@zhyw2]$tail -f alert*.log
 Submitted all GCS remote-cache requests
 Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=31, OS id=90800
Wed Aug 18 03:12:01 2010
ALTER DATABASE MOUNT
Wed Aug 18 03:15:35 2010
alter database mount
Wed Aug 18 03:15:35 2010
ORA-1154 signalled during: alter database mount...

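By then Engineer Wang, who is responsible for the application, had also arrived. He too had noticed the state of node 2 and was rather tense.
I told him: "I think this is caused by a synchronization problem between the two nodes. In this situation we need to restart the database on node 1 to try to resolve node 2's failure to open."
He said that since database access was already so slow that it was seriously affecting use, they had obtained a downtime window. Whatever needed restarting could be restarted.
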
First, stop the listener:
lsnrctl stop
Then kill all LOCAL=NO processes on node 1:
ps -ef | grep 'LOCAL=NO' | grep -v grep | awk '{ print $2 }' | xargs kill -9
Then stop the database instance:
sqlplus "/ as sysdba" -> shutdown immediate;
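
Because `kill -9` is unforgiving, it can be worth listing the candidate PIDs before killing anything. A minimal sketch, assuming the Oracle dedicated server processes appear in `ps -ef` with the usual `(LOCAL=NO)` suffix; the helper name `local_no_pids` is hypothetical:

```shell
#!/bin/sh
# Sketch only: print the PIDs of remote client server processes.
# local_no_pids is a hypothetical helper; it reads `ps -ef` output on stdin
# and matches the literal text "(LOCAL=NO)" in the command column.
local_no_pids() {
    awk '/\(LOCAL=NO\)/ { print $2 }'
}

# Demo on made-up ps -ef rows (PIDs and times are illustrative only).
local_no_pids <<'EOF'
     UID     PID    PPID   C    STIME    TTY  TIME CMD
  oracle  123456       1   0   Aug 17      -  0:05 oraclebsp1921 (LOCAL=NO)
  oracle  234567       1   0   Aug 17      -  0:01 ora_pmon_bsp1921
  oracle  345678  111111   0 02:50:55  pts/1  0:00 grep NO
EOF
```

Reviewing the output of `ps -ef | local_no_pids` first, and only then piping it to `xargs kill -9`, keeps background processes such as pmon and unrelated commands that happen to contain "NO" out of the kill list.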
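
After node 1 (zhyw1) was restarted, I watched the zhyw2 log and saw the database open very quickly.
I then rebooted both servers once more and, sure enough, node 2 opened successfully.
Now it was time to deal with the bad data block. (To be continued)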