Oracle Firefighting During an EMC Storage Failure (Part 2)

Posted by Kamus on 2010-08-28

The customer reported that shortly after 4 PM, access to the ** platform had suddenly become very slow.

This system serves the ** platform for the whole province and normally runs 24x7, so this fault could easily bring the application to a standstill. And because the database had no backup, the customer was worried that once the database failed, it might never come back up.

So the moment the customer found the database showing the same problem as the ** platform database had the day before, the atmosphere turned tense.


This is an Oracle 10.2.0.4 RAC system built on AIX 5.3.10, with the two nodes serving the application in load-balanced mode. Because it is a core application, the platform runs on two partitions across two IBM P595 servers, configured with 72 GB of memory and 32 POWER5+ CPUs. The customer told me that even at peak times, CPU and memory utilization stays at only about 50-60%.

After arriving at the customer site, I went straight to the database error logs.


The log entries were as follows:

zhyw2 :

Tue Aug 17 22:59:46 2010

Errors in file /opt/oracle/admin/bsp/bdump/bsp1922_j000_729190.trc:

ORA-12012: error on auto execute of job 145

ORA-12008: error in materialized view refresh path

ORA-00376: file 43 cannot be read at this time

ORA-01110: data file 43: '/dev/rlv_raw37_16g'

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457

ORA-06512: at "SYS.DBMS_IREFRESH", line 685

ORA-06512: at "SYS.DBMS_REFRESH", line 195

ORA-06512: at line 1


Tue Aug 17 21:39:46 2010

KCF: write/open error block=0xb080f online=1

file=54 /dev/rlv_raw48_16g

 error=27063 txt: 'IBM AIX RISC System/6000 Error: 47: Write-protected media

Additional information: -1

Additional information: 8192'

Automatic datafile offline due to write error on

Tue Aug 17 21:55:46 2010

Errors in file /opt/oracle/admin/bsp/udump/bsp1922_ora_406246.trc:

ORA-00603: ORACLE server session terminated by fatal error

ORA-01115: IO error reading block from file 35 (block # 923276)

ORA-01110: data file 35: '/dev/rlv_raw29_16g'

ORA-27091: unable to queue I/O

ORA-27072: File I/O error

IBM AIX RISC System/6000 Error: 5: I/O error

Additional information: 7

Additional information: 923275

Additional information: 923275

Additional information: -1

Tue Aug 17 23:47:35 2010

System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_639110.trc

Tue Aug 17 23:48:21 2010

Errors in file /opt/oracle/admin/bsp/bdump/bsp1922_m002_430378.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-00028: your session has been killed

ORA-06512: at "SYS.PRVT_HDM", line 10

ORA-06512: at "SYS.WRI$_ADV_HDM_T", line 16

ORA-06512: at "SYS.PRVT_ADVISOR", line 1535

ORA-06512: at "SYS.PRVT_ADVISOR", line 1618

ORA-06512: at "SYS.PRVT_HDM", line 106

ORA-06512: at line 1


I quickly filtered the logs on both nodes and pulled out the entries that mattered:

file 54: /dev/rlv_raw48_16g

ORA-01110: data file 35: '/dev/rlv_raw29_16g'

ORA-01110: data file 43: '/dev/rlv_raw37_16g'


So this failure, too, was caused by errors accessing database files.

With the following query I checked the current status of the relevant datafiles:

select name,status from v$datafile;

NAME                 STATUS
-------------------- --------------------
/dev/rlv_system_8g   SYSTEM
/dev/rlv_undot11_8g  ONLINE
/dev/rlv_sysaux_8g   ONLINE
/dev/rlv_user_8g     ONLINE
/dev/rlv_undot12_8g  ONLINE
/dev/rlv_raw29_16g   ONLINE
/dev/rlv_raw37_16g   RECOVER
/dev/rlv_raw48_16g   ONLINE

As shown above, the datafile /dev/rlv_raw37_16g is in RECOVER status and needs media recovery, while the other datafiles report normal status.


Next I checked the archived redo logs on both RAC nodes:

[oracle@zhyw1]$ls -l

total 135787576

-rw-r-----    1 oracle   oinstall 16350676480 Jul 29 12:16 bsp1921_1_227_713969898.arc
-rw-r-----    1 oracle   oinstall 16350670336 Aug  3 17:46 bsp1921_1_228_713969898.arc
-rw-rw----    1 oracle   oinstall 4119506432 Aug  4 21:15 bsp1921_1_229_713969898.arc
-rw-rw----    1 oracle   oinstall 16350673408 Aug 10 15:35 bsp1921_1_230_713969898.arc
-rw-rw----    1 oracle   oinstall 16350669824 Aug 14 21:45 bsp1921_1_231_713969898.arc
drwxr-xr-x    2 root     system          256 Mar 16 09:15 lost+found

[oracle@zhyw1]$cd /arch2

[oracle@zhyw1]$ls -l

total 281756560

-rw-r-----    1 oracle   oinstall 16350686720 Jul 22 09:47 bsp1922_2_221_713969898.arc
-rw-r-----    1 oracle   oinstall 16350676480 Jul 23 18:56 bsp1922_2_222_713969898.arc
-rw-r-----    1 oracle   oinstall 16350677504 Jul 28 18:11 bsp1922_2_223_713969898.arc
-rw-r-----    1 oracle   oinstall 16350675968 Aug  2 11:23 bsp1922_2_224_713969898.arc
-rw-rw----    1 oracle   oinstall 13451708416 Aug  4 18:57 bsp1922_2_225_713969898.arc
-rw-rw----    1 oracle   oinstall 16350674432 Aug  8 20:05 bsp1922_2_226_713969898.arc
-rw-rw----    1 oracle   oinstall 16350808064 Aug 11 10:49 bsp1922_2_227_713969898.arc
-rw-rw----    1 oracle   oinstall 16350674944 Aug 13 16:46 bsp1922_2_228_713969898.arc
-rw-rw----    1 oracle   oinstall 16350668288 Aug 17 09:46 bsp1922_2_229_713969898.arc
drwxr-xr-x    2 root     system          256 Mar 16 14:20 lost+found

[oracle@zhyw1]$


I quietly counted myself lucky: the archive logs looked reasonably complete, so recovering the datafile should not be a problem.
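For reference, media recovery of that file using the archives above would look roughly like the following. This is only a sketch, not the exact commands run on site; the file name comes from the ORA-01110 message, and it assumes the archived logs are readable from their listed destinations.

SQL> -- apply the required archived logs to the offlined file, then bring it back online
SQL> recover automatic datafile '/dev/rlv_raw37_16g'
SQL> alter database datafile '/dev/rlv_raw37_16g' online;
SQL> -- confirm the status changes from RECOVER to ONLINE
SQL> select name, status from v$datafile where name = '/dev/rlv_raw37_16g';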

While I was still checking the database, I suddenly noticed that the instance on node 2 was no longer healthy.


Checking the clusterware from node 2 returned the following errors:

# crsctl check crs

Failure 1 contacting CSS daemon

Cannot communicate with CRS

Cannot communicate with EVM

[oracle@zhyw2]$crs_stat -t

IOT/Abort trap


From node 1, however, crs_stat showed that the node 2 instance was already down!


# crsctl check crs

CSS appears healthy

CRS appears healthy

EVM appears healthy

# crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1
ora....W2.lsnr application    ONLINE    ONLINE    zhyw2
ora.zhyw2.gsd  application    ONLINE    OFFLINE
ora.zhyw2.ons  application    ONLINE    OFFLINE
ora.zhyw2.vip  application    ONLINE    ONLINE    zhyw2



I tried restarting the CRS processes on node 2 and watched ocssd.log, which showed the following:


[    CSSD]2010-08-18 00:00:42.061 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A3572513, No space left on device
dep: 28
[    CSSD]2010-08-18 00:00:42.489 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A1193328, No space left on device
dep: 28
[    CSSD]2010-08-18 00:00:42.544 [2572] >TRACE:   Authentication OSD error, op: scls_auth_response_prepare
 loc: mkdir
 info: failed to make dir /opt/oracle/product/10.2.0/crs/css/auth/A5267322, No space lef


The info line above caught my attention: why would it report "no space left"? Had a filesystem filled up? I immediately ran df on node 2 to check the current usage:

# df -k

Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          2097152   1465464   31%     7967     3% /
/dev/hd2          3145728   1196032   62%    42303    14% /usr
/dev/hd9var       1048576    585188   45%     7592     6% /var
/dev/hd3          5242880   3774380   29%      748     1% /tmp
/dev/hd1         20971520   9642316   55%     8164     1% /home
/proc                                      /proc
/dev/hd10opt     31457280         100%    78501    92% /opt
/dev/archlv     298188800 157264636   48%       14     1% /arch2
10.142.56.2:/arch1   298188800 230249144   23%           1% /arch1

#


So /opt had filled up! Most likely the failure caused Oracle to keep generating trace files, and those trace files ate the whole filesystem. If that was the case, node 1 probably would not hold out much longer either. I checked /opt on node 1, and sure enough it was already at 92%.
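A quick way to confirm that the dump destinations are the culprit (a hypothetical check, not taken from my notes that night; the paths follow the trace file locations seen in the alert log) is to size them directly:

# du -sk /opt/oracle/admin/bsp/bdump /opt/oracle/admin/bsp/udump | sort -n
# df -k /opt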


To keep the option of analyzing them later, I did not want to delete the Oracle trace files just yet. Instead, I used the following command to check whether rootvg had enough free space:

# lsvg rootvg

VOLUME GROUP:       rootvg                   VG IDENTIFIER:  00c450d500004c000000012795dce835
VG STATE:           active                   PP SIZE:        256 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      1092 (279552 megabytes)
MAX LVs:            256                      FREE PPs:       304 (77824 megabytes)
LVs:                10                       USED PPs:       788 (201728 megabytes)
OPEN LVs:                                 QUORUM:         1 (Disabled)
TOTAL PVs:                                VG DESCRIPTORS: 3
STALE PVs:                                STALE PPs:      0
ACTIVE PVs:                               AUTO ON:        yes
MAX PPs per VG:     32512
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable


rootvg still had about 77 GB free, so I chose to extend the /opt filesystem:

smitty jfs2 ->
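The same extension can also be done non-interactively with chfs; this is only an equivalent sketch (the actual change was made through the smitty menus), with +10G matching the growth from roughly 30 GB to 40 GB seen below:

# chfs -a size=+10G /opt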

After the extension, the capacity looked like this:

# df -k

Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          2097152   1465392   31%     7967     3% /
/dev/hd2          3145728   1196032   62%    42303    14% /usr
/dev/hd9var       1048576    585260   45%     7591     6% /var
/dev/hd3          5242880   3774380   29%      748     1% /tmp
/dev/hd1         20971520   9642312   55%     8164     1% /home
/proc                                      /proc
/dev/hd10opt     41943040  10483932   76%    78521     4% /opt
/dev/archlv     298188800 157264636   48%       14     1% /arch2
10.142.56.2:/arch1   298188800 230249144   23%           1% /arch1


Then I restarted CRS on node 2:

# crsctl start crs

Attempting to start CRS stack

The CRS stack will be started shortly


After the restart, node 2 rejoined the RAC cluster. Cluster status:

# crsctl check crs

CSS appears healthy

CRS appears healthy

EVM appears healthy


But there were still problems: the gsd and ons resources on node 2 had not come up.

# crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1
ora....W2.lsnr application    ONLINE    ONLINE    zhyw2
ora.zhyw2.gsd  application    ONLINE    OFFLINE
ora.zhyw2.ons  application    ONLINE    OFFLINE
ora.zhyw2.vip  application    ONLINE    ONLINE    zhyw2


I checked the concurrent VGs and confirmed that all the relevant volume groups had been varied on successfully:

# lsvg -o

oravg

oravg2

oravg3

oravg4

oravg5

oravg6

oravg7

archvg

rootvg


I manually restarted the CRS-managed node applications:

[oracle@zhyw2]$srvctl stop nodeapps -n zhyw2

[oracle@zhyw2]$srvctl start nodeapps -n zhyw2


From crsd.log:

2010-08-18 02:48:04.944: [  CRSRES][12435]32Start of `ora.zhyw2.gsd` on member `zhyw2` succeeded.
2010-08-18 02:48:05.151: [  CRSRES][12438]32startRunnable: setting CLI values
2010-08-18 02:48:05.157: [  CRSRES][12438]32Attempting to start `ora.zhyw2.vip` on member `zhyw2`
2010-08-18 02:48:07.181: [  CRSRES][12438]32Start of `ora.zhyw2.vip` on member `zhyw2` succeeded.
2010-08-18 02:48:07.401: [  CRSRES][12443]32startRunnable: setting CLI values
2010-08-18 02:48:07.410: [  CRSRES][12443]32Attempting to start `ora.zhyw2.ons` on member `zhyw2`
2010-08-18 02:48:08.501: [  CRSRES][12443]32Start of `ora.zhyw2.ons` on member `zhyw2` succeeded.
2010-08-18 02:48:08.509: [ COMMCRS][9523]clsc_receive: (1146c80b0) error 2

2010-08-18 02:48:08.738: [  CRSRES][12446]32startRunnable: setting CLI values
2010-08-18 02:48:08.744: [  CRSRES][12446]32Attempting to start `ora.zhyw2.LISTENER_ZHYW2.lsnr` on member `zhyw2`
2010-08-18 02:48:09.767: [  CRSRES][12446]32Start of `ora.zhyw2.LISTENER_ZHYW2.lsnr` on member `zhyw2` succeeded.

Checking the CRS status again, gsd and ons were now up, but the instance and its associated service were still OFFLINE:


[oracle@zhyw2]$crs_stat -t

Name           Type           Target    State     Host
------------------------------------------------------------
ora....921.srv application    ONLINE    ONLINE    zhyw1
ora....922.srv application    ONLINE    OFFLINE
ora....p192.cs application    ONLINE    ONLINE    zhyw1
ora....21.inst application    ONLINE    ONLINE    zhyw1
ora....22.inst application    ONLINE    OFFLINE
ora.bsp.db     application    ONLINE    ONLINE    zhyw1
ora....W1.lsnr application    ONLINE    ONLINE    zhyw1
ora.zhyw1.gsd  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.ons  application    ONLINE    ONLINE    zhyw1
ora.zhyw1.vip  application    ONLINE    ONLINE    zhyw1
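The same thing can be confirmed from the clusterware side with srvctl; this is just a sketch, with the database name bsp and instance name bsp1922 inferred from the resource names and trace file names above:

[oracle@zhyw2]$srvctl status database -d bsp
[oracle@zhyw2]$srvctl status instance -d bsp -i bsp1922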



I checked the state of the instance on node 2:


SQL> select status from v$instance;


STATUS

------------------------

STARTED


The instance was sitting in NOMOUNT state, so I tried restarting the database processes and opening zhyw2 by hand:


SQL> startup

ORA-32004: obsolete and/or deprecated parameter(s) specified

ORACLE instance started.


Total System Global Area 1.7180E+10 bytes

Fixed Size                  2114248 bytes
Variable Size            1.2063E+10 bytes
Database Buffers         5100273664 bytes
Redo Buffers               14659584 bytes


Strangely, the startup hung at that point and went no further. I checked the logs for that time window, as shown below.


From alert*.log:

Wed Aug 18 02:50:55 2010

Starting ORACLE instance (normal)

sskgpgetexecname failed to get name

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

  WARNING: No cluster interconnect has been specified. Depending on
           the communication driver configured Oracle cluster traffic
           may be directed to the public interface of this machine.
           Oracle recommends that RAC clustered databases be configured
           with a private interconnect for enhanced security and
           performance.

Picked latch-free SCN scheme 3

Autotune of undo retention is turned on.

LICENSE_MAX_USERS = 0

SYS auditing is disabled

ksdpec: called for event 13740 prior to event group initialization

Starting up ORACLE RDBMS Version: 10.2.0.4.0.

System parameters with non-default values:

  processes                = 1500
  sessions                 = 1655
  sga_max_size             = 17179869184
  __shared_pool_size       = 11995709440
  __large_pool_size        = 16777216
  __java_pool_size         = 16777216
  __streams_pool_size      = 33554432
  spfile                   = /dev/rlv_spfile_8g
  sga_target               = 17179869184
  control_files            = /dev/rlv_cnt11_512m, /dev/rlv_cnt12_512m, /dev/rlv_cnt13_512m
  db_block_size            = 8192
  __db_cache_size          = 5100273664
  compatible               = 10.2.0.3.0
  log_archive_dest_1       = LOCATION=/arch2
  log_archive_format       = bsp1922_%t_%s_%r.arc
  db_file_multiblock_read_count= 16
  cluster_database         = TRUE
  cluster_database_instances= 2

...

Reconfiguration started (old inc 0, new inc 16)

List of nodes:

 0 1
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
 * domain 0 valid = 0 according to instance 0
Wed Aug 18 02:50:58 2010
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out

Wed Aug 18 02:50:58 2010
 LMS 8: 0 GCS shadows traversed, 0 replayed
Wed Aug 18 02:50:58 2010
 Submitted all GCS remote-cache requests
 Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=31, OS id=815828
Wed Aug 18 02:50:59 2010
ALTER DATABASE   MOUNT

Wed Aug 18 02:54:33 2010

Wed Aug 18 02:54:33 2010

System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc
Wed Aug 18 02:54:59 2010
System State dumped to trace file /opt/oracle/admin/bsp/bdump/bsp1922_diag_1204652.trc


The alert log simply stopped there. It felt like a synchronization problem between the RAC nodes, so I decided to reboot database server node 2 to try to clear the fault on that node.

Reboot node 2 and bring the resources up by hand:

shutdown -Fr -> smitty clstart -> varyonvg -c oravg
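Spelled out, the sequence is roughly the following (a sketch only; smitty clstart assumes HACMP cluster services, and the concurrent varyon would be repeated for each oravg* group listed by lsvg -o):

# shutdown -Fr                 # fast reboot of node 2
# smitty clstart               # start cluster services from the menu
# varyonvg -c oravg            # vary on the shared VG in concurrent mode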


The database on node 2 still would not open. The errors were:

ALTER DATABASE   MOUNT

Wed Aug 18 03:15:35 2010

alter database mount

Wed Aug 18 03:15:35 2010

ORA-1154 signalled during: alter database mount...

^C[oracle@zhyw2]$tail -f alert*.log

 Submitted all GCS remote-cache requests
 Fix write in gcs resources

Reconfiguration complete

LCK0 started with pid=31, OS id=90800

Wed Aug 18 03:12:01 2010

ALTER DATABASE   MOUNT

Wed Aug 18 03:15:35 2010

alter database mount

Wed Aug 18 03:15:35 2010

ORA-1154 signalled during: alter database mount...


By this time Mr. Wang, the engineer responsible for the application, had arrived; he too saw the state node 2 was in and was getting anxious.

I told him: "I think this is caused by a synchronization problem between the two nodes. Under the circumstances, it is worth restarting the database on node 1 to try to resolve node 2's failure to open." Mr. Wang said that since database access had already become far too slow and was seriously affecting users, they had obtained a downtime window, so whatever needed restarting could be restarted.


First, stop the listener:

lsnrctl stop

Then kill every LOCAL=NO (remote client) process on node 1:

ps -ef |grep NO | awk '{ print $2 } ' | xargs kill -9

Then stop the database instance:

sqlplus / as sysdba -> shutdown immediate;
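For the record, the grep pattern used above matches any process whose ps output contains "NO" anywhere on the line; a slightly tighter variant (a sketch, not what was actually typed that night) would be:

ps -ef | grep 'LOCAL=NO' | grep -v grep | awk '{print $2}' | xargs kill -9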


After node 1 (zhyw1) was restarted, I watched the zhyw2 alert log and saw the database open almost immediately. I then rebooted both machines once more and, sure enough, node 2 opened successfully.

Now it was time to deal with that corrupt data block. (To be continued)

  