Oracle 11.2.0.3.6 upgrade failure: instance cannot start

Posted by 聽海★藍心夢 on 2013-11-27
 

1.1 Instance stuck in CLEANING

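1.1.1 Symptoms

While upgrading Oracle 11.2.0.3.0 to Oracle 11.2.0.3.6, applying the PSU on node 1 went through normally with no errors. At the last step, however, when CRS was started, the instance on node 1 failed to start.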

1.1.2 Cluster status

The instance on node 1 showed state UNKNOWN, and srvctl could neither start nor stop it.

RAC02:/home/grid> crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASM_ARCH.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_CRS.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA01.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA02.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA03.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.asm
               ONLINE  ONLINE       rac01           Started
               ONLINE  ONLINE       rac02           Started
ora.gsd
               OFFLINE OFFLINE      rac01
               OFFLINE OFFLINE      rac02
ora.net1.network
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ons
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac01
ora.cvu
      1        ONLINE  ONLINE       rac01
ora.racdb.db
      1        ONLINE  UNKNOWN      rac01
      2        ONLINE  ONLINE       rac02           Open
ora.rac01.vip
      1        ONLINE  ONLINE       rac01
ora.rac02.vip
      1        ONLINE  ONLINE       rac02
ora.oc4j
      1        ONLINE  ONLINE       rac01
ora.scan1.vip
      1        ONLINE  ONLINE       rac01
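With this many rows it is easy to miss the single bad one. As a rough sketch (not part of the original troubleshooting), the `crsctl stat res -t` output can be scanned for rows whose TARGET is ONLINE while STATE is something else. The function names `parse_crsctl_table` and `unhealthy` are made up for illustration, and the parser assumes the layout shown above: unindented resource names starting with `ora.`, followed by indented status rows.

```python
import re

# One status row: optional instance number, TARGET, STATE, then whatever
# follows (SERVER and STATE_DETAILS, either of which may be blank).
ROW = re.compile(r"^\s+(?:(\d+)\s+)?([A-Z]+)\s+([A-Z]+)\s*(.*)$")


def parse_crsctl_table(text):
    """Parse `crsctl stat res -t` output into a list of row dicts."""
    rows = []
    resource = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or set(stripped) == {"-"}:
            continue  # blank lines and dashed separators
        if not line[0].isspace():
            # Unindented lines are the prompt, the column header, a section
            # title, or a resource name such as "ora.racdb.db".
            resource = stripped if stripped.startswith("ora.") else None
            continue
        match = ROW.match(line)
        if match and resource:
            number, target, state, rest = match.groups()
            rows.append({
                "resource": resource,
                "instance": int(number) if number else None,
                "target": target,
                "state": state,
                "rest": rest.strip(),
            })
    return rows


def unhealthy(rows):
    """Rows whose TARGET is ONLINE but whose STATE is anything else."""
    return [r for r in rows if r["target"] == "ONLINE" and r["state"] != "ONLINE"]
```

Fed the listing above, `unhealthy(parse_crsctl_table(output))` should return only instance 1 of `ora.racdb.db`; `ora.gsd` is skipped because its TARGET is OFFLINE as well.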

1.1.3 Stopping the instance manually

An attempt to stop the node 1 instance by hand left it stuck in the CLEANING state; it would not stop cleanly:

RAC01:/home/grid> crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASM_ARCH.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_CRS.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA01.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA02.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA03.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.asm
               ONLINE  ONLINE       rac01           Started
               ONLINE  ONLINE       rac02           Started
ora.gsd
               OFFLINE OFFLINE      rac01
               OFFLINE OFFLINE      rac02
ora.net1.network
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ons
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac02
ora.cvu
      1        ONLINE  ONLINE       rac02
ora.racdb.db
      1        ONLINE  OFFLINE                               CLEANING
      2        ONLINE  ONLINE       rac02           Open
ora.rac01.vip
      1        ONLINE  ONLINE       rac01
ora.rac02.vip
      1        ONLINE  ONLINE       rac02
ora.oc4j
      1        ONLINE  ONLINE       rac01
ora.scan1.vip
      1        ONLINE  ONLINE       rac02

1.1.4 Checking the logs

Relevant entries from crsd.log:

2013-11-27 13:22:43.287: [    AGFW][41] {1:37211:819} Starting the agent: /oracle/grid/11.2.0/grid_1/bin/oraagent with user id: oracle and incarnation:15
2013-11-27 13:22:43.287: [UiServer][47] {1:37211:819} Container [ Name: ORDER
        MESSAGE:
        TextMessage[CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac01']
        MSGTYPE:
        TextMessage[3]
        OBJID:
        TextMessage[rac01]
        WAIT:
        TextMessage[0]
]
2013-11-27 13:22:43.288: [UiServer][47] {1:37211:819} Container [ Name: ORDER
        MESSAGE:
        TextMessage[CRS-2679: Attempting to clean 'ora.racdb.db' on 'rac01']
        MSGTYPE:
        TextMessage[3]
        OBJID:
        TextMessage[ora.racdb.db 1 1]
        WAIT:
        TextMessage[0]
]
2013-11-27 13:22:43.345: [    AGFW][41] {1:37211:819} Starting the HB [Interval =  30000, misscount = 6kill allowed=1] for agent: /oracle/grid/11.2.0/grid_1/bin/oraagent_oracle
2013-11-27 13:22:43.347: [    AGFW][41] {1:37211:819} Could not forward message [RESOURCE_CLEAN[ora.racdb.db 1 1] ID 4100:3131] to agent. /oracle/grid/11.2.0/grid_1/bin/oraagent_oracle is not running
2013-11-27 13:22:43.348: [    AGFW][41] {1:37211:819} Starting of the agent: /oracle/grid/11.2.0/grid_1/bin/oraagent with user id oracle is already in progress.
2013-11-27 13:22:57.461: [    AGFW][44] {2:6722:146} Created alert : (:CRSAGF00130:) :  Failed to start the agent /oracle/grid/11.2.0/grid_1/bin/oraagent_oracle
2013-11-27 13:22:57.462: [    AGFW][44] {2:6722:146} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_CLEAN[ora.racdb.db 1 1] ID 4100:2464
2013-11-27 13:22:57.462: [    AGFW][44] {2:6722:146} Can not stop the agent: /oracle/grid/11.2.0/grid_1/bin/oraagent_oracle because pid is not initialized
2013-11-27 13:22:57.462: [   CRSPE][49] {2:6722:146} Received reply to action [Clean] message ID: 2464
2013-11-27 13:22:57.462: [   CRSPE][49] {2:6722:146} RI [ora.racdb.db 1 1] new internal state: [STABLE] old value: [CLEANING]
2013-11-27 13:22:57.462: [   CRSPE][49] {2:6722:146} Fatal Error from AGFW Proxy: Unable to start the agent process
2013-11-27 13:22:57.463: [   CRSPE][49] {2:6722:146} CRS-2680: Clean of 'ora.racdb.db' on 'rac01' failed
2013-11-27 13:22:57.465: [   CRSPE][49] {2:6722:146} Sequencer for [ora.racdb.db 1 1] has completed with error: CRS-5802: Unable to start the agent process
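The lines that matter in this excerpt are the CRS-nnnn messages and the "Failed to start the agent" alert. A small sketch for pulling just those out of a crsd.log excerpt follows; the helper names `crs_messages` and `failed_agents` are invented here, and the patterns are based only on the lines shown above, so other log formats may need adjusting.

```python
import re

# "CRS-2680: Clean of ... failed" style messages.
CRS_CODE = re.compile(r"\b(CRS-\d+): (.*)")
# The alert line that names the agent which could not be started.
AGENT_FAIL = re.compile(r"Failed to start the agent (\S+)")


def crs_messages(log_text):
    """Return (code, message) pairs for every CRS-nnnn message, in order."""
    found = []
    for line in log_text.splitlines():
        match = CRS_CODE.search(line)
        if match:
            # Messages wrapped in TextMessage[...] carry a trailing "]".
            found.append((match.group(1), match.group(2).strip().rstrip("]")))
    return found


def failed_agents(log_text):
    """Return the sorted, de-duplicated agent paths that failed to start."""
    return sorted(set(AGENT_FAIL.findall(log_text)))
```

On the excerpt above, `failed_agents` should single out `/oracle/grid/11.2.0/grid_1/bin/oraagent_oracle`, which is the clue the next step follows up on.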

     

1.1.5 Checking the oraagent processes

Since the log says oraagent_oracle could not be started, compare the oraagent processes on the two nodes.

Node 2:

RAC02:/home/grid> ps -ef | grep oraagent
    grid  3211     1  0 12:04:45 ?         0:50 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
    grid  3830     1  0 12:05:50 ?         0:24 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
  oracle  6046     1  0 12:12:42 ?         0:57 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
    grid  3204 26634  1 13:39:21 pts/4     0:00 grep oraagent

Node 1:

RAC01:/home/grid> ps -ef | grep oraagent
    grid  5077     1  0 11:46:17 ?         1:05 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
    grid  5938     1  0 11:47:41 ?         0:35 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
    grid 29140 19598  0 13:39:15 pts/2     0:00 grep oraagent
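The comparison between the two `ps -ef` listings can also be done mechanically. The sketch below counts `oraagent.bin` processes per owning user and reports which owners are short on one node. `agent_owners` and `missing_on` are illustrative names, and the code assumes the owning user is the first column of `ps -ef`, as in the listings above.

```python
from collections import Counter


def agent_owners(ps_output, binary="oraagent.bin"):
    """Count processes per owning user whose command path ends in `binary`."""
    owners = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # The "grep oraagent" line is ignored because none of its fields
        # ends with "/oraagent.bin".
        if any(f.endswith("/" + binary) for f in fields[1:]):
            owners[fields[0]] += 1
    return owners


def missing_on(node_output, reference_output):
    """Owners that run more agents on the reference node than on this node."""
    # Counter subtraction keeps only positive differences.
    return dict(agent_owners(reference_output) - agent_owners(node_output))
```

Called as `missing_on(node1_ps, node2_ps)` with the two listings above, this should report one missing agent for the `oracle` user.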

1.1.6 Starting the oraagent process manually

From the output above, node 1 is running one fewer oraagent.bin process than node 2: the one owned by oracle. So, on node 1, switch to the oracle user and try to start it by hand:

RAC01:/home/oracle> /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
sh: /oracle/grid/11.2.0/grid_1/bin/oraagent.bin: Execute permission denied.
RAC01:/home/oracle> ll /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
-rwx------   1 grid       oinstall   40262952 Nov 26 19:46 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin
RAC01:/home/oracle> id
uid=1101(oracle) gid=1000(oinstall) groups=1031(dba)
RAC01:/home/oracle> exit
logout
RAC01:/> id
uid=0(root) gid=3(sys) groups=0(root),1(other),2(bin),4(adm),5(daemon),6(mail),7(lp),20(users)
RAC01:/> chmod 777 /oracle/grid/11.2.0/grid_1/bin/oraagent.bin

The binary is owned by grid with mode -rwx------, so the oracle user (group oinstall) has no execute permission on it. As root, the permissions were opened up with chmod 777.
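To make the permission reasoning explicit, here is a small sketch that evaluates an `ls -l` mode string the way the classic Unix owner/group/other check does. `can_execute` is a hypothetical helper, it ignores ACLs and the root user, and it is only meant to illustrate why `-rwx------` locks out the oracle user.

```python
def can_execute(mode, owner, group, user, user_groups):
    """Return True if `user` gets execute permission from an `ls -l` mode string.

    Classic Unix semantics: the owner bits apply if the user owns the file,
    otherwise the group bits if the user is in the file's group, otherwise
    the "other" bits. ACLs and root's override are not modelled.
    """
    perms = mode[1:10]  # skip the leading file-type character
    if user == owner:
        return perms[2] in ("x", "s")
    if group in user_groups:
        return perms[5] in ("x", "s")
    return perms[8] in ("x", "t")
```

With the values from the session above, `can_execute("-rwx------", "grid", "oinstall", "oracle", {"oinstall", "dba"})` is false. Under the same model a group-executable mode such as `-rwxr-x---` would already let oracle run the binary, which suggests something narrower than 777 might have been enough; the original fix used 777.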

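1.1.7 Stopping the instance again

On node 1, stop the instance with srvctl before starting it again:

RAC01:/oracle/grid> srvctl stop instance -d racdb -n rac01 -f

This time the stop completed without trouble. The status check shows the node 1 instance in a normal shutdown state: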

RAC01:/oracle/grid> crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASM_ARCH.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_CRS.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA01.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA02.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA03.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.asm
               ONLINE  ONLINE       rac01           Started
               ONLINE  ONLINE       rac02           Started
ora.gsd
               OFFLINE OFFLINE      rac01
               OFFLINE OFFLINE      rac02
ora.net1.network
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ons
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac01
ora.cvu
      1        ONLINE  ONLINE       rac01
ora.racdb.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        ONLINE  ONLINE       rac02           Open
ora.rac01.vip
      1        ONLINE  ONLINE       rac01
ora.rac02.vip
      1        ONLINE  ONLINE       rac02
ora.oc4j
      1        ONLINE  ONLINE       rac01
ora.scan1.vip
      1        ONLINE  ONLINE       rac01

1.1.8 Starting the instance with srvctl

RAC01:/oracle/grid> srvctl start instance -d racdb -n rac01

The start command completed successfully.

The status check shows that the instance on node 1 is now up:

RAC01:/oracle/grid> crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASM_ARCH.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_CRS.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA01.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA02.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ASM_DATA03.dg
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.asm
               ONLINE  ONLINE       rac01           Started
               ONLINE  ONLINE       rac02           Started
ora.gsd
               OFFLINE OFFLINE      rac01
               OFFLINE OFFLINE      rac02
ora.net1.network
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
ora.ons
               ONLINE  ONLINE       rac01
               ONLINE  ONLINE       rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac01
ora.cvu
      1        ONLINE  ONLINE       rac01
ora.racdb.db
      1        ONLINE  ONLINE       rac01           Open
      2        ONLINE  ONLINE       rac02           Open
ora.rac01.vip
      1        ONLINE  ONLINE       rac01
ora.rac02.vip
      1        ONLINE  ONLINE       rac02
ora.oc4j
      1        ONLINE  ONLINE       rac01
ora.scan1.vip
      1        ONLINE  ONLINE       rac01
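If the same stop-then-start sequence has to be repeated, for instance when the other node is patched, it could be wrapped in a short script. This is only a sketch: `srvctl_instance_cmd` and `bounce_instance` are invented names, the flags are exactly the ones used in 1.1.7 and 1.1.8 (`-d`, `-n`, and `-f` on stop), and the script assumes `srvctl` is on the PATH of the user running it.

```python
import subprocess


def srvctl_instance_cmd(action, db, node, force=False):
    """Build the argv list for `srvctl stop|start instance -d DB -n NODE`."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    cmd = ["srvctl", action, "instance", "-d", db, "-n", node]
    if force and action == "stop":
        cmd.append("-f")  # force flag, used only on stop in the session above
    return cmd


def bounce_instance(db, node, run=subprocess.run):
    """Force-stop the instance, then start it; raise if either step fails."""
    run(srvctl_instance_cmd("stop", db, node, force=True), check=True)
    run(srvctl_instance_cmd("start", db, node), check=True)
```

`bounce_instance("racdb", "rac01")` would then issue the two commands shown earlier. The `run` parameter exists so the sequence can be dry-run with a stub instead of really calling `srvctl`.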

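At this point the Oracle instance on node 1 is up, the node 1 upgrade is complete, and the upgrade of node 2 can begin.

Source: ITPUB blog, link: http://blog.itpub.net/751371/viewspace-1061243/. If you repost this article, please credit the source; otherwise legal liability will be pursued.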
