AIX RAC 9i: heartbeat (interconnect) cable disconnection test

Posted by westzq1984 on 2009-05-11

SQL> select * from gv$instance;

   INST_ID INSTANCE_NUMBER INSTANCE_NAME    HOST_ VERSION           STARTUP_T STATUS       PAR    THREAD# ARCHIVE LOG_SWITCH_
---------- --------------- ---------------- ----- ----------------- --------- ------------ --- ---------- ------- -----------
LOGINS     SHU DATABASE_STATUS   INSTANCE_ROLE      ACTIVE_ST
---------- --- ----------------- ------------------ ---------
         1               1 rac1             P61A  9.2.0.8.0         11-MAY-09 OPEN         YES          1 STOPPED
ALLOWED    NO  ACTIVE            PRIMARY_INSTANCE   NORMAL

         2               2 rac2             P61B  9.2.0.8.0         11-MAY-09 OPEN         YES          2 STOPPED
ALLOWED    NO  ACTIVE            PRIMARY_INSTANCE   NORMAL


SQL> select inst_id,open_mode from gv$database;

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE
         2 READ WRITE

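Unplug the heartbeat cable of instance 2 (P61B).

All nodes hang: queries on instance 1 return nothing, logging in from a new window stalls, and logging in to instance 2 stalls as well.

Node 1 log (P61A)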

Mon May 11 21:54:30 2009
IPC Send timeout detected. Sender ospid 250018
Mon May 11 21:54:31 2009
IPC Send timeout detected. Sender ospid 348268
Mon May 11 21:55:02 2009
Communications reconfiguration: instance 1
Evicting instance 2 from cluster
Mon May 11 21:55:29 2009
Waiting for instances to leave:
2
Mon May 11 21:55:33 2009
Trace dumping is performing id=[cdmp_20090511215503]
Mon May 11 21:55:39 2009
Reconfiguration started (old inc 2, new inc 4)
List of nodes:
 0
 Nested/batched reconfiguration detected.
 Global Resource Directory frozen
one node partition
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 Resources and enqueues cleaned out
 Resources remastered 605
 601 GCS shadows traversed, 0 cancelled, 0 closed
 200 GCS resources traversed, 0 cancelled
 set master node info
 Submitted all remote-enqueue requests
 Update rdomain variables
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 601 GCS shadows traversed, 0 replayed, 0 unopened
 Submitted all GCS remote-cache requests
 0 write requests issued in 601 GCS resources
 0 PIs marked suspect, 0 flush PI msgs
Mon May 11 21:55:39 2009
Reconfiguration complete
 Post SMON to start 1st pass IR
Mon May 11 21:55:39 2009
Instance recovery: looking for dead threads
Mon May 11 21:55:39 2009
Beginning instance recovery of 1 threads
Mon May 11 21:55:39 2009
Started redo scan
Mon May 11 21:55:39 2009
Completed redo scan
 182 redo blocks read, 32 data blocks need recovery
Mon May 11 21:55:39 2009
Started recovery at
 Thread 2: logseq 5, block 3, scn 0.0
Mon May 11 21:55:39 2009
Recovery of Online Redo Log: Thread 2 Group 3 Seq 5 Reading mem 0
  Mem# 0 errs 0: /dev/rtrac_redo2_11
Mon May 11 21:55:40 2009
Completed redo application
Mon May 11 21:55:40 2009
Ended recovery at
 Thread 2: logseq 5, block 185, scn 0.209203
 8 data blocks read, 32 data blocks written, 182 redo blocks read
Ending instance recovery of 1 threads
SMON: about to recover undo segment 11
SMON: mark undo segment 11 as available
SMON: about to recover undo segment 12
SMON: mark undo segment 12 as available
SMON: about to recover undo segment 13
SMON: mark undo segment 13 as available
SMON: about to recover undo segment 14
SMON: mark undo segment 14 as available
SMON: about to recover undo segment 15
SMON: mark undo segment 15 as available
SMON: about to recover undo segment 16
SMON: mark undo segment 16 as available
SMON: about to recover undo segment 17
SMON: mark undo segment 17 as available
SMON: about to recover undo segment 18
SMON: mark undo segment 18 as available
SMON: about to recover undo segment 19
SMON: mark undo segment 19 as available
SMON: about to recover undo segment 20
SMON: mark undo segment 20 as available

Node 2 log (P61B)
Mon May 11 21:54:35 2009
IPC Send timeout detected. Sender ospid 299166
Mon May 11 21:55:07 2009
Communications reconfiguration: instance 0
IPC Send timeout detected. Sender ospid 393310
Mon May 11 21:55:37 2009
Trace dumping is performing id=[cdmp_20090511215507]
Mon May 11 21:55:43 2009
Errors in file /u01/app/oracle/admin/rac/bdump/rac2_lmon_393310.trc:
ORA-29740: evicted by member 0, group incarnation 3
Mon May 11 21:55:43 2009
LMON: terminating instance due to error 29740
Instance terminated by LMON, pid = 393310

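From the moment the node network failure was detected to the completion of takeover took about 70 seconds (on node 1, the first IPC send timeout is logged at 21:54:30 and instance recovery ends at 21:55:40). Node B was evicted from the cluster. However, from unplugging the network cable to the failure being detected took about 6 minutes.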
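The roughly 70-second figure can be checked against the alert-log timestamps quoted above. The following is a minimal sketch in Python, not part of the original test. The function name `elapsed_seconds` and the constant names are illustrative assumptions, and the timestamps are copied by hand from the node 1 and node 2 logs.

```python
from datetime import datetime

# Timestamp layout used by the alert-log lines quoted in this post,
# e.g. "Mon May 11 21:54:30 2009". Assumes English day/month names.
ALERT_TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# Copied by hand from the node 1 (P61A) log.
NODE1_FIRST_IPC_TIMEOUT = "Mon May 11 21:54:30 2009"  # first "IPC Send timeout detected"
NODE1_RECOVERY_ENDED = "Mon May 11 21:55:40 2009"     # "Ended recovery at"

# Copied by hand from the node 2 (P61B) log.
NODE2_FIRST_IPC_TIMEOUT = "Mon May 11 21:54:35 2009"  # first "IPC Send timeout detected"
NODE2_TERMINATED = "Mon May 11 21:55:43 2009"         # "LMON: terminating instance"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the seconds between two alert-log timestamps (hypothetical helper)."""
    t0 = datetime.strptime(start, ALERT_TS_FORMAT)
    t1 = datetime.strptime(end, ALERT_TS_FORMAT)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    # Node 1: detection of the failure to end of instance recovery.
    print("node 1:", elapsed_seconds(NODE1_FIRST_IPC_TIMEOUT, NODE1_RECOVERY_ENDED))
    # Node 2: first IPC timeout to instance termination by LMON.
    print("node 2:", elapsed_seconds(NODE2_FIRST_IPC_TIMEOUT, NODE2_TERMINATED))
```

The logs only cover the interval after the first IPC send timeout, so this check says nothing about the 6-minute delay between unplugging the cable and detection, which comes from the author's own observation.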


Queries on node 1 (P61A)
SQL> /
select inst_id,open_mode from gv$database
                              *
ERROR at line 1:
ORA-12805: parallel query server died unexpectedly


SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> /
select inst_id,open_mode from gv$database
*
ERROR at line 1:
ORA-12805: parallel query server died unexpectedly


SQL> /

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE

SQL> /

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE

Source: "ITPUB Blog", link: http://blog.itpub.net/8242091/viewspace-594777/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
