測試新增刪除11.2 RAC節點的時候，碰到了這個bug。

透過dbca新增例項，又透過dbca刪除例項。檢查資料庫的日誌：

SQL> select group#, thread#, sequence#, bytes, status
2 from v$log;

    GROUP#    THREAD# SEQUENCE#      BYTES STATUS
---------- ---------- ---------- ---------- ----------------
         1          1         11 524288000 INACTIVE
         2          1         12 524288000 CURRENT
         3          2          5 524288000 CURRENT
         4          2          4 524288000 INACTIVE
         7          4          1 524288000 ACTIVE
         8          4          0 524288000 UNUSED

6 rows selected.

例項3對應thread 3的日誌已經被清除，但是怎麼會多出來一個thread 4的日誌。這個thread 4是什麼時候加入到資料庫中的。

可惜在刪除了例項3後才發現這個問題，因此例項3上所有的日誌都已經沒有了。

不過dbca新增和刪除節點都是在例項1上進行的，檢查例項1對應的alert檔案：

Thu Dec 23 16:35:58 2010
CREATE SMALLFILE UNDO TABLESPACE "UNDOTBS3" DATAFILE SIZE 2048M AUTOEXTEND ON NEXT 5120K MAXSIZE 65535M BLOCKSIZE 16384
Thu Dec 23 16:36:32 2010
Completed: CREATE SMALLFILE UNDO TABLESPACE "UNDOTBS3" DATAFILE SIZE 2048M AUTOEXTEND ON NEXT 5120K MAXSIZE 65535M BLOCKSIZE 16384
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 5 SIZE 512000K,
GROUP 6 SIZE 512000K
Thu Dec 23 16:37:03 2010
Completed: ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 5 SIZE 512000K,
GROUP 6 SIZE 512000K
ALTER DATABASE ENABLE THREAD 3
Completed: ALTER DATABASE ENABLE THREAD 3
ALTER SYSTEM SET instance_number=3 SCOPE=SPFILE SID='testrac3';
ALTER SYSTEM SET thread=3 SCOPE=SPFILE SID='testrac3';
ALTER SYSTEM SET undo_tablespace='UNDOTBS3' SCOPE=SPFILE SID='testrac3';
Thu Dec 23 16:38:18 2010
Reconfiguration started (old inc 4, new inc 6)
List of instances:
1 2 3 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Thu Dec 23 16:38:18 2010
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Thu Dec 23 18:26:38 2010
Reconfiguration started (old inc 6, new inc 8)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Thu Dec 23 18:26:38 2010
* domain 0 not valid according to instance 2
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Thu Dec 23 18:26:38 2010
LMS 0: 1 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Thu Dec 23 18:27:15 2010
ALTER DATABASE DISABLE THREAD 3
Completed: ALTER DATABASE DISABLE THREAD 3
ALTER SYSTEM ARCHIVE LOG
ALTER DATABASE DROP LOGFILE GROUP 5
Deleted Oracle managed file +DATA/testrac/onlinelog/group_5.296.738520593
Deleted Oracle managed file +DATA/testrac/onlinelog/group_5.297.738520601
Completed: ALTER DATABASE DROP LOGFILE GROUP 5
ALTER DATABASE DROP LOGFILE GROUP 6
Deleted Oracle managed file +DATA/testrac/onlinelog/group_6.298.738520609
Deleted Oracle managed file +DATA/testrac/onlinelog/group_6.299.738520617
Completed: ALTER DATABASE DROP LOGFILE GROUP 6
DROP TABLESPACE UNDOTBS3 INCLUDING CONTENTS AND DATAFILES
Deleted Oracle managed file +DATA/testrac/datafile/undotbs3.295.738520559
Completed: DROP TABLESPACE UNDOTBS3 INCLUDING CONTENTS AND DATAFILES
ALTER SYSTEM RESET thread SCOPE=SPFILE SID='testrac3';
ALTER SYSTEM RESET instance_number SCOPE=SPFILE SID='testrac3';
ALTER SYSTEM RESET undo_tablespace SCOPE=SPFILE SID='testrac3';

可以看到完整的例項3新增thread 3日誌以及最後刪除例項是對應的diable thread 3的過程。

這裡沒有thread 4的資訊，檢查一些節點2上的alert檔案：

Thu Dec 23 16:37:19 2010
Redo thread 3 internally disabled at seq 1 (CKPT)
Thu Dec 23 16:37:19 2010
ARC3: Archiving disabled thread 3 sequence 1
Archived Log entry 10 added for thread 3 sequence 1 ID 0x6e4846f dest 1:
Thu Dec 23 16:38:18 2010
Reconfiguration started (old inc 4, new inc 6)
List of instances:
1 2 3 (myinst: 2)
Global Resource Directory frozen
Communication channels reestablished
Thu Dec 23 16:38:18 2010
* domain 0 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Thu Dec 23 16:38:18 2010
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Thu Dec 23 16:38:19 2010
alter database add logfile thread 4 SIZE 524288000 , SIZE 524288000
Thu Dec 23 16:38:53 2010
Completed: alter database add logfile thread 4 SIZE 524288000 , SIZE 524288000
alter database enable public thread 4
Completed: alter database enable public thread 4
Thu Dec 23 16:47:54 2010
db_recovery_file_dest_size of 65536 MB is 6.15% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Thu Dec 23 18:26:38 2010
Reconfiguration started (old inc 6, new inc 8)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Thu Dec 23 18:26:38 2010
* domain 0 valid = 0 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Thu Dec 23 18:26:38 2010
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Thu Dec 23 18:26:39 2010
Instance recovery: looking for dead threads
Submitted all GCS remote-cache requests
Instance recovery: lock domain invalid but no dead threads
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete

在例項2上果然看到了thread 4啟用以及新增日誌的資訊。

查詢v$thread可以看到詳細的資訊：

SQL> select thread#, status, enabled, instance, current_group#, sequence#
2 from v$thread;

   THREAD# STATUS ENABLED INSTANCE             CURRENT_GROUP# SEQUENCE#
---------- ------ -------- -------------------- -------------- ----------
         1 OPEN   PUBLIC   testrac1                          2         12
         2 OPEN   PUBLIC   testrac2                          3          5
         4 CLOSED PUBLIC   UNNAMED_INSTANCE_4                0          1

奇怪的是，這裡的操作並不是我執行的，而且可以確認的是，也不可能是其他人執行的，那麼只有一個可能性，就是Oracle自動執行的。

如果這個動作沒有什麼額外的深意，那麼就是bug。在metalink上查詢了一下，居然這麼新的bug也能找到：Bug 9916360: EXTRA ONLINE REDO LOG FILES GENERATED FOR NON-EXISTENT RAC THREADS。

這個bug的狀態還沒有被fixed，在11.2.0.2的bug列表中倒是包括了這個bug的BASE bug號，但是沒有進一步的說明。

其實解決導致的問題並不複雜：

SQL> alter database disable thread 4;

Database altered.

SQL> alter database drop logfile group 7;

Database altered.

SQL> alter database drop logfile group 8;

Database altered.

SQL> select group#, thread#, sequence#, bytes, status
2 from v$log;

    GROUP#    THREAD# SEQUENCE#      BYTES STATUS
---------- ---------- ---------- ---------- ----------------
         1          1         11 524288000 INACTIVE
         2         1         12 524288000 CURRENT
         3          2          5 524288000 CURRENT
         4          2          4 524288000 INACTIVE

SQL> select thread#, status, enabled, instance, current_group#, sequence#
2 from v$thread;

   THREAD# STATUS ENABLED INSTANCE             CURRENT_GROUP# SEQUENCE#
---------- ------ -------- -------------------- -------------- ----------
         1 OPEN   PUBLIC   testrac1                          2         12
         2 OPEN   PUBLIC   testrac2                          3          5

RAC環境建立額外的THREAD

相關文章