很久以前某次銀行生產環境環境data gurad新增表空間資料檔案故障(UNNAMED00011)

long_small發表於2020-06-22

一 環境:  data gurad 主備機之間使用物理standby,資料庫版本 11.2.0.1.0,aix 平臺



二 故障描述:主庫使用以下語句新增表空間的資料檔案後,備庫沒有同步建立對應資料檔案:

        ALTER TABLESPACE OCR ADD DATAFILE '/orcldata/ocr7.dbf' SIZE 30000M 

             備庫查詢日誌使用情況:SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#,

         發現日誌從80929到80947都沒有應用

            檢查程式:SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY; 結果結果無MRP0程式

            檢查備庫是否有沒有傳遞過來的日誌:SELECT * FROM V$ARCHIVE_GAP; 結果為no rows 即所有主庫產生的日誌 

          都傳遞到備庫, 在主備庫之間使用archive log list 和在歸檔目錄下ls -l檢查兩邊日誌都一樣

            select  DEST_ID,DEST_NAME,error from v$archive_dest;--無結果

      以上情況說明:歸檔傳輸沒有問題,在備庫端無法應用日誌。

三 解決過程:


   檢查引數

   主庫:show parameter standby_file_management  

    standby_file_management=auto


   備庫:show parameter standby_file_management  


    standby_file_management=manual


   修改備庫引數standby_file_management為auto,重啟備庫。

   vi  /orcldata/initorcl.int

    standby_file_management=auto

   startup pfile='/orcldata/initorcl.int' mount;

  試圖重新啟動日誌應用:

   alter database recover managed standby database disconnect from session;

   執行成功:database altered

   但是在檢查程式,和日誌應用情況,發現依然沒有應用

   SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

   ---APPLIED為no的日誌依然從80929開始。


   SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;--  無MRP0程式

   

  啟動資料庫到open狀態,alter database open,報錯如下:


Beginning standby crash recovery.

Serial Media Recovery started

Managed Standby Recovery starting Real Time Apply

Standby crash recovery aborted due to error 1111.

Errors in file /oracle/diag/rdbms/orcl2/orcl/trace/orcl_ora_16515082.t

ORA-01111: name for data file 11 is unknown - rename to correct file

ORA-01110: data file 11: '/oracle/db/dbs/UNNAMED00011'

ORA-01157: cannot identify/lock data file 11 - see DBWR trace file

ORA-01111: name for data file 11 is unknown - rename to correct file

ORA-01110: data file 11: '/oracle/db/dbs/UNNAMED00011'

Completed standby crash recovery.




發現出現異常檔案:/oracle/db/dbs/UNNAMED00011,ls該檔案又沒有結果

$ ls -l /oracle/db/dbs/UNNAMED00011

ls: 0653-341 The file /oracle/db/dbs/UNNAMED00011 does not exist.


根據報錯:ORA-01111提示,在主庫檢查資料檔案

select name,file# from v$datafile;

發現11號檔案對應的就是剛才新增的資料檔案:/orcldata/ocr7.dbf



在備庫同樣檢查:select name,file# from v$datafile;


發現11好檔案對應的是:/oracle/db/dbs/UNNAMED00011


由此判斷:主庫新增的資料檔案/orcldata/ocr7.dbf在備庫被應用錯誤,產生了/oracle/db/dbs/UNNAMED00011檔案。


根據以上情況,初步判斷主庫無法應用日誌是因為在備庫沒有產生正確的檔名,檢視透過rename檔名,修改正確,

 重新啟動備庫日誌應用應該沒有問題,詳見以下步驟:

alter system set STANDBY_FILE_MANAGEMENT=AUTO;


rename 不正常的檔案/oracle/db/dbs/UNNAMED00011到正確的檔案:

ALTER DATABASE RENAME FILE '/oracle/db/dbs/UNNAMED00011'  TO '/orcldata/ocr7.dbf';

執行不成功:oracle提示無法找到/orcldata/ocr7.dbf,這是由於系統上沒有/orcldata/ocr7.dbf檔案。

在作業系統touch一個檔案 /orcldata/ocr7.dbf


再次執行ALTER DATABASE RENAME FILE '/oracle/db/dbs/UNNAMED00011'  TO '/orcldata/ocr7.dbf';


修改成功,再次嘗試開始啟動日誌應用:


alter database recover managed standby database disconnect from session;

   執行不成功:提示11號資料檔案需要恢復。

   但是在檢查程式,和日誌應用情況,發現依然沒有應用

   SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

  ---APPLIED為no的日誌依然從80929開始。

   SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;--  無MRP0程式

   

至此,陷入僵局,根據提示使用以下命令恢復資料庫 recover database,recover datafile 11都失敗。


後根據錯誤號查詢metlink,根據oracle官網提示,出現這種情況的解決方案是:當備庫日誌應用出現異常檔案

 /oracle/db/dbs/UNNAMED00011


應該透過以下方法解決 

-----------------------------------------------------------------------------------------------------------------------------------

ALTER SYSTEM SET standby_file_management='MANUAL' SCOPE=BOTH;

alter database create datafile  '/oracle/db/dbs/UNNAMED00011' as '/orcldata/ocr7.dbf';

 --即透過異常檔案建立出需要應用的正確檔名。

ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH;

ALTER DATABASE RECOVER  managed standby database disconnect from session 


-----------------------------------------------------------------------------------------------------------------------------------



由於之前透過rename資料檔名:

ALTER DATABASE RENAME FILE '/oracle/db/dbs/UNNAMED00011'  TO '/orcldata/ocr7.dbf';

再次檢查,select name,file# from v$datafile;

發現11好檔案對應已經是正常的檔案:/orcldata/ocr7.dbf,即無法透過官網提示的解決方案操作。


試圖透過rename 操作複用以前的情況:ALTER DATABASE RENAME FILE '/orcldata/ocr7.dbf'  

TO ' /oracle/db/dbs/UNNAMED00011';

提示不成功,因為沒有檔案/oracle/db/dbs/UNNAMED00011,該檔案在作業系統下本來不存在。


再嘗試將/oracle/db/dbs/UNNAMED00011 rename 到另外一個檔案 /orcldata/ocr8.dbf,然後

將/orcldata/ocr8.dbf rename 到 /orcldata/ocr7.dbf,方案依然失敗,如下:


ALTER DATABASE RENAME FILE '/oracle/db/dbs/UNNAMED00011'  TO '/orcldata/ocr8.dbf';

ALTER DATABASE RENAME FILE '/orcldata/ocr7.dbf'  TO '/orcldata/ocr8.dbf';


最終解決辦法,見(四 解決方案 2)


四 最終解決方案


1 當主庫新增資料檔案,備庫無法應用,在$ORACLE_HOME/dbs目錄下產生UNNAMED000N檔案時候,參考以下解決方案:


啟動備庫到日誌應用狀態報錯:


Sun Jul 5 23:28:23 2009

Media Recovery Log /opt/oracle/archivelog/1_47_689973859.dbf

Media Recovery Log /opt/oracle/archivelog/1_48_689973859.dbf

Media Recovery Log /opt/oracle/archivelog/1_49_689973859.dbf

WARNING: File being created with same name as in Primary

Existing file may be overwritten

File #5 added to control file as 'UNNAMED00005'.

Originally created as:

'/opt/oracle/oradata/mmstest/test01.dbf'

Recovery was unable to create the file as:

'/opt/oracle/oradata/mmstest/test01.dbf'

Errors with log /opt/oracle/archivelog/1_49_689973859.dbf


出現此種情況,進一步的告警日誌可能會報出如下錯誤:



Sun Jul 5 23:28:28 2009

Errors in file /opt/oracle/admin/mmstest/bdump/mmstest_mrp0_32062.trc:

ORA-19502: write error on file "/opt/oracle/oradata/mmstest/test01.dbf", blockno 1024 (blocksize=8192)

ORA-27072: File I/O error

Linux Error: 9: Bad file descriptor

Additional information: 4

Additional information: 1024

Additional information: 397312

Some recovered datafiles maybe left media fuzzy

Media recovery may continue but open resetlogs may fail

Sun Jul 5 23:28:29 2009

Errors in file /opt/oracle/admin/mmstest/bdump/mmstest_mrp0_32062.trc:

ORA-19502: write error on file "/opt/oracle/oradata/mmstest/test01.dbf", blockno 1024 (blocksize=8192)

ORA-27072: File I/O error

Linux Error: 9: Bad file descriptor

Additional information: 4

Additional information: 1024

Additional information: 397312

以及嘗試recover時可能再次出現:


Mon Jul 6 01:36:30 2009

Errors in file /opt/oracle/admin/mmstest/bdump/mmstest_mrp0_32589.trc:

ORA-01111: name for data file 5 is unknown - rename to correct file

ORA-01110: data file 5: '/opt/oracle/product/10.2.0/dbs/UNNAMED00005'

ORA-01157: cannot identify/lock data file 5 - see DBWR trace file

ORA-01111: name for data file 5 is unknown - rename to correct file

ORA-01110: data file 5: '/opt/oracle/product/10.2.0/dbs/UNNAMED00005'


出現這些錯誤時MRP程式會停止工作,恢復中斷:


Mon Jul 6 01:36:30 2009

MRP0: Background Media Recovery process shutdown (mmstest)


在修正相關的問題之後,我們可以進行如下一系列的操作來恢復這些錯誤:


SQL> alter system set standby_file_management=manual;


System altered.


SQL> alter database create datafile 

2 '/opt/oracle/product/10.2.0/dbs/UNNAMED00005' as '/opt/oracle/oradata/mmstest/test01.dbf';


Database altered.


SQL> alter system set standby_file_management=auto;


System altered.


SQL> recover managed standby database disconnect from session;

Media recovery complete.


此時備庫的恢復得以繼續:

Mon Jul 6 01:41:14 2009

ALTER SYSTEM SET standby_file_management='MANUAL' SCOPE=MEMORY;

Mon Jul 6 01:42:13 2009

alter database create datafile

'/opt/oracle/product/10.2.0/dbs/UNNAMED00005' as '/opt/oracle/oradata/mmstest/test01.dbf'

Mon Jul 6 01:42:14 2009

Completed: alter database create datafile

'/opt/oracle/product/10.2.0/dbs/UNNAMED00005' as '/opt/oracle/oradata/mmstest/test01.dbf'

Mon Jul 6 01:42:26 2009

ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=MEMORY;

Mon Jul 6 01:42:40 2009

ALTER DATABASE RECOVER managed standby database disconnect from session

Mon Jul 6 01:42:40 2009

Attempt to start background Managed Standby Recovery process (mmstest)

MRP0 started with pid=16, OS id=32607

Mon Jul 6 01:42:41 2009

MRP0: Background Managed Standby Recovery process started (mmstest)

Managed Standby Recovery not using Real Time Apply

parallel recovery started with 3 processes

Media Recovery Log /opt/oracle/archivelog/1_49_689973859.dbf

Mon Jul 6 01:42:47 2009

Completed: ALTER DATABASE RECOVER managed standby database disconnect from session

Mon Jul 6 01:43:02 2009

Media Recovery Log /opt/oracle/archivelog/1_50_689973859.dbf

Mon Jul 6 01:43:17 2009

Media Recovery Log /opt/oracle/archivelog/1_51_689973859.dbf

Mon Jul 6 01:43:32 2009

Media Recovery Log /opt/oracle/archivelog/1_52_689973859.dbf

Mon Jul 6 01:43:45 2009

Media Recovery Log /opt/oracle/archivelog/1_53_689973859.dbf


正常情況下的配置及檔案建立,其提示應該類似如下過程:

Mon Jul 6 01:53:28 2009

WARNING: File being created with same name as in Primary

Existing file may be overwritten

Recovery created file /opt/oracle/oradata/mmstest/wztest02.dbf

Successfully added datafile 7 to media recovery

Datafile #7: '/opt/oracle/oradata/mmstest/wztest02.dbf'

Media Recovery Log /opt/oracle/archivelog/1_80_689973859.dbf


在這個測試環境中,是由於空間不足導致的檔案建立失敗。


注意,在以上步驟中,如果standby_file_management設定為AUTO時,執行create命令會遇到如下錯誤:

SQL> alter database rename 

2 file '/opt/oracle/product/10.2.0/dbs/UNNAMED00005' to '/opt/oracle/oradata/mmstest/test01.dbf';


alter database rename

*

ERROR at line 1:

ORA-01511: error in renaming log/data files

ORA-01275: Operation RENAME is not allowed if standby file management is automatic.


2 由於報錯提示一個資料檔案異常,可以採取恢復備庫單個資料檔案的辦法。

  主庫先備份對應資料檔案的備份copy,再建立一個standby控制檔案

              在備庫端,先使用主庫產生的standby控制檔案,重新命名到引數檔案中記錄的控制檔案指向:

           /orcldata/control01.ctl和/orcldata/control02.ctl

              啟動備庫到mount狀態,恢復11號資料檔案 recover datafile 11

            

具體步驟是:


主庫端:

主庫先備份一個資料檔案的映像


rman target /

backup as copy datafile '/orcldata/ocr7.dbf' format='/orcldata/ocr7_stb';


sqlplus / as sysdba


主庫生成standby控制檔案:


alter database create standby controlfile as '/orcldata/control_stb.ctl';


ftp control_stb.ctl和 ocr7_stb到備庫


備庫:


備份備庫原有的控制檔案

cp /orcldata/control01.ctl /oradata 

cp /orcllog/control02.ctl  /oradata 

使用主庫生成的statdby控制檔案:control_stb.ctl 


cp /orcldata/control_stb.ctl /orcldata/control01.ctl

cp /orcldata/control_stb.ctl  /orcllog/control02.ctl


啟動資料庫到mount:

startup pfile='/orcldata/initorcl.int' mount;


查詢需要恢復的資料檔案:

select  ERROR  from  v$recover_file;

提示 11號檔案需要恢復


recover datafile 11


提示需要使用當前的日誌檔案,再執行以下命令


alter database recover managed standby database using current logfile disconnect from session;

 

提示需要先取消應用

 

alter database recover managed standby database cancel


再執行recover datafile 11


--recover managed standby database; 


成功,日誌逐漸恢復

ok....



檢視alter日誌動態變化,逐個應用歸檔日誌,約過半小時,沒有應用的日誌全部應用完畢


在主庫切換日誌,備庫正常應用,至此解決問題。


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25846553/viewspace-2699910/,如需轉載,請註明出處,否則將追究法律責任。

相關文章