Stuck recovery of database ORA-00600[3020] (Doc ID 283269.1)

rongshiyuan發表於2014-06-28

Stuck recovery of database ORA-00600[3020] (Doc ID 283269.1)


In this Document

Symptoms
Cause
Solution
References

Applies to:

Enterprise Manager for Oracle Database - Version 10.1.0.2 to 11.1 [Release 10.1 to 10.2]
Oracle Database - Enterprise Edition - Version 9.0.1.4 to 11.1.0.6 [Release 9.0.1 to 11.1]
Oracle Database - Personal Edition - Version 9.2.0.1 to 11.1.0.6 [Release 9.2 to 11.1]
Information in this document applies to any platform.
***Checked for relevance on 16-Apr-2014***

Symptoms

While doing a recovery the recovery process may fails with ORA-600[3020] error message

ORA-00600: internal error code, arguments: [3020], [2885689059], [1], [419819],[26750], [808], [], []
ORA-10567: Redo is inconsistent with data block (file# %s, block# %xxxx)
ORA-10564: tablespace
ORA-01110: data file %s

Cause

Recovery stops because of failed consistency checks, a problem called stuck recovery. Stuck recovery can occur when an underlying operating system or storage system loses a write issued by the database during normal operation. There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.

The database signals an internal error when applying the redo. This problem can be caused by an Oracle Database bug or may be because of I/O problem ( hardware or O/S related issue )

There is a known EMC issue related to an RDBMS ORA-600 [3020] where the root-cause is on OS/Hardware level.

Details from EMC on the nature of the fix (problem with Symmetrix microcode)
     ID: emc230687
     Domain: EMC1
    Solution Class: 3.X Compatibility

Solution

When media recovery encounters a problem, the alert log may indicate that recovery can continue if it is allowed to corrupt the data block causing the problem. The alert log contains information about the block: its block type, block address, the tablespace it belongs to, and so forth. For blocks containing user data, the alert log may also report the data object number.

In this case, the database can proceed with recovery if it is allowed to mark the problem block as corrupt. Nevertheless, this response is not always advisable. For example, if the block is an important block in the SYSTEM tablespace, marking the block as corrupt can eventually prevent you from opening the recovered database. Another consideration is whether the recovery problem is isolated. If this problem is followed immediately by many other problems in the redo stream, then you may want to open the database with the RESETLOGS option.

For a block containing user data, you can usually query the database to find out which object or table owns this block. If the database is not open, then you should be able to open the database read-only, even if you are recovering a whole database backup. The following example cancels recovery and opens read-only:

CANCEL 
ALTER DATABASE OPEN READ ONLY; 


From the alert.log error message we can found the dataifle and its corresponding block number. Even for the blocks containing user data, the alert log may also report the data object number. From this we can get the object information.

SQL>SELECT SEGMENT_NAME FROM DBA_EXTENTS WHERE FILE_ID= &file_number 
AND &BLOCK_NUMBER BETWEEN BLOCK_ID AND BLOCK_ID+BLOCKS-1;


-- where  File_number  is the datafile number in the error message and
 Block_Number is the block number in the error message.


In case we get the data object number in the alert.log then we can determine the owner, object name, and object type by issuing this query:

SQL>SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_TYPE
FROM DBA_OBJECTS
WHERE DATA_OBJECT_ID = &Object_number;

-- Where object_number is object_id in the error message.

To determine whether a recovery problem is isolated, we can run a diagnostic trial recovery, which scans the redo stream for problems but does not actually make any changes to the recovered database. If a trial recovery discovers any recovery problems, then it reports them in the alert_SID.log. You can use the RECOVER ... TEST statement to invoke trial recovery. Kindly refer Note 283262.1 Trial Recovery

Note : If the problem is not isolated or its belongs to SYSTEM tablespace then its better to open the database with the RESETLOGS option.


If the block is relatively unimportant like belogs to index tablespace or if the problem is isolated, then it is better to corrupt the block. If you decide to allow recovery to proceed in spite of block corruptions, then run the RECOVER command with the ALLOW n CORRUPTION clause, where n is the number of allowable corrupt blocks.

To allow recovery to corrupt blocks:

1. Ensure that all normal recovery preconditions are met.

2. Run the RECOVER command, allowing a single corruption, repeating as necessary for each corruption to be made. for an example :

SQL>RECOVER DATABASE ALLOW 1 CORRUPTION; 

Note : The ALLOW integer CORRUPTION clause lets you specify, in the event of logfile corruption, the number of corrupt blocks that can be tolerated while allowing recovery to proceed.

When you use this clause during trial recovery (that is, in conjunction with the TEST clause), integer can exceed 1. When using this clause during normal recovery, integer cannot exceed 1.

References

NOTE:283262.1 - Trial Recovery




來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/17252115/viewspace-1197752/,如需轉載,請註明出處,否則將追究法律責任。

相關文章