Troubleshooting ORA-600 [13013] on an Oracle Database

Posted by yuntui on 2016-11-03

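Around noon I got a call: the Oracle database instance behind the customer's core system had gone down. After connecting remotely, I found the alert log full of the following error, repeating very frequently: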

......
Fri Jul 25 13:20:14 2014
Errors in file /u01/app/oracle/diag/rdbms/d012band/d012band/trace/d012band_smon_5964354.trc  (incident=43361):
ORA-00600: internal error code, arguments: [13013], [5001], [268], [8452274], [7], [8452274], [17], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/d012band/d012band/incident/incdir_43361/d012band_smon_5964354_i43361.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Jul 25 13:20:17 2014
Dumping diagnostic data in directory=[cdmp_20140725132017], requested by (instance=1, osid=5964354 (SMON)), summary=[incident=43361].
Starting background process SMCO
......

After a while, the instance was finally terminated by the PMON process, which took the core system down.
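The alert log line "SMON encountered 1 out of maximum 100 non-fatal internal errors" suggests that SMON tolerates only a bounded number of these errors. Whether reaching that limit is what led to the termination here is an assumption; the excerpt does not say so. A minimal sketch for tracking the counter in an alert log follows; the function name `smon_error_budget` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the counter line that SMON writes to the alert log.
SMON_COUNTER = re.compile(
    r"SMON encountered (\d+) out of maximum (\d+) non-fatal internal errors"
)


def smon_error_budget(line: str) -> Optional[Tuple[int, int]]:
    """Return (errors_so_far, maximum) if the line is an SMON counter line."""
    match = SMON_COUNTER.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "SMON encountered 1 out of maximum 100 non-fatal internal errors."
print(smon_error_budget(sample))  # (1, 100)
```

Applied to every line of the alert log, this shows how quickly the counter is climbing.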

Customer environment: Oracle Database 11.2.0.3.0 on IBM AIX 6.1, single instance, protected by IBM HACMP in active/standby mode.


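I consulted the MOS notes New and Improved: ORA-600 [13013] "Unable to get a Stable set of Records" (Doc ID 1438920.1) and ORA-600 [13013] "Unable to get a Stable set of Records" (Doc ID 28185.1). According to them, this error is raised when a DML operation runs against a table that has a corrupted index. The fix is to identify the table and the damaged index, then rebuild the index.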

The ORA-600 arguments are described as follows:

6 Argument format
  =================

  This format relates to Oracle Server 8.0.3 and above

    Arg [a] Passcount
    Arg [b] Data Object number
    Arg [c] Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated
    Arg [d] Row Slot number
    Arg [e] Decimal RDBA of block being updated (Typically same as [c])
    Arg [f] Code
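To make the mapping concrete, the sketch below applies this argument format to the error line from the alert log above. The class and function names (`Ora600_13013`, `parse_ora600_13013`) are hypothetical, and the field names are my own labels for Arg [a] through Arg [f].

```python
import re
from typing import NamedTuple


class Ora600_13013(NamedTuple):
    passcount: int        # Arg [a]
    data_object_id: int   # Arg [b]
    rdba_row: int         # Arg [c], decimal RDBA of the block holding the row
    row_slot: int         # Arg [d]
    rdba_updated: int     # Arg [e], decimal RDBA of the block being updated
    code: int             # Arg [f]


def parse_ora600_13013(line: str) -> Ora600_13013:
    """Extract arguments [a]..[f] from an ORA-00600 [13013] alert log line."""
    # Collect every bracketed value; empty brackets ("[]") give empty strings.
    values = re.findall(r"\[([^\]]*)\]", line)
    if not values or values[0] != "13013":
        raise ValueError("not an ORA-00600 [13013] line")
    # The six values after [13013] are arguments [a]..[f].
    return Ora600_13013(*(int(v) for v in values[1:7]))


line = ("ORA-00600: internal error code, arguments: [13013], [5001], [268], "
        "[8452274], [7], [8452274], [17], [], [], [], [], []")
err = parse_ora600_13013(line)
print(err.data_object_id)   # 268
print(hex(err.rdba_row))    # 0x80f8b2
```

For this incident, the data object number is 268 and both block addresses are 8452274 (0x80f8b2 in hexadecimal).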

Based on this, run the following SQL queries to identify the table the DML was operating on and the indexes on that table:

SQL> SELECT OWNER,OBJECT_NAME,OBJECT_TYPE FROM DBA_OBJECTS WHERE DATA_OBJECT_ID=268;

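The problem object turned out to be the SMON_SCN_TIME table owned by SYS. SYS.SMON_SCN_TIME is a data dictionary table that Oracle updates very frequently; it maintains the mapping between SCNs and timestamps.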

SQL> SELECT OWNER,INDEX_NAME FROM DBA_INDEXES WHERE TABLE_NAME='SMON_SCN_TIME';

The table has two indexes: SMON_SCN_TIME_TIM_IDX and SMON_SCN_TIME_SCN_IDX.

Analyze the table to find the problem index:

SQL> ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE;

Table analyzed.

SQL> ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE CASCADE ONLINE;

table/Index Cross Reference Failure - see trace file

The table itself validates cleanly, but validation fails as soon as its indexes are included.

Following the MOS note OERR: ORA-1499 table/Index Cross Reference Failure - see trace file (Doc ID 1499.1), I tried to identify the affected index:

Check the trace file:
          BH (0x70000078fe2df98) file#: 2 rdba: 0x0081483e (2/84030) class: 1 ba: 0x70000078ce52000
            set: 174 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 2272,19
            dbwrid: 5 obj: 272 objn: 272 tsn: 1 afn: 2 hint: f
            hash: [0x700000801ff36a0,0x700000801ff36a0] lru: [0x70000074bf81f40,0x70000074bf81ce0]
            ckptq: [NULL] fileq: [NULL] objq: [0x7000007f59b4138,0x7000007f59b4138] objaq: [0x7000007f59b4128,0x7000007f59b4128]
            use: [NULL] wait: [NULL] fast-cr-pins: 2
            st: XCURRENT md: NULL fpin: 'kdiwh15: kdifxs' tch: 12
            flags:
            LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
            buffer tsn: 1 rdba: 0x0081483e (2/84030)
            scn: 0x0000.2f4752d8 seq: 0x01 flg: 0x06 tail: 0x52d80601
            frmt: 0x02 chkval: 0x0778 type: 0x06=trans data

When using 0x0081483e in a query, drop the leading 0x; the prefix only marks the value as hexadecimal.
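As a side check, the (2/84030) printed next to rdba: 0x0081483e in the trace can be reproduced by hand. The sketch below assumes the usual smallfile-tablespace encoding, where the upper 10 bits hold the relative file number and the lower 22 bits hold the block number; bigfile tablespaces are laid out differently. `decode_rdba` and `rdba_hex_arg` are illustrative helper names, not Oracle APIs.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit RDBA into (relative file number, block number).

    Assumes the smallfile layout: 10 bits of file number, 22 bits of block.
    """
    rel_file = rdba >> 22       # upper 10 bits
    block = rdba & 0x3FFFFF     # lower 22 bits
    return rel_file, block


def rdba_hex_arg(rdba: int) -> str:
    """Format an RDBA as 8 hex digits without the 0x prefix."""
    return format(rdba, "08x")


print(decode_rdba(0x0081483E))   # (2, 84030), as shown in the trace
print(decode_rdba(8452274))      # (2, 63666), the block from the ORA-600 arguments
print(rdba_hex_arg(8452274))     # 0080f8b2
```

The string returned by `rdba_hex_arg` is in the form expected by `to_number(..., 'XXXXXXXX')`.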

Run the following SQL statement:

SELECT owner, segment_name, segment_type, partition_name
FROM   DBA_SEGMENTS
WHERE  header_file = (SELECT file# 
                      FROM   v$datafile 
                      WHERE  rfile# = dbms_utility.data_block_address_file(to_number('0080f8b2','XXXXXXXX'))
                        AND  ts#= 1)
  AND header_block = dbms_utility.data_block_address_block(to_number('0080f8b2','XXXXXXXX'));

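Unfortunately, the query returned no rows. Note that it matches only when the supplied address is a segment header block, because it compares against header_file and header_block in DBA_SEGMENTS. The value 0080f8b2 is the hexadecimal form of 8452274, the block address reported in the ORA-600 arguments.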

I then turned to the MOS note Instance Terminated With Error ORA-00474: SMON Process Terminated With Error (Doc ID 1361872.1), which contains the following passage:

CAUSE

ORA600 [13011] is raised due to indexes corruption. To verify the corruption run the following statements:


SQL> conn / as sysdba
SQL> ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE;

Table analyzed.

-- It should come out clean giving message table analyzed.

SQL> ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE CASCADE ONLINE; 

-- it should fail with Ora-1499 if at least one index is corrupted

ORA-1499 is exactly the error we hit earlier:

Error: ORA 1499 
Text: table/Index Cross Reference Failure - see trace file 
-------------------------------------------------------------------------------
Cause:  An error occurred when validating an index or a table using the 
        ANALYZE command.
        One or more entries does not point to the appropriate cross-reference.
Action: Check the trace file for more descriptive messages about the problem.
        Correct these errors.

In other words, if analyzing the table with CASCADE fails with ORA-1499, at least one of its indexes is corrupted.

SOLUTION

Rebuild corrupted indexes:


SQL> conn / as sysdba
SQL> ALTER INDEX SMON_SCN_TIME_TIM_IDX REBUILD ONLINE;
SQL> ALTER INDEX SMON_SCN_TIME_SCN_IDX REBUILD ONLINE;

then re-run


SQL> ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE CASCADE ONLINE;

Table analyzed.

Note: The last statement should not report any errors.


After both indexes were rebuilt, the ANALYZE TABLE smon_scn_time VALIDATE STRUCTURE CASCADE ONLINE statement completed successfully, and the alert log stopped reporting errors.

Takeaway: learn to make full use of MOS. Follow each lead to a different note instead of getting stuck on a single article.

--end--

Source: ITPUB blog, link: http://blog.itpub.net/30633755/viewspace-2127734/. If you repost this article, please credit the source; otherwise legal action may be taken.
