GoldenGate Extract abends: unable to queue I/O, I/O beyond file size

Posted by paulyibinyi on 2011-10-10

Applies to:

Oracle GoldenGate - Version: 10.4.0.99 and later   [Release: 10.4.0 and later ]
Information in this document applies to any platform.

Symptoms

An Extract that reads archived logs on ASM hits the following error:

The problem now occurs with a single-Extract configuration. Extract abends with:
11864334 records processed as of 2010-07-21 02:56:02 (rate 102,delta 303)
2010-07-21 02:58:34.374 Redo thread 2: Online log +INDEX01/actppr/onlinelog/group_4.264.669999177 on sequence# 48708 has missing trailing blocks.
2010-07-21 02:58:52.752 Redo thread 2: Online log +INDEX01/actppr/onlinelog/group_4.264.669999177 on sequence# 48710 has missing trailing blocks.
2010-07-21 02:58:53.132 Redo thread 2: Corrupted data (non-ALO) in archived log +FLASH01/actppr/archivelog/2010_07_21/thread_2_seq_48710.1609.724906723 on sequence# 48710
2010-07-21 02:59:05.717 Redo thread 2: Failed to read in more data from log +FLASH01/actppr/archivelog/2010_07_21/thread_2_seq_48710.1609.724906723 on sequence# 48710
2010-07-21 02:59:18.654 Redo thread 2: Failed to read in more data from log +FLASH01/actppr/archivelog/2010_07_21/thread_2_seq_48710.1609.724906723 on sequence# 48710
212146455571218590 Redo Thread 2: thread abend: REDO_read_transaction( 1, (nil), Reading ASM file +FLASH01/actppr/archivelog/2010_07_21/thread_2_seq_48710.1609.724906723, SQL : (27091) ORA-27091: unable to queue I/O
212146455571218590 Redo Thread 2: ORA-17510: Attempt to do i/o beyond file size
212146455571218590 Redo Thread 2: ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 124
212146455571218590 Redo Thread 2: ORA-06512: at line 1 )-> 500

2010-07-21 02:59:31 GGS ERROR 190 Reading ASM file +FLASH01/actppr/archivelog/2010_07_21/thread_2_seq_48710.1609.724906723, SQL : (27091) ORA-27091: unable to queue I/O
ORA-17510: Attempt to do i/o beyond file size
ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 124
ORA-06512: at line 1.

Cause

There can be two causes for this error:
1. The ASM diskgroup is frequently mounted and dismounted, which makes it temporarily unavailable.

2. When Extract processes an archived log, it waits for the log to be completely written out before it starts processing it. If Extract cannot validate this by checking the block header of the very last block, and the file size is not increasing, Extract abends after waiting about 10 seconds.

Solution

Depending on the cause, one of the following solutions or workarounds may be used:

1. Check the database alert logs. If there are many mount and dismount messages for the ASM diskgroup, this is likely a known issue, particularly in an environment that supports multiple databases: after an archive log is written, the diskgroup is dismounted, and if another environment tries to access the diskgroup while it is being dismounted, errors result. Creating a small dummy tablespace in the affected diskgroup prevents the diskgroup from being constantly mounted and dismounted, which eliminates this as a possible cause.

e.g., CREATE TABLESPACE dummy DATAFILE '+FLASH01' SIZE 10M;

In this way there is always a file open in the diskgroup and therefore it will not be repeatedly mounted and dismounted.
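
The alert log check in step 1 can be scripted. The following is a minimal sketch in Python, not an Oracle-supplied tool. It assumes the mount and dismount messages look like "SUCCESS: diskgroup FLASH01 was mounted"; that wording, the function name count_mount_events, and the sample lines are assumptions for illustration, so adjust the pattern to whatever your alert log actually contains.

```python
import re
from collections import Counter

# Assumed message shape: "... diskgroup <NAME> was mounted|dismounted".
# Adjust this pattern if your alert log words these events differently.
EVENT_RE = re.compile(r"diskgroup\s+(\S+)\s+was\s+(mounted|dismounted)", re.IGNORECASE)


def count_mount_events(lines, diskgroup):
    """Count mount and dismount messages for one diskgroup (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group(1).upper() == diskgroup.upper():
            counts[match.group(2).lower()] += 1
    return counts


# Made-up sample lines; in practice, pass an open alert log file instead.
sample = [
    "SUCCESS: diskgroup FLASH01 was mounted",
    "SUCCESS: diskgroup FLASH01 was dismounted",
    "SUCCESS: diskgroup FLASH01 was mounted",
    "SUCCESS: diskgroup INDEX01 was mounted",
    "Thread 2 advanced to log sequence 48711",
]

counts = count_mount_events(sample, "FLASH01")
print(counts["mounted"], counts["dismounted"])
```

A high count of both events over a short period would point to cause 1. To scan a real file, something like count_mount_events(open(path), "FLASH01") should work, since the function only iterates over lines.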

2. If the above is not the cause of the problem, the following parameter may be used:
    "TRANLOGOPTIONS COMPLETEARCHIVEDLOGTIMEOUT 600"

When Extract processes an archived log, it waits for the log to be completely written out before it starts processing it. If Extract cannot validate this by checking the block header of the very last block, and the file size is not increasing, it abends after waiting about 10 seconds. The COMPLETEARCHIVEDLOGTIMEOUT option takes an integer that specifies the number of seconds to wait.
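
For context, here is a sketch of where the parameter might sit in an Extract parameter file. Only the TRANLOGOPTIONS COMPLETEARCHIVEDLOGTIMEOUT 600 line comes from this note. The group name ext1, the logins, the trail path, and the table specification are hypothetical placeholders, and your existing file will differ.

```
-- Hypothetical Extract parameter file; all names below are placeholders.
EXTRACT ext1
USERID ggadmin, PASSWORD <password>
-- ASM login used to read logs stored in ASM (placeholder credentials).
TRANLOGOPTIONS ASMUSER sys@ASM, ASMPASSWORD <password>
-- From this note: wait up to 600 seconds for an archived log to be completely written.
TRANLOGOPTIONS COMPLETEARCHIVEDLOGTIMEOUT 600
EXTTRAIL ./dirdat/aa
TABLE hr.*;
```

Parameter file changes are normally picked up only when the process restarts, so Extract would need to be stopped and started after the edit.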

Source: "ITPUB Blog", link: http://blog.itpub.net/7199859/viewspace-708922/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
