Skip a transaction in GoldenGate (OGG)

Posted by shilei1 on 2014-08-11

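We currently use OGG for active-active synchronization, including DDL, between two Oracle OLTP databases. We just found that the Replicat process had ABENDED. The cause is analyzed below.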

The ggserr.log log:

2012-10-31 17:09:05  WARNING OGG-00869  Oracle GoldenGate Delivery for Oracle, ricme.prm:  OCI Error ORA-02292: integrity constraint (ICME.FK_NOPROSCORE_TO_STU) violated - child record found (status = 2292). UPDATE "ICME"."ICME_STUDENT" SET "IC_CODE" = :a1,"REMARK" = :a2,"MODIFY_TIME" = :a3 WHERE "IC_CODE" = :b0.
2012-10-31 17:09:05 WARNING OGG-01004 Oracle GoldenGate Delivery for Oracle, ricme.prm:  Aborted grouped transaction on 'ICME.ICME_STUDENT', Database error 2292 (OCI Error ORA-02292: integrity constraint (ICME.FK_NOPROSCORE_TO_STU) violated - child record found (status = 2292). UPDATE "ICME"."ICME_STUDENT" SET "IC_CODE" = :a1,"REMARK" = :a2,"MODIFY_TIME" = :a3 WHERE "IC_CODE" = :b0).
2012-10-31 17:09:05 WARNING OGG-01003 Oracle GoldenGate Delivery for Oracle, ricme.prm:  Repositioning to rba 84509907 in seqno 40.
2012-10-31 17:09:05 WARNING OGG-01154 Oracle GoldenGate Delivery for Oracle, ricme.prm: SQL error 2292 mapping ICME.ICME_STUDENT to ICME.ICME_STUDENT OCI Error ORA-02292: integrity constraint (ICME.FK_NOPROSCORE_TO_STU) violated - child record found (status = 2292). UPDATE "ICME"."ICME_STUDENT" SET "IC_CODE" = :a1,"REMARK" = :a2,"MODIFY_TIME" = :a3 WHERE "IC_CODE" = :b0.
2012-10-31 17:09:05 WARNING OGG-01003 Oracle GoldenGate Delivery for Oracle, ricme.prm:  Repositioning to rba 84509907 in seqno 40.
2012-10-31 17:09:05 ERROR   OGG-01296 Oracle GoldenGate Delivery for Oracle, ricme.prm:  Error mapping from ICME.ICME_STUDENT to ICME.ICME_STUDENT.
2012-10-31 17:09:05 ERROR   OGG-01668 Oracle GoldenGate Delivery for Oracle, ricme.prm:  PROCESS ABENDING.
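The position that Replicat fell back to, and the Oracle error that caused it, can be pulled out of ggserr.log text mechanically. The following is a minimal sketch, not part of the original procedure: the function name `summarize_abend` and the shortened sample text are illustrative assumptions, and the regular expressions rely only on the message wording visible in the log above.

```python
import re

# Matches the OGG-01003 message that reports where Replicat fell back to.
REPOSITION = re.compile(r"Repositioning to rba (\d+) in seqno (\d+)")
# Matches the first ORA- error code that appears in the text.
ORA_CODE = re.compile(r"ORA-(\d{5})")


def summarize_abend(log_text):
    """Return (seqno, rba, ora_code), or None if no repositioning message exists.

    Hypothetical helper: it only knows the message wording shown in this post.
    """
    pos = REPOSITION.search(log_text)
    if pos is None:
        return None
    ora = ORA_CODE.search(log_text)
    return int(pos.group(2)), int(pos.group(1)), ora.group(0) if ora else None


# Shortened excerpt of the log entries shown above.
sample = (
    "2012-10-31 17:09:05  WARNING OGG-00869  Oracle GoldenGate Delivery for Oracle, ricme.prm:  "
    "OCI Error ORA-02292: integrity constraint (ICME.FK_NOPROSCORE_TO_STU) violated "
    "- child record found (status = 2292).\n"
    "2012-10-31 17:09:05 WARNING OGG-01003 Oracle GoldenGate Delivery for Oracle, ricme.prm:  "
    "Repositioning to rba 84509907 in seqno 40.\n"
)

print(summarize_abend(sample))  # -> (40, 84509907, 'ORA-02292')
```

In practice the text would be read from ggserr.log in the OGG home; only the most recent lines matter, since older incidents produce the same messages.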

The log shows roughly what the SQL was. The parameter file of my Replicat group specifies a DISCARDFILE, which recorded the record image:

[oracle@ggsdb dirrpt]$ vi ricme.dsc 
OCI Error ORA-02292: integrity constraint (ICME.FK_NOPROSCORE_TO_STU) violated - child record found (status = 2292). UPDATE "ICME"."ICME_STUDENT" SET "IC_CODE" = :a1,"REMARK" = :a2,"MODIFY_TIME" = :a3 WHERE "IC_CODE" = :b0
Aborting transaction on dirdat/l2 beginning at seqno 40 rba 84509907
error at seqno 40 rba 84509907
Problem replicating ICME.ICME_STUDENT to ICME.ICME_STUDENT
Mapping problem with compressed key update record (target format)...

IC_CODE = 1114020AY 
IC_CODE = 3 
REMARK = 
000000: bf a8 ba c5 d6 d8 b8 b4
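The REMARK value is shown only as a hex dump because it is not ASCII. As a small sketch, the bytes can be decoded by hand. The character set is an assumption here (a GBK-compatible database character set such as ZHS16GBK); the post does not state which one the database uses.

```python
# Hex bytes copied from the REMARK column in the discard file above.
remark_hex = "bf a8 ba c5 d6 d8 b8 b4"

# Assumption: the column data is GBK-encoded (e.g. a ZHS16GBK database).
remark = bytes.fromhex(remark_hex).decode("gbk")

print(remark)  # -> 卡号重复
```

Under that assumption the remark reads "duplicate card number", which fits the change to the student card number discussed in this post.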

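Seeing this SQL, I confirmed what had been changed and asked a colleague. Sure enough, it was an operator mistake: a student card number had been modified. That column has a trigger that cascades the change to many related tables, and there are foreign key constraints as well. On the target database the trigger is disabled, so the update hit the foreign key constraint and failed there. The colleague later changed the value back, so the data on the primary has already been restored, which means I can skip this transaction.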

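First, find the RBA (relative byte address) that the Replicat process has applied up to, so that the starting RBA for the next apply can be located. The RBA should not be confused with the CSN (commit sequence number), which for Oracle is the SCN: the RBA is the actual byte position in the trail file that Replicat will fseek() to when it next starts.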

GGSCI (ggsdb) 4> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    ABENDED     RICME       00:00:00      00:29:41    

GGSCI (ggsdb) 5> info rep ricme

REPLICAT   RICME     Last Started 2012-10-31 17:23   Status ABENDED
Checkpoint Lag       00:00:00 (updated 00:29:47 ago)
Log Read Checkpoint  File dirdat/l2000040
                     2012-10-31 17:08:56.879106  RBA 84509907
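The checkpoint file and RBA can also be read out of the `info replicat` output by a script. The sketch below is illustrative only: `read_checkpoint` and `trail_file` are hypothetical helper names, and the six-digit sequence padding is an assumption taken from the file name dirdat/l2000040 seen above (prefix dirdat/l2, seqno 40); other OGG versions may pad differently.

```python
import re

# "Log Read Checkpoint  File <path>" followed, on the next line, by "... RBA <n>".
CHECKPOINT = re.compile(
    r"Log Read Checkpoint\s+File\s+(\S+)\s+.*?RBA\s+(\d+)", re.DOTALL
)


def read_checkpoint(info_text):
    """Return (trail_file, rba) from 'info replicat' output, or None."""
    m = CHECKPOINT.search(info_text)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


def trail_file(prefix, seqno, width=6):
    """Build a trail file name; the default width of 6 is an assumption."""
    return f"{prefix}{seqno:0{width}d}"


info_output = (
    "REPLICAT   RICME     Last Started 2012-10-31 17:23   Status ABENDED\n"
    "Checkpoint Lag       00:00:00 (updated 00:29:47 ago)\n"
    "Log Read Checkpoint  File dirdat/l2000040\n"
    "                     2012-10-31 17:08:56.879106  RBA 84509907\n"
)

print(read_checkpoint(info_output))  # -> ('dirdat/l2000040', 84509907)
print(trail_file("dirdat/l2", 40))   # -> dirdat/l2000040
```

The two results agree with the discard file, which reports the same position as seqno 40, rba 84509907 on dirdat/l2.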

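From the output above we know that Replicat group RICME had applied up to RBA 84509907 in dirdat/l2000040. To skip this transaction we only need to resume from the next record, but that is not simply the current RBA plus 1: the RBA must be a record position as written by OGG. If an invalid address is given, an exception is recorded in ggserr.log after startup. The logdump utility in the OGG installation directory can be used to locate the "real" position of the next record.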

[oracle@ggsdb ogg11r2]$ ./logdump
Oracle GoldenGate Log File Dump Utility for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.

Logdump 1 >open dirdat/l2000040
Current LogTrail is /oracle/ogg11r2/dirdat/l2000040
Logdump 2 >pos 84509907
Reading forward from RBA 84509907
Logdump 3 >n
2012/10/31 17:08:58.914.149 GGSPKUpdate          Len 69 RBA 84509907
Name: ICME.ICME_STUDENT
After  Image:                                             Partition 4 G  b
 0011 0000 000d 0000 0009 3131 3134 3032 3041 5900 | ..........1114020AY.
 0000 0500 0000 0133 0018 000c 0000 0008 bfa8 bac5 | .......3............
 d6d8 b8b4 001d 0015 0000 3230 3132 2d31 302d 3331 | ..........2012-10-31
 3a31 373a 3034 3a33 39 | :17:04:39
Logdump 4 >n
2012/10/31 17:08:58.914.149 FieldComp            Len 23 RBA 84510103
Name: ICME.ICME_PROJECT_SCORE
After  Image:                                             Partition 4 G  m
 0000 000a 0000 0000 0000 0252 1521 0001 0005 0000 | ...........R.!......
 0001 33 | ..3
Logdump 5 >exit

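pos is short for position and moves to the point where Replicat would start. n is short for next. The first n displays the record currently being applied: it is an update (GGSPKUpdate), shown with the table name and the image values. To skip it we enter n once more, and the next record turns out to be at RBA 84510103, which is certainly not the previous RBA plus 1. We can now alter the Replicat process so that it starts at RBA 84510103.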
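When logdump output has been saved to a file, picking out the RBA that follows the failed record can be scripted. This is a rough sketch under stated assumptions: `next_record_rba` and `alter_command` are hypothetical helpers, and the pattern relies only on the "Len ... RBA ..." record header layout visible in the logdump session above.

```python
import re

# A logdump record header line contains "Len <n> RBA <n>".
HEADER = re.compile(r"Len\s+(\d+)\s+RBA\s+(\d+)")


def next_record_rba(logdump_text, failed_rba):
    """Return the RBA of the record header listed after failed_rba, or None."""
    rbas = [int(m.group(2)) for m in HEADER.finditer(logdump_text)]
    if failed_rba in rbas:
        idx = rbas.index(failed_rba)
        if idx + 1 < len(rbas):
            return rbas[idx + 1]
    return None


def alter_command(group, rba):
    """Format the GGSCI command used in this post to reposition Replicat."""
    return f"alter replicat {group}, extrba {rba}"


# Header lines copied from the logdump session above (hex dumps omitted).
logdump_output = (
    "Reading forward from RBA 84509907\n"
    "2012/10/31 17:08:58.914.149 GGSPKUpdate          Len 69 RBA 84509907\n"
    "Name: ICME.ICME_STUDENT\n"
    "2012/10/31 17:08:58.914.149 FieldComp            Len 23 RBA 84510103\n"
    "Name: ICME.ICME_PROJECT_SCORE\n"
)

rba = next_record_rba(logdump_output, 84509907)
print(rba)                          # -> 84510103
print(alter_command("ricme", rba))  # -> alter replicat ricme, extrba 84510103
```

The sketch only prints the command; running it in GGSCI remains a manual, deliberate step.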

GGSCI (ggsdb) 1> alter replicat ricme, extrba 84510103
REPLICAT altered.
GGSCI (ggsdb) 3> start ricme

Sending START request to MANAGER ...
REPLICAT RICME starting

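Of course, if further transactions fail, you can keep using next with the method above. If several consecutive transactions need to be skipped, however, there is another method that does not require locating each RBA by hand: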

start rep ricme skiptransaction

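According to the GoldenGate reference for START REPLICAT, SKIPTRANSACTION skips the first transaction after Replicat's startup position, so the command has to be repeated for each further failing transaction. You do not see in advance what will be skipped, but the skipped records are also written to the discard file if DISCARDFILE is configured in the parameter file.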

Source: ITPUB blog, link: http://blog.itpub.net/196700/viewspace-1249079/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
