DB2 DDL操作引起的GoldenGate錯誤

keeptrying發表於2013-02-04

源端:

程式:rtop1ptop1(pump)

資料:

D:\>db2 select * from topicis.bb

 

ID

-----------

          1

          2

          3

 

  3 條記錄已選擇。

 

目標端:

程式:rtop1

資料:

[topicis@localhost ~]$ db2 "select * from topicis.bb"

 

ID        

-----------

          2

          3

          1

 

  3 record(s) selected.

 

 

修改源端表結構:

D:\>db2 alter table topicis.bb add column name varchar(10)

DB20000I  SQL 命令成功完成。

 

D:\>db2 select * from topicis.bb

 

ID          NAME

----------- ----------

          1 -

          2 -

          3 -

 

  3 條記錄已選擇。

 

這裡,extract程式沒有報錯。對於DB2GoldenGate不支援DDL語句。Replicat程式也沒有報錯。

下面執行DML操作:

D:\>db2 update topicis.bb set name='a' where id=1

DB20000I  SQL 命令成功完成。

 

D:\>db2 select * from topicis.bb

 

ID          NAME

----------- ----------

          1 a

          2 -

          3 -

 

  3 條記錄已選擇。

 

 

這樣,源端沒有報錯,目標端產生了錯誤:

2013-02-04 10:23:30  ERROR   OGG-01168  Encountered an update for target table TOPICIS.BB, which has no unique key defined.  KEYCOLS can be used to define a key.  Use ALLOWNOOPUPDATES to process the update without applying it to the target database.  Use APPLYNOOPUPDATES to force the update to be applied using all columns in both the SET and WHERE clause.

 

提示使用ALLOWNOOPUPDATESAPPLYNOOPUPDATES引數,下面看兩個引數的作用:

 

ALLOWNOOPUPDATEScontrol how Replicat responds to a “no-op” operation.

什麼是no-op” operation操作呢?

A no-op operation is one in which there is no effect on the target table.For example:

l     The source table has a column that does not exist in the ta rget table, or has a column that was excluded from replication (with a  COLSEXCEPT clause). In either case, if that source column is updated, there will be  no target column  name to use in the  SET clause within the Replicat SQL statement.

l     An update is made that sets a column to the same value as the current one. The database does not log the new value, because it did not really change. However, Oracle GoldenGate extracts the operation as a change record because the primary key was logged — but there is no column value for the  SET clause in the Replicat SQL statement.

我這裡就是第一種情況,源端topicis.aa表有name列,而目標端沒有,當修改name列時,目標端找不到該列,就認為是”no-op” operations.

With  ALLOWNOOPUPDATES, Replicat ignores the operation instead of abending. The statistics reported by Replicat will show that an update was made, but the database will not be updated.

ALLOWNOOPUPDATES引數的作用是將這些”no-op”操作忽略,從而使replicat程式不abend

 

APPLYNOOPUPDATES:

Use APPLYNOOPUPDATES to force a “no-op” update to be applied using all columns in both the SET and  WHERE clauses.

APPLYNOOPUPDATES uses whatever data is in the trail. If there is a primary key update record, it uses the before columns from the source. If there is a regular (non-key) update, it assumes that the after value is the same as the before value (otherwise it would be a primary key update).

 

下面分別試一下這兩個引數:

 

GGSCI (localhost.localdomain) 18> edit params rtop1

--Control Replicat to ignores the "no-op" operations instead of abending

ALLOWNOOPUPDATES

 

GGSCI (localhost.localdomain) 20> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING                                          

REPLICAT    RUNNING     RTOP1       00:00:00      00:00:01   

 

GGSCI (localhost.localdomain) 24> stats rtop1

*** Latest statistics since 2013-02-04 10:56:08 ***

        Total inserts                                      0.00

        Total updates                                      1.00

        Total deletes                                      0.00

        Total discards                                     0.00

        Total operations                                   1.00

 

End of Statistics.

 

The statistics reported by Replicat will show that an update was made, but the database will not be updated.

實際上表沒有被update

 

[topicis@localhost ~]$ db2 "select * from topicis.bb"

 

ID        

-----------

          2

          3

          1

 

  3 record(s) selected.

 

 

下面測試另外一個引數APPLYNOOPUPDATES:

GGSCI (localhost.localdomain) 26> stop rtop1

GGSCI (localhost.localdomain) 27> edit params rtop1

--Control Replicat to ignores the "no-op" operations instead of abending

--ALLOWNOOPUPDATES

 

--Control Replicat to force "no-op" operations to be applied using all columns in both the set and where clause

APPLYNOOPUPDATES

 

GGSCI (localhost.localdomain) 29> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING                                          

REPLICAT    RUNNING     RTOP1       00:00:23      00:00:01    

 

源端操作:

D:\>db2 update topicis.bb set name='b' where id=2

DB20000I  SQL 命令成功完成。

 

 

GGSCI (localhost.localdomain) 31> stats rtop1

*** Latest statistics since 2013-02-04 11:04:57 ***

        Total inserts                                      0.00

        Total updates                                      1.00

        Total deletes                                      0.00

        Total discards                                     0.00

        Total operations                                   1.00

 

End of Statistics.

 

[topicis@localhost ~]$ db2 "select * from topicis.bb"

 

ID        

-----------

          2

          3

          1

 

  3 record(s) selected.

 

兩個引數的效果是一樣的。

但是這兩個引數只能解決Replicat程式不處於abend狀態。兩邊的表結構一直這樣處於不一致狀態,這樣肯定不行。由於GoldenGate目前對於DB2不支援DDL,所以解決問題的根本辦法就是讓目標端表結構和源端表結構一致。

方法是先在源端extract程式中指定TABLEEXCLUDE引數,從”TABLE *”語句中除去topicis.bb表。然後使該表在兩端一致化後,再將表新增進來。

 

GGSCI (localhost.localdomain) 36> stop rtop1

 

Sending STOP request to REPLICAT RTOP1 ...

Request processed.

 

 

GGSCI (localhost.localdomain) 37> edit params rtop1

--APPLYNOOPUPDATES

 

源端:

GGSCI (liuxiaohui) 108> stop *top1

 

Sending STOP request to EXTRACT ETOP1 ...

Request processed.

 

Sending STOP request to EXTRACT PTOP1 ...

Request processed.

 

GGSCI (liuxiaohui) 109> edit params etop1

--Explicitly exclude tables from a wildcard specification

TABLEEXCLUDE topicis.bb

 

--Specify tables to be captured

TABLE topicis.*;

 

這裡,測試環境中可以直接停掉extractreplicat程式,對錶進行同步後再開啟。然而,在生產環境中,對於一個較大的表,同步過程需要一定的時間,不可能等同步完成後再啟動程式。所以必須先新增上TABLEEXCLUDE引數,啟動兩端程式,這樣,同步過程影響到的只是這一張表。

 

GGSCI (liuxiaohui) 110> start *top1

 

Sending START request to MANAGER ('GGSMGR') ...

EXTRACT ETOP1 starting

 

Sending START request to MANAGER ('GGSMGR') ...

EXTRACT PTOP1 starting

 

GGSCI (liuxiaohui) 114> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     RUNNING     ETOP1       00:00:00      00:00:02

EXTRACT     RUNNING     PTOP1       00:00:00      00:00:09

 

bb表進行同步。我這裡直接給目標端bb表新增name列。生產環境中,不知道對錶具體有哪些DDL操作,而且DDL操作後也會有很多DML操作,所以必須停止該表的活動後將表導到目標端。

[topicis@localhost ~]$ db2 "alter table topicis.bb add column name varchar(10)"

DB20000I  The SQL command completed successfully.

 

然後就可以將TABLEEXCLUDE引數去掉了。

GGSCI (liuxiaohui) 118> edit params etop1

--TABLEEXCLUDE topicis.bb

 

GGSCI (liuxiaohui) 120> start *top1

 

Sending START request to MANAGER ('GGSMGR') ...

EXTRACT ETOP1 starting

 

Sending START request to MANAGER ('GGSMGR') ...

EXTRACT PTOP1 starting

 

 

GGSCI (liuxiaohui) 121> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     RUNNING     ETOP1       00:00:00      00:00:02

EXTRACT     RUNNING     PTOP1       00:00:00      00:00:49

 

目標端:

GGSCI (localhost.localdomain) 40> start rtop1

 

Sending START request to MANAGER ...

REPLICAT RTOP1 starting

 

 

GGSCI (localhost.localdomain) 41> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING                                           

REPLICAT    RUNNING     RTOP1       00:00:00      00:00:01    

 

 

測試一下再對源端操作:

D:\>db2 update topicis.bb set name='c'

DB20000I  SQL 命令成功完成。

 

D:\>db2 select * from topicis.bb

 

ID          NAME

----------- ----------

          1 c

          2 c

          3 c

 

  3 條記錄已選擇。

 

發現資料沒同步過去,extract程式abend了。

GGSCI (liuxiaohui) 123> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     ABENDED     ETOP1       00:00:00      00:00:35

EXTRACT     RUNNING     PTOP1       00:00:00      00:00:00

 

看一下report

GGSCI (liuxiaohui) 124> view report etop1

2013-02-04 11:34:13  ERROR   OGG-00823  A key column on the after image was not present in the before image on table TOPICIS.BB column index 1 This is because column(s) were added to the table after it was originally created. This is a known issue with DB2 LUW that can only be addressed by IBM.

2013-02-04 11:34:14  ERROR   OGG-01668  PROCESS ABENDING.

 

錯誤提示:在TOPICIS.BB表上,由於在初始化建立後新增了列,導致在after image的鍵列在before image中不存在。

發出如下命令:

GGSCI (liuxiaohui) 130> alter extract etop1 begin now

EXTRACT altered.

 

GGSCI (liuxiaohui) 131> start etop1

 

Sending START request to MANAGER ('GGSMGR') ...

EXTRACT ETOP1 starting

 

 

GGSCI (liuxiaohui) 132> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     RUNNING     ETOP1       00:00:00      00:00:06

EXTRACT     RUNNING     PTOP1       00:00:00      00:00:07

 

修改程式後可以啟動。

 

再次發出update命令:

D:\>db2 update topicis.bb set name='zz'

DB20000I  SQL 命令成功完成。

 

GGSCI (liuxiaohui) 180> info all

 

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     RUNNING     ETOP1       00:00:00      00:00:06

EXTRACT     RUNNING     PTOP1       00:00:00      00:00:06

 

Extract程式沒有abend。檢視etop1ptop1程式,發現操作被程式傳遞到了目標端。但是目標端就是沒有資料,程式也沒有報錯,report也沒有記錄任何資訊:

[topicis@localhost ~]$ db2 "select * from topicis.bb"

 

ID          NAME     

----------- ----------

          2 -        

          3 -        

          1 -        

 

  3 record(s) selected.

 

沒有同步到目標端,而且程式竟然也沒有報錯。那就說明一定是程式做了某些處理。原來是引數檔案中指定了REPERROR(100,DISCARD).

檢視一下discard檔案:

[topicis@localhost ggs]$ more ./dirrpt/rtop1.dsc

有類似如下內容:

Current time: 2013-02-04 13:01:39

 

[SQL error 100][IBM][CLI Driver][DB2/LINUXX8664] SQL0100W  No row was f

ound for FETCH, UPDATE or DELETE; or the result of a query is an empty

table.  SQLSTATE=02000

 

Operation failed at seqno 13 rba 2148

Discarding record on action DISCARD on error 100

Problem replicating TOPICIS.BB to TOPICIS.BB

Record not found

Mapping problem with compressed key update record (target format)...

*

ID = 1

NAME = a

ID = 1

NAME = zz

 

說明gg還是認為兩端表結構不一致。

將目標端bb表中資料刪除,新增初始化程式進行初始化後,同步正常。

D:\>db2 update topicis.bb set name='keeptrying'

DB20000I  SQL 命令成功完成。

[topicis@localhost ~]$ db2 "select * from topicis.bb"

 

ID          NAME     

----------- ----------

          1 keeptrying

          2 keeptrying

          3 keeptrying

 

  3 record(s) selected.

 

 

 

 

 

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25744374/viewspace-753898/,如需轉載,請註明出處,否則將追究法律責任。

相關文章