如何處理IN_DOUBT的分佈事物

tolywang發表於2006-12-29

什麼是分佈事物?

在一個事物中,包含對多個資料庫資料進行處理的DML語句。


在分佈事物中,為了保證全域性資料的一致性,採用兩階段提交。如下圖:

global coordinator Commit point site

commit_point_strength = 5 commit_point_strength = 10

RemoteLocal

Insert into table @ remote

Insert into table

Commit;

兩階段提交保證分佈事物全部提交或全部回滾

系統引數commit_point_strength 定義了分佈資料庫的提交順序,commit_point_strength值較高的資料庫為commit point site,在分佈事物中最先提交,分佈事物的狀態資訊也存在該資料庫中。一般將關鍵的資料庫作為commit point site 本地資料庫為Global Coordinator

分佈事物的兩階段提交分三個過程:

1. 準備階段(PREPARE PHASE

·本地資料庫Global Coordinator向其它資料庫發出COMMIT通知

·比較所有資料庫的SCN號,將最高的SCN號作為分佈事物的全域性SCN

·所有資料庫寫線上日誌

·對分佈事物修改的表加分佈鎖,防止被讀寫

·各資料庫向Global Coordinator發出已經準備好的通知

所有參與分佈事物的資料庫必須經過上述準備,才能進入下一階段。

2. 提交階段(COMMIT PHASE

·本地資料庫Global Coordinator通知commit point site首先提交。commit point site提交後,釋放其佔有的資源,通知Global Coordinator完成提交

·本地資料庫Global Coordinator通知其它資料庫提交

·提交節點在日誌中追加一條資訊,表示分佈事物已經完成提交,並通知Global Coordinator。此時所有資料庫的資料保持了一致性。

3. 登出階段(FORGET PHASE

·本地資料庫Global Coordinator通知commit point site所有資料庫已經完成提交

·commit point site清除分佈事物的記錄和狀態資訊,並通知Global Coordinator

·Global Coordinator清除本地分佈事物的記錄和狀態資訊

此時分佈事物的兩階段提交全部完成。

如果兩階段提交完成之前,資料庫或網路出現異常,應用就會報錯,分佈事物處於IN_DOUBT狀態。一旦資料庫或網路恢復正常,系統(RECO PROCESS)會自動處理IN_DOUBT狀態的分佈事物。有些情況需要管理員手工處理IN_DOUBT狀態的分佈事物:

·IN_DOUBT狀態的分佈事物,將關鍵表鎖住,造成應用不能正常工作

·分佈資料庫需要重建

下面舉例說明不同情況下,手工處理IN_DOUBT狀態分佈事物的詳細過程:

本地資料庫Global Coordinator

local.oracle.com

commit_point_strength=5

遠端資料庫commit point site:

remote.oracle.com

commit_point_strength=10

分佈事物:

/* DML remote -> remote.oracle.com */

insert into s_dept@remote values ();

/* DML local -> v817rep.be.oracle.com */

insert into emp values ();

commit;

1. 準備階段沒有完成時出現異常

ALERT FILE LOCAL.ORACLE.COM:

ORA-02050: transaction 3.4.270 rolled back, some REMOTE DBs may be in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-2 in commit comment

DISTRIB TRAN LOCAL.ORACLE.COM.89f6eafb.3.4.270

is LOCAL tran 3.4.270 (hex=03.04.10e)

insert pending collecting tran, scn=196918 (hex=0.00030136)

ALERT FILE REMOTE.ORACLE.COM:

No entries

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

3.4.270 |LOCAL.ORACLE.COM.89f6eafb |collecti|no |BE-ORACLE-|196918

|.3.4.270 |ng | |NTbel449 |

DBA_2PC_NEIGHBORS@LOCAL.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

3.4.270 |in | |SCOTT |N

3.4.270 |out |REMOTE.ORACLE.COM |SCOTT |C

DBA_2PC_PENDING@REMOTE.oracle.com:

no rows selected

DBA_2PC_NEIGHBORS@REMOTE.oracle.com:

no rows selected

由上邊資訊我們可以得知:

本地資料庫Global Coordinator

處於COLLECTING階段,等待所有資料庫返回已準備好的通知

遠端資料庫commit point site:什麼也沒發生(系統自動回退)

解決方法:

清除本地資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.4.270');

你使用了9iaum(auto undo management)

要先遮蔽掉對undo操作的錯誤提示:
sql>alter system set UNDO_SUPPRESS_ERRORS = TRUE
sql>EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('1.29.81672')
sql>alter system set UNDO_SUPPRESS_ERRORS = false

看看dba_2pc_pending中是否還有該記錄

2. 準備階段完成時出現異常

ALERT FILE LOCAL.ORACLE.COM:

ORA-02054: transaction 1.8.238 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-1 in commit comment

ORA-02063: preceding line from V817

DISTRIB TRAN LOCAL.ORACLE.COM.89f6eafb.1.8.238

is LOCAL tran 1.8.238 (hex=01.08.ee)

insert pending prepared tran, scn=194671 (hex=0.0002f86f)

ALERT FILE REMOTE.ORACLE.COM:

ORA-02059: ORA-2PC-CRASH-TEST-1 in commit comment

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

1.8.238 |LOCAL.ORACLE.COM.89f6eafb |prepared|no |BE-ORACLE-|194671

|.1.8.238 | | |NTbel449 |

DBA_2PC_NEIGHBORS@LOCAL.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

1.8.238 |in | |SCOTT |N

1.8.238 |out |REMOTE.ORACLE.COM |SCOTT |C

DBA_2PC_PENDING@REMOTE.oracle.com:

no rows selected

DBA_2PC_NEIGHBORS@REMOTE.oracle.com:

no rows selected

由上邊資訊我們可以得知:

本地資料庫Global Coordinator:處於PREPARE階段,佔有分佈鎖,等待提交

遠端資料庫commit point site:什麼也沒發生(系統自動回退)

解決方法:

rollback force 'LOCAL.ORACLE.COM.89f6eafb.1.8.238';

-OR-

rollback force '1.8.238';

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

1.8.238 |LOCAL.ORACLE.COM.89f6eafb |forced r|no |BE-ORACLE-|194671

|.1.8.238 |ollback | |NTbel449 |

清除本地資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.4.270');

3. 提交階段出現異常

·COMMIT POINT SITE 完成提交

ALERT FILE LOCAL.ORACLE.COM:

ORA-02054: transaction 3.38.281 in-doubt

ORA-02053: transaction 2.39.179 committed, some REMOTE DBs may be in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-6 in commit comment

ORA-02063: preceding 2 lines from V817

ALERT FILE REMOTE.ORACLE.COM:

ORA-02053: transaction 2.39.179 committed, some REMOTE DBs may be in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-6 in commit comment

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

3.38.281 |LOCAL.ORACLE.COM.89f6eafb |prepared|no |BE-ORACLE-|201050

|.3.38.281 | | |NTbel449 |

DBA_2PC_NEIGHBORS@LOCAL.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

3.38.281 |in | |SCOTT |N

3.38.281 |out |REMOTE.ORACLE.COM |SCOTT |C

DBA_2PC_PENDING@REMOTE.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

2.39.179 |LOCAL.ORACLE.COM.89f6eafb |committe|no |BE-ORACLE-|201052

|.3.38.281 |d | |NTbel449 |

DBA_2PC_NEIGHBORS@REMOTE.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

2.39.179 |in |LOCAL.ORACLE.COM |SCOTT |C

由上邊資訊我們可以得知:

本地資料庫Global Coordinator:處於PREPARE階段,佔有分佈鎖,等待提交

遠端資料庫commit point site: 已經完成提交

解決方法:

commit force 'LOCAL.ORACLE.COM.89f6eafb.3.38.281','201052';

-OR-

commit force '3.38.281','201052';

注意使用較高的SCN號,保證資料的全域性一致性

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

3.38.281 |LOCAL.ORACLE.COM.89f6eafb |forced c|no |BE-ORACLE-|201052

|.3.38.281 |ommit | |NTbel449 |

清除本地資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.38.281');

清除REMOTE資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('2.39.179');

·所有資料庫完成提交,但沒有進入登出階段

ALERT FILE LOCAL.ORACLE.COM:

ORA-02053: transaction 3.16.283 committed, some REMOTE DBs may be in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-8 in commit comment

ALERT FILE REMOTE.ORACLE.COM:

DISTRIB TRAN LOCAL.ORACLE.COM.89f6eafb.3.16.283

is LOCAL tran 2.44.179 (hex=02.2c.b3)

insert pending committed tran, scn=201607 (hex=0.00031387)

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

3.16.283 |LOCAL.ORACLE.COM.89f6eafb |committe|no |BE-ORACLE-|201607

|.3.16.283 |d | |NTbel449 |

DBA_2PC_NEIGHBORS@LOCAL.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

3.16.283 |in | |SCOTT |N

3.16.283 |out |REMOTE.ORACLE.COM |SCOTT |C

DBA_2PC_PENDING@REMOTE.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

2.44.179 |LOCAL.ORACLE.COM.89f6eafb |committe|no |BE-ORACLE-|201607

|.3.16.283 |d | |NTbel449 |

DBA_2PC_NEIGHBORS@REMOTE.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

2.44.179 |in |LOCAL.ORACLE.COM |SCOTT |C

由上邊資訊我們可以得知:

本地資料庫Global Coordinator:完成提交

遠端資料庫commit point site: 完成提交

清除本地資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('3.16.283');

清除REMOTE資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('2.44.179');

4. 登出階段出現異常

ALERT FILE LOCAL.ORACLE.COM:

ORA-02053: transaction 1.10.255 committed, some REMOTE DBs may be in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-10 in commit comment

ALERT FILE REMOTE.ORACLE.COM:

No entries

DBA_2PC_PENDING@LOCAL.oracle.com:

LOCAL_TRAN_ID|GLOBAL_TRAN_ID |STATE |MIX|HOST |COMMIT#

-------------|------------------------------|--------|---|----------|----------

1.10.255 |LOCAL.ORACLE.COM.89f6eafb |committe|no |BE-ORACLE-|202241

|.1.10.255 |d | |NTbel449 |

DBA_2PC_NEIGHBORS@LOCAL.oracle.com:

LOCAL_TRAN_ID|IN_OUT|DATABASE |DBUSER_OWNER |INT

-------------|------|-------------------------|---------------|---

1.10.255 |in | |SCOTT |N

1.10.255 |out |REMOTE.ORACLE.COM |SCOTT |C

DBA_2PC_PENDING@REMOTE.oracle.com:

no rows selected

DBA_2PC_NEIGHBORS@REMOTE.oracle.com:

no rows selected

由上邊資訊我們可以得知:

本地資料庫Global Coordinator:完成提交

遠端資料庫commit point site: 完成FORGET

解決方法:

清除本地資料庫中IN_DOUBT狀態分佈事物的記錄

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('1.10.255');

2pc

2 phase commit
未決兩階段提交事務
也就是具有分散式資料庫的概念所謂兩階段提交,就是同一個事務分佈在2個以上的資料庫上需要所有的資料庫上的操作成功才算成功
若其中有一個資料庫上的操作不成功則該事務屬於未完成事務
透過這個view,能把這些事務找出來,然後做出相應的處理通常資料庫是由 RECO 程式來處理這些事務的:在操作已經成功的那部分資料庫上回退操作!

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/35489/viewspace-84485/,如需轉載,請註明出處,否則將追究法律責任。

相關文章