Oracle分散式事務典型案例處理

龍山游龍發表於2022-10-27

一、分散式事務概念 介紹

資料庫分散式事務是指同一事務中 DML語句對兩個或多個資料庫進行修改,為確保事務原子性,事務內的所有操作只能一起提交或一起回滾。ORACLE透過使用DATABASE LINK實現分散式事務。

insert into dept@remote_db values (41,'SUPPORT','BRUSSELS'); -- 遠端資料庫插入

insert into emp values (1041,'MULDER',10);             -- 本地資料庫插入

commit;

1.1  基本概念

1.Global Coordinator :分佈事務的發起者,負責協調這個分佈事務。

2.Commit Point Site :在分佈事務中,首先執行 COMMIT ROLLBACK 操作的站點。一般可以把業務中的關鍵資料庫作為 Commit Point Site Commit Point Site 與其它站點不同,事務不會進入 prepared 狀態,不會存在 IN-DOUBT (懸疑)事務,也不會因為分散式事務的失敗而導致相關表被阻塞。

資料庫相關引數:

COMMIT_POINT_STRENGTH ,預設為 1 ,數值大的為 Commit Point Site 。兩值相同時一般遠端資料庫為 Commit Point Site

 

1.2  兩階段提交

Doc ID 126069.1

 

STAGES PHASE CRASH-TEST-NR's

------------ -------- ---------------

(02) -> (06) PREPARE 1, 2, 3, 4

(07) -> (13) COMMIT 5, 6, 7, 8, 9

(14) -> (16) FORGET 10

 

1.2.1  準備階段( PREPARE PHASE

(01) The global coordinator initiates and commits the distributed transaction.During execution of the SQL statements within the transaction, the definition of the session tree is completed (-> dba_2pc_neighbors).

全域性協調器發起分散式事務提交

 

(02) The commit point site is determined (commit_point_strength) and the SCNs (System Change Number) of the communicating nodes are oordinated.The highest SCN at all nodes is determined. This will be the commit SCN at the commit point site later on (-> highest global SCN -> global integrity).

確認 commit point site,取得所有節點中SCN最大的節點值,將最高的SCN號作為分佈事物的全域性SCN號,保證全域性完整性。

 

(03) The global coordinator asks participating參與節點 nodes other than the commit point site to promise to commit or roll back (-> prepare message) the transaction, even if there is a failure. If any node cannot prepare,the transaction is rolled back.

等待其它參與節點返回資訊確認是否提交或回滾事務,如果有節點不能 prepare,那事務進行回滾

 

(04) Every participating node allocates resources it needs to commit or rollback the transaction if data is changed.

It saves redo records corresponding to changes made by the transaction to its online redo log. This makes it possible to recover the database back to the prepare state in case of an instance failure.

The node guarantees that locks held for the transaction are able to survive a failure.

重新整理日誌到日誌檔案

 

(05) All participating nodes place a distributed lock on modified tables preventing reads/writes.

對分佈事物修改的表加分散式鎖,防止被

 

(06) All participating nodes respond with a prepared message to their global/local coordinator and wait until a commit or rollback request is received from the global/local coordinator.

After the nodes are prepared, the distributed transaction is said to be in-doubt.

Note that all participating nodes need to be prepared for the two phase commit to continue to the next phase (-> commit phase).

所有參與節點通知並等待協調器的提交或者回滾請求,並將事務切換成 in-doubt存疑狀態。

 

1.2.2  提交階段( COMMIT PHASE

(07) The global coordinator instructs the commit point site to commit.

全域性協調器發起提交點站點提交

(08) The commit point site commits with the highest SCN (see step 02).

提交點站點提交

(09) The commit point site informs the global coordinator of the commit.

提交點站點返回提交資訊

(10)  The global/local coordinator instructs all the participating nodes to commit.

其它參與者進行提交

(11)  Every node commits the local portion of the distributed transaction and releases locks.

提交本地事務並釋放鎖

(12)  Every node records an additional redo entry in the local redo log indicating that the transaction has commited.

寫日誌

(13) The participating nodes notify the global coordinator that they have committed.

On completion of the commit phase, the data on all nodes of the distributed system is consistent with one another.

所有參與節點返回提交完成資訊,完成提交

 

1.2.3  登出階段( FORGET PHASE

(14) After receiving notice from the global coordinator that all nodes have committed, the commit point site erases status information about this transaction.

接收到提交資訊後,提交點站點清理事務資訊

 

(15) The commit point site informs the global coordinator that it has erased the status information.

commit point site返回訊息

(16)  The global coordinator erases its own information about the transaction.

global coordinator清理本次事務的相關資訊。

 

此時分佈事物的兩階段提交全部完成。

 

分散式事務故障模擬

如果兩階段提交完成之前,資料庫或網路出現異常,分佈事物處於 IN_DOUBT狀態。相關狀態會記錄到dba_2pc_pending檢視中。一旦資料庫或網路恢復正常,RECO程式會自動處理IN_DOUBT狀態的分佈事物。少數情況需要DBA手工處理IN_DOUBT狀態的分佈事物。

2.1  Co mmit Comment

ORACLE 提供的用於模擬分散式事務各階段異常的 COMMIT COMMENT

參見: Doc ID 126069.

 

COMMIT COMMENT 'ORA-2PC-CRASH-TEST-n';

where n is one of the following integers:

n Effect

1 Crash commit point site after collect

2 Crash non-commit point site after collect

3 Crash before prepare (non-commit point site)

4 Crash after prepare (non-commit point site)

5 Crash commit point site before commit

6 Crash commit point site after commit

7 Crash non-commit point site before commit

8 Crash non-commit point site after commit

9 Crash commit point site before forget

10 Crash non-commit point site before forget

 

2.2  Testing the script   setup_rem.sql

-->>>>>>>>>> Begin of setup_rem.sql <<<<<<<<<<--

/* Execute at the remote site */

 

connect sys/change_on_install@v817

 

create user <USER> identified by tiger

default tablespace users

temporary tablespace temp;

 

grant dba to <USER>;

grant force transaction,force any transaction to <USER>;

/* To be able to crash a distributed transaction with

COMMIT COMMENT 'ORA-2PC-CRASH-n'; */

grant alter system to <USER>;

/* To be able to do ALTER SYSTEM DISABLE/ENABLE DISTRIBUTED

RECOVERY; */

grant delete on sys.pending_trans$ to <USER>;

grant delete on sys.pending_sessions$ to <USER>;

grant delete on sys.pending_sub_sessions$ to <USER>;

/* To be able to use DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY */

 

connect <USER>/tiger@v817

 

create database link v817rep.be.oracle.com connect to <USER>

identified by tiger using 'v817rep.be.oracle.com';

 

SET TERMOUT OFF

SET ECHO OFF

CREATE TABLE DEPT

(DEPTNO NUMBER(2) CONSTRAINT PK_DEPT PRIMARY KEY,

DNAME VARCHAR2(14) ,

LOC VARCHAR2(13) ) ;

CREATE TABLE EMP

(EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY,

ENAME VARCHAR2(10),

DEPTNO NUMBER(2) CONSTRAINT FK_DEPTNO REFERENCES DEPT);

INSERT INTO DEPT VALUES (10,'ACCOUNTING','CITY1');

INSERT INTO DEPT VALUES (20,'RESEARCH','CITY2');

INSERT INTO DEPT VALUES (30,'SALES','CITY3');

INSERT INTO DEPT VALUES (40,'OPERATIONS','CITY4');

INSERT INTO EMP VALUES (1000,'NAME1',20);

INSERT INTO EMP VALUES (1001,'NAME2',30);

INSERT INTO EMP VALUES (1002,'NAME3',30);

INSERT INTO EMP VALUES (1003,'NAME4',20);

COMMIT;

CREATE SYNONYM S_DEPT FOR DEPT@v817rep.be.oracle.com;

CREATE SYNONYM S_EMP FOR EMP@v817rep.be.oracle.com;

SET TERMOUT ON

SET ECHO ON

 

-->>>>>>>>>> End of setup_rem.sql <<<<<<<<<<--

 

-->>>>>>>>>> Begin of setup_loc.sql <<<<<<<<<<--

 

/* Execute at the local site */

 

connect sys/change_on_install@v817rep

 

create user <USER> identified by tiger

default tablespace users

temporary tablespace temp;

 

grant dba to <USER>;

grant force transaction,force any transaction to <USER>;

/* To be able to crash a distributed transaction with

COMMIT COMMENT 'ORA-2PC-CRASH-n'; */

grant alter system to <USER>;

/* To be able to do ALTER SYSTEM DISABLE/ENABLE DISTRIBUTED

RECOVERY; */

grant delete on sys.pending_trans$ to <USER>;

grant delete on sys.pending_sessions$ to <USER>;

grant delete on sys.pending_sub_sessions$ to <USER>;

/* To be able to use DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY */

 

connect <USER>/tiger@v817rep

 

create database link v817.be.oracle.com connect to <USER>

identified by tiger using 'v817.be.oracle.com';

 

SET TERMOUT OFF

SET ECHO OFF

CREATE TABLE DEPT

(DEPTNO NUMBER(2) CONSTRAINT PK_DEPT PRIMARY KEY,

DNAME VARCHAR2(14) ,

LOC VARCHAR2(13) ) ;

CREATE TABLE EMP

(EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY,

ENAME VARCHAR2(10),

DEPTNO NUMBER(2) CONSTRAINT FK_DEPTNO REFERENCES DEPT);

INSERT INTO DEPT VALUES (10,'ACCOUNTING','CITY1');

INSERT INTO DEPT VALUES (20,'RESEARCH','CITY2');

INSERT INTO DEPT VALUES (30,'SALES','CITY3');

INSERT INTO DEPT VALUES (40,'OPERATIONS','CITY4');

INSERT INTO EMP VALUES (1000,'NAME1',20);

INSERT INTO EMP VALUES (1001,'NAME2',30);

INSERT INTO EMP VALUES (1002,'NAME3',30);

INSERT INTO EMP VALUES (1003,'NAME4',20);

COMMIT;

CREATE SYNONYM S_DEPT FOR DEPT@v817.be.oracle.com;

CREATE SYNONYM S_EMP FOR EMP@v817.be.oracle.com;

SET TERMOUT ON

SET ECHO ON

 

-->>>>>>>>>> End of setup_loc.sql <<<<<<<<<<--

 

-->>>>>>>>>> Begin of crash_1.sql <<<<<<<<<<--

 

/* Crash Scenario 1 */

/* Crash commit point site after collect */

 

connect <USER>/tiger@v817rep

alter system disable distributed recovery;

 

/* DML remote */

/* object s_dept is a synonym for table dept@v817.be.oracle.com */

insert into s_dept values (41,'SUPPORT','BRUSSELS');

 

/* DML local */

insert into emp values (1041,'MULDER',10);

commit comment 'ORA-2PC-CRASH-TEST-1';

 

2.3  Crash commit point site after collect

commit comment 'ORA-2PC-CRASH-TEST-1';

-> after step (06) above

 

step (06) All participating nodes respond with a prepared message to their global/local coordinator and wait until a commit or rollback request is received from the global/local coordinator.

After the nodes are prepared, the distributed transaction is said to be in-doubt.

Note that all participating nodes need to be prepared for the two phase commit to continue to the next phase (-> commit phase).

 

SQL> connect local/local

Connected.

SQL> alter system disable distributed recovery;

System altered.

SQL> insert into s_dept values (42,'SUPPORT','BRUSSELS');

1 row created.

SQL> insert into emp values (1041,'MULDER',10);

1 row created.

SQL> commit comment 'ORA-2PC-CRASH-TEST-1';

commit comment 'ORA-2PC-CRASH-TEST-1'

*

ERROR at line 1:

ORA-02054: transaction 1.8.664 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-1 in commit comment

ORA-02063: preceding line from TONODE2

 

本地資料庫 alert 日誌

Error 2059 trapped in 2PC on transaction 1.8.664. Cleaning up.

Error stack returned to user:

ORA-02054: transaction 1.8.664 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-1 in commit comment

ORA-02063: preceding line from TONODE2

Mon Jan 03 13:52:54 2022

DISTRIB TRAN ORCL.6c28f3e5.1.8.664

  is local tran 1.8.664 (hex=01.08.298)

  insert pending prepared tran, scn=1002823 (hex=0.000f4d47)

 

遠端資料庫 alert 日誌

Mon Jan 03 13:52:54 2022

Error 2059 trapped in 2PC on transaction 1.11.660. Cleaning up.

Error stack returned to user:

ORA-02059: ORA-2PC-CRASH-TEST-1 in commit comment

 

遠端節點

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

no rows selected

 

本地節點

col GLOBAL_TRAN_ID for a40

select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending ;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

1.8.664                ORCL.6c28f3e5.1.8.664                    prepared         1002823

 

SQL> ROLLBACK FORCE '1.8.664';

Rollback complete.

SQL> COMMIT FORCE '1.8.664';

 

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

1.8.664                ORCL.6c28f3e5.1.8.664                    forced rollback  1002823

 

SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('1.8.664');

PL/SQL procedure successfully completed.

 

SQL> commit;

Commit complete.

 

2.4  Crash commit point site before commit

COMMIT COMMENT 'ORA-2PC-CRASH-TEST-5'

Crash commit point site before commit

-> after step (07) above

(07) The global coordinator instructs the commit point site to commit.

 

SQL> connect local/local

Connected.

SQL> alter system disable distributed recovery;

System altered.

SQL> insert into s_dept values (43,'SUPPORT','BRUSSELS');

1 row created.

SQL> insert into emp values (1043,'MULDER',10);

1 row created.

SQL> commit comment 'ORA-2PC-CRASH-TEST-5';

commit comment 'ORA-2PC-CRASH-TEST-5'

*

ERROR at line 1:

ORA-02054: transaction 5.9.879 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-5 in commit comment

ORA-02063: preceding line from TONODE2

 

本地 alert 日誌

Mon Jan 03 14:14:56 2022

Error 2059 trapped in 2PC on transaction 5.9.879. Cleaning up.

Mon Jan 03 14:14:56 2022

DISTRIB TRAN ORCL.6c28f3e5.5.9.879

   is local tran 5.9.879 (hex=05.09.36f)

   insert pending prepared tran, scn=1003725 (hex=0.000f50cd)

Error stack returned to user:

ORA-02054: transaction 5.9.879 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-5 in commit comment

ORA-02063: preceding line from TONODE2

 

遠端 alert 日誌

Mon Jan 03 14:14:56 2022

Error 2059 trapped in 2PC on transaction 1.14.661. Cleaning up.

Error stack returned to user:

ORA-02059: ORA-2PC-CRASH-TEST-5 in commit comment

 

遠端節點

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

no rows selected

 

本地節點

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

5.9.879                ORCL.6c28f3e5.5.9.879                    prepared         1003725

 

SQL> ROLLBACK FORCE '5.9.879';

Rollback complete.

 

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

 

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

5.9.879                ORCL.6c28f3e5.5.9.879                    forced rollback  1003725

 

SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('5.9.879');

PL/SQL procedure successfully completed.

 

SQL> commit;

Commit complete.

 

2.5  偽行構造

SQL > connect local/local

Connected.

SQL> alter system disable distributed recovery;

System altered.

SQL> insert into s_dept values (46,'SUPPORT','BRUSSELS');

1 row created.

SQL> insert into emp values (1046,'MULDER',10);

1 row created.

SQL> commit comment 'ORA-2PC-CRASH-TEST-4';

commit comment 'ORA-2PC-CRASH-TEST-4'

*

ERROR at line 1:

ORA-02054: transaction 5.26.883 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-4 in commit comment

 

- - 本地 alert 日誌

Mon Jan 03 18:51:33 2022

Error 2059 trapped in 2PC on transaction 5.26.883. Cleaning up.

Mon Jan 03 18:51:33 2022

DISTRIB TRAN ORCL.6c28f3e5.5.26.883

  is local tran 5.26.883 (hex=05.1a.373)

  insert pending prepared tran, scn=1045010 (hex=0.000ff212)

Error stack returned to user:

ORA-02054: transaction 5.26.883 in-doubt

ORA-02059: ORA-2PC-CRASH-TEST-4 in commit comment

 

- - 遠端 alert 日誌,無輸出

 

-- 遠端節點

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

no rows selected

 

本地節點

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

5.26.883               ORCL.6c28f3e5.5.26.883                   prepared         1045010

 

SQL> ROLLBACK FORCE '5.26.883';

模擬 hang

 

SQL> select * from emp;

ERROR:

ORA-01591: lock held by in-doubt distributed transaction 5.26.883

 

刪除基表資訊

SQL> set transaction use rollback segment SYSTEM;

SQL> delete from sys.pending_trans$ where local_tran_id = '5.26.883';

SQL> delete from sys.pending_sessions$ where local_tran_id = '5.26.883';

SQL> delete from sys.pending_sub_sessions$ where local_tran_id = '5.26.883';

SQL> commit;

 

SQL> select * from emp;

ERROR:

ORA-01591: lock held by in-doubt distributed transaction 5.26.883

 

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup

ORACLE instance started.

 

Total System Global Area  400846848 bytes

Fixed Size                  2253664 bytes

Variable Size             176164000 bytes

Database Buffers          218103808 bytes

Redo Buffers                4325376 bytes

Database mounted.

Database opened.

SQL> conn local/local

Connected.

 

重啟資料庫後事務未釋放

SQL> select * from emp;

ERROR:

ORA-01591: lock held by in-doubt distributed transaction 5.26.883

 

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

no rows selected

 

SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */ KTUXESTA Status,KTUXECFL Flags FROM x$ktuxe WHERE ktuxesta!='INACTIVE' AND ktuxeusn= 5;

  KTUXEUSN   KTUXESLT   KTUXESQN STATUS           FLAGS

---------- ---------- ---------- ---------------- ------------------------

         5         26        883 PREPARED         SCO|COL|REV|DEAD

 

SQL> commit force '5.26.883';

commit force '5.26.883'

*

ERROR at line 1:

ORA-02058: no prepared transaction found with ID 5.26.883

 

SQL> rollback force '5.26.883';

rollback force '5.26.883'

*

ERROR at line 1:

ORA-02058: no prepared transaction found with ID 5.26.883

 

raises ORA-02058 since dba_2pc views are empty. In order to use commit force or rollback force a dummy record should be inserted into pending_trans$ as follows:

insert into pending_trans$ (

LOCAL_TRAN_ID,

GLOBAL_TRAN_FMT,

GLOBAL_ORACLE_ID,

STATE,

STATUS,

SESSION_VECTOR,

RECO_VECTOR,

TYPE#,

FAIL_TIME,

RECO_TIME)

values( '5.26.883', /* <== Replace this with your local tran id */

306206, /* */

'XXXXXXX.12345.1.2.3', /* These values can be used without any */

'prepared','P', /* modification. Most of the values are */

hextoraw( '00000001' ), /* constant. */

hextoraw( '00000000' ), /* */

0, sysdate, sysdate );

 

insert into pending_sessions$

values( '5.26.883',/* <==Replace only this with your local tran id */

1, hextoraw('05004F003A1500000104'),

'C', 0, 30258592, '',

146

);

commit;

 

在插入模擬事務資料後,可查詢到事務資訊,強制提交後,事務完成

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

LOCAL_TRAN_ID          GLOBAL_TRAN_ID                           STATE            COMMIT#

---------------------- ---------------------------------------- ---------------- ----------------

5.26.883               XXXXXXX.12345.1.2.3                      prepared         1045010

 

commit force '5.26.883';

 

SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('5.26.883');

PL/SQL procedure successfully completed.

 

SQL> commit;

Commit complete.

 

SQL> select LOCAL_TRAN_ID,GLOBAL_TRAN_ID,STATE,COMMIT# from dba_2pc_pending;

no rows selected

 

分散式事務案例總結

分散式事務的手工處理主要按 dba_2pc_pending 的結果來靈活對應。

dba_2pc_pending 中的資料時,主要看 state欄位,

1) 如果 state狀態是prepared,表示事務未提交或回滾,此時需要手工進行強制回滾或提交。

ROLLBACK FORCE 'transaction_id';

-OR-

COMMIT FORCE 'transaction_id','commit#';

 

2)如果state狀態是committed, rollback forced或者commit forced狀態,表示事務已經完成了,但是在FORGET階段處理時,資料庫字典的資訊沒能及時清除。此時,我們呼叫oracle的清理丟失事務資訊的語句就可以完成處理:

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('5.26.883');

Commit;

 

3 dba_2pc_pending 中無資料時

有可能基表 pending_trans$ pending_sessions$ 被執行過刪除操作,此時需要向表中重新插回偽造的資料後再進行強制回滾或提交。


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30310891/viewspace-2920750/,如需轉載,請註明出處,否則將追究法律責任。

相關文章