一天學會PostgreSQL應用開發與管理-6事務和鎖

德哥發表於2017-04-12

本章大綱

一、什麼是事務

二、單使用者情況下的事務

三、多使用者情況下的事務

1 PostgreSQL如何處理事務隔離?

2 多使用者情況下的事務併發處理

四、鎖

五、資料恢復粒度

六、資料年齡

七、事務凍結

八、垃圾回收

九、flash back query

第四章:事務和鎖

https://www.postgresql.org/docs/9.6/static/mvcc.html

1. 什麼是事務

ACID介紹

1. 原子性:同一個事務中的所有操作,要麼全部成功,要麼全部失敗。即使資料庫恢復,也不能出現同一個事務中對資料庫的操作一部分成功一部分失敗的情況。

2. 一致性:任何時刻,資料庫對使用者的視角始終是按事務提交的順序一致的,即使資料庫恢復,也不能出現後提交的事務存在,而先提交的事務不存在的情況。

以轉賬案例為例,假設有五個賬戶,每個賬戶餘額是100元,那麼五個賬戶總額是500元,如果在這個5個賬戶之間同時發生多個轉賬,無論併發多少個,比如在A與B賬戶之間轉賬5元,在C與D賬戶之間轉賬10元,在B與E之間轉賬15元,五個賬戶總額也應該還是500元。

以插入資料為例,a插入1並提交,隨後b插入2並提交。當資料庫發生恢復時,要麼a存在,要麼a,b同時存在,絕對不可能出現b存在但是a不存在的情況。

3. 持久化:事務成功提交,表示事務對資料庫的修改已經持久化到永久儲存,即使斷電也不會丟失。除非永久儲存的介質損壞。

4. 隔離性:有4種隔離級別,讀未提交,讀已提交,可重複度,序列化。

postgres=# h begin  
Command:     BEGIN  
Description: start a transaction block  
Syntax:  
BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ]  
  
where transaction_mode is one of:  
  
    ISOLATION LEVEL { SERIALIZABLE | REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED }  
    READ WRITE | READ ONLY  
    [ NOT ] DEFERRABLE  

讀未提交,表示可以讀到其他會話未提交的資料。

PostgreSQL不支援讀未提交(髒讀)。

讀已提交,表示可以讀到其他會話已提交的資料。

postgres=# create table iso_test(id int, info text);  
CREATE TABLE  
postgres=# insert into iso_test values (1,`test`);  
INSERT 0 1  
  
會話 1  
postgres=# begin;  
BEGIN  
postgres=# select * from iso_test;  
 id | info   
----+------  
  1 | test  
(1 row)  
  
會話 2  
postgres=# begin;  
BEGIN  
postgres=# update iso_test set info=`new`;  
UPDATE 1  
  
會話 1  
postgres=# select * from iso_test;  
 id | info   
----+------  
  1 | test  
(1 row)  
  
會話 2  
postgres=# end;  
COMMIT  
  
會話 1  
postgres=# select * from iso_test;  
 id | info   
----+------  
  1 | new  
(1 row)  

可重複讀,表示在一個事務中,執行同一條SQL,讀到的是同樣的資料(即使被讀的資料可能已經被其他會話修改並提交)。

會話 1   
postgres=# begin transaction isolation level repeatable read ;  
BEGIN  
postgres=# select * from iso_test;  
 id | info   
----+------  
  1 | new  
(1 row)  
  
會話 2  
postgres=# begin;  
BEGIN  
postgres=# update iso_test set info=`digoal`;  
UPDATE 1  
postgres=# end;  
COMMIT  
  
會話 1   
postgres=# select * from iso_test;  
 id | info   
----+------  
  1 | new  
(1 row)  

序列化,表示並行事務模擬序列執行,違反序列執行規則的事務,將回滾。

會話 1   
postgres=# begin transaction isolation level serializable ;  
BEGIN  
postgres=# select count(*) from iso_test;  
 count   
-------  
     6  
(1 row)  
  
會話 2  
postgres=# begin transaction isolation level serializable ;  
BEGIN  
postgres=# select count(*) from iso_test;  
 count   
-------  
     6  
(1 row)  
  
會話 1  
postgres=# insert into iso_test select * from iso_test;  
INSERT 0 6  
postgres=# select count(*) from iso_test;  
 count   
-------  
    12  
(1 row)  
  
會話 2   
postgres=# insert into iso_test select * from iso_test;  
INSERT 0 6  
postgres=# select count(*) from iso_test;  
 count   
-------  
    12  
(1 row)  
  
會話 1  
postgres=# end;  
COMMIT  
  
會話 2  
postgres=# end;  
ERROR:  could not serialize access due to read/write dependencies among transactions  
DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.  
HINT:  The transaction might succeed if retried.  

序列隔離級別鎖資訊檢視

select relation::regclass,* from pg_locks where pid in (序列事務會話pid);  

2. 單使用者情況下的事務

單使用者情況下的事務,沒有並行問題,屬於序列事務,僅僅體現一致性、原子性、持久化。

3. 多使用者情況下的事務

並行事務,體現事務隔離,鎖。

PostgreSQL如何處理事務隔離?

1. 多版本概念

資料插入時,在記錄的頭資訊中xmin欄位,記錄當前事務ID(xid)。

刪除資料時,在記錄的頭資訊中xmax欄位,記錄當前事務ID(xid),頭部infomask標記記錄已刪除。

更新資料時,在被刪除的記錄的頭資訊中xmax欄位,記錄當前事務ID(xid),頭部infomask標記記錄已刪除。同時插入一條新的記錄,新記錄的頭資訊中xmin欄位,記錄當前事務ID(xid)。

INSERT, xmin = current xid  
DELETE, xmax = current xid  
UPDATE, old tuple xmax = current xid, new tuple xmin = current xid  

2. 快照概念

當前資料庫中未提交的最小事務xmin,所有小於這個事務號的記錄,對使用者都可見。

當前資料庫中最小的未分配事務號xmax,所有大於這個事務號的記錄,對使用者都不可見。

在xmin,xmax區間內,仍未提交的事務號list,所有等於這些事務號的記錄,對使用者都不可見。

查詢資料時,根據事務隔離級別、以上快照資訊、行頭資訊中的xmin,xmax值、記錄對應的事務提交狀態,以及infomask標記資訊,判斷記錄是否可見。

3. HOT

為了降低索引的更新,當更新後的記錄在同一個資料塊中時,舊的記錄使用ctid引用到對應的新記錄的offset,如果更新後的記錄,索引欄位的值未編號,則不需要更新索引。

pic

pic

多使用者情況下的事務併發處理

1. 讀寫不衝突

2. 讀到什麼資料,取決於事務隔離級別。

4. 鎖

1. 鎖物件

src/include/storage/lock.h

/*  
 * LOCKTAG is the key information needed to look up a LOCK item in the  
 * lock hashtable.  A LOCKTAG value uniquely identifies a lockable object.  
 *  
 * The LockTagType enum defines the different kinds of objects we can lock.  
 * We can handle up to 256 different LockTagTypes.  
 */  
typedef enum LockTagType  
{  
        LOCKTAG_RELATION,                       /* whole relation */  
        /* ID info for a relation is DB OID + REL OID; DB OID = 0 if shared */  
        LOCKTAG_RELATION_EXTEND,        /* the right to extend a relation */  
        /* same ID info as RELATION */  
        LOCKTAG_PAGE,                           /* one page of a relation */  
        /* ID info for a page is RELATION info + BlockNumber */  
        LOCKTAG_TUPLE,                          /* one physical tuple */  
        /* ID info for a tuple is PAGE info + OffsetNumber */  
        LOCKTAG_TRANSACTION,            /* transaction (for waiting for xact done) */  
        /* ID info for a transaction is its TransactionId */  
        LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */  
        /* ID info for a virtual transaction is its VirtualTransactionId */  
        LOCKTAG_SPECULATIVE_TOKEN,      /* speculative insertion Xid and token */  
        /* ID info for a transaction is its TransactionId */  
        LOCKTAG_OBJECT,                         /* non-relation database object */  
        /* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */  
  
        /*  
         * Note: object ID has same representation as in pg_depend and  
         * pg_description, but notice that we are constraining SUBID to 16 bits.  
         * Also, we use DB OID = 0 for shared objects such as tablespaces.  
         */  
        LOCKTAG_USERLOCK,                       /* reserved for old contrib/userlock code */  
        LOCKTAG_ADVISORY                        /* advisory user locks */  
} LockTagType;  

2. 表級鎖

src/include/storage/lockdefs.h

#define AccessShareLock                 1    /* SELECT */  
#define RowShareLock                    2    /* SELECT FOR UPDATE/FOR SHARE */  
#define RowExclusiveLock                3    /* INSERT, UPDATE, DELETE */  
#define ShareUpdateExclusiveLock        4    /* VACUUM (non-FULL), ANALYZE, CREATE INDEX CONCURRENTLY */  
#define ShareLock                       5    /* CREATE INDEX (WITHOUT CONCURRENTLY) */  
#define ShareRowExclusiveLock           6    /* CREATE TRIGGER and many forms of ALTER TABLE, like EXCLUSIVE MODE, but allows ROW SHARE */  
#define ExclusiveLock                   7    /* REFRESH MATERIALIZED VIEW CONCURRENTLY, blocks ROW SHARE/SELECT...FOR UPDATE */  
#define AccessExclusiveLock             8    /* many ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, CLUSTER,   
                                              * VACUUM FULL, and REFRESH MATERIALIZED VIEW (without CONCURRENTLY)  
                                              * and unqualified LOCK TABLE */  

alter table 不同的操作,請求的鎖也不一樣,詳見

https://www.postgresql.org/docs/9.6/static/sql-altertable.html

表級鎖衝突列表

Requested Lock Mode ACCESS SHARE ROW SHARE ROW EXCLUSIVE SHARE UPDATE EXCLUSIVE SHARE SHARE ROW EXCLUSIVE EXCLUSIVE ACCESS EXCLUSIVE
ACCESS SHARE X
ROW SHARE X X
ROW EXCLUSIVE X X X X
SHARE UPDATE EXCLUSIVE X X X X X
SHARE X X X X X
SHARE ROW EXCLUSIVE X X X X X X
EXCLUSIVE X X X X X X X
ACCESS EXCLUSIVE X X X X X X X X

3. 行級鎖

src/include/access/heapam.h

/*  
 * Possible lock modes for a tuple.  
 */  
typedef enum LockTupleMode  
{  
        /* SELECT FOR KEY SHARE */  
        LockTupleKeyShare,  
        /* SELECT FOR SHARE */  
        LockTupleShare,  
        /* SELECT FOR NO KEY UPDATE, and UPDATEs that don`t modify key columns */  
        LockTupleNoKeyExclusive,  
        /* SELECT FOR UPDATE, UPDATEs that modify key columns, and DELETE */  
        LockTupleExclusive  
} LockTupleMode;  

行級鎖衝突列表

Requested Lock Mode FOR KEY SHARE FOR SHARE FOR NO KEY UPDATE FOR UPDATE
FOR KEY SHARE X
FOR SHARE X X
FOR NO KEY UPDATE X X X
FOR UPDATE X X X X

例子

測試資料 :

digoal=# CREATE TABLE A (    
digoal(#         AID integer not null,    
digoal(#         Col1 integer,    
digoal(#         PRIMARY KEY (AID)    
digoal(#   );    
CREATE TABLE    
digoal=#     
digoal=#   CREATE TABLE B (    
digoal(#         BID integer not null,    
digoal(#         AID integer not null,    
digoal(#         Col2 integer,    
digoal(#         PRIMARY KEY (BID),    
digoal(#         FOREIGN KEY (AID) REFERENCES A(AID)    
digoal(#   );    
CREATE TABLE    
digoal=# INSERT INTO A (AID) VALUES (1),(2);    
INSERT 0 2    
digoal=#   INSERT INTO B (BID,AID) VALUES (2,1);    
INSERT 0 1    
digoal=# create extension pgrowlocks;  -- 觀察行鎖的外掛  
CREATE EXTENSION    

場景1 :

SESSION A :

digoal=# begin;    
BEGIN    
digoal=# insert into b (aid,bid) values (1,1);    
INSERT 0 1    

SESSION B :

digoal=# select * from pgrowlocks(`a`);    
 locked_row | locker | multi |  xids  |       modes       |  pids       
------------+--------+-------+--------+-------------------+---------    
 (0,1)      |   1706 | f     | {1706} | {"For Key Share"} | {18172}    
(1 row)    
    
-- 注意鎖模式是"For Key Share", 鎖的是a表的aid=1的記錄    
    
digoal=# select * from pgrowlocks(`b`);    
 locked_row | locker | multi | xids | modes | pids     
------------+--------+-------+------+-------+------    
(0 rows)    
    
digoal=# select * from a where ctid=`(0,1)`;    
 aid | col1     
-----+------    
   1 |         
(1 row)    
    
digoal=# update a set col1=22 where aid=1;    
UPDATE 1    
-- 更新A表的非鍵值無需等待    

場景2 :

SESSION A :

digoal=# begin;    
BEGIN    
digoal=# update b set aid=2 where aid<>2;    
UPDATE 2    
SESSION B :     
digoal=# select * from pgrowlocks(`a`);    
 locked_row | locker | multi |  xids  |       modes       |  pids       
------------+--------+-------+--------+-------------------+---------    
 (0,2)    |   1708 | f     | {1708} | {"For Key Share"} | {18172}    
(1 row)    
    
-- 注意鎖模式是"For Key Share" , 鎖的是a表的aid=2的記錄     
    
digoal=# select * from a where ctid=`(0,2)`;    
 aid | col1     
-----+------    
   2 |         
(1 row)    
    
digoal=# update a set col1=22 where aid=2;    
UPDATE 1    
    
-- 更新A表的非鍵值無需等待    

會話B更新A表的時候, 是什麼鎖?

場景3 :

SESSION A :

digoal=# begin;    
BEGIN    
digoal=# update b set aid=2 where aid<>2;    
UPDATE 2    

SESSION B :

digoal=# begin;    
BEGIN    
digoal=# update a set col1=22 where aid=1;    
UPDATE 1    
digoal=# update a set col1=22 where aid=2;    
UPDATE 1    

SESSION C :

-- 一定要在第三個會話才能看到這些鎖的狀態.    
-- 因為MVCC, 本地會話看到的是已經更新後的TUPLE, 新的tuple頭上的infomask的資訊.    
-- 要看舊版本的infomask, 必須在其他會話檢視.     
digoal=# select * from pgrowlocks(`a`);    
 locked_row | locker | multi |    xids     |             modes             |     pids          
------------+--------+-------+-------------+-------------------------------+---------------    
 (0,3)      |   1710 | f     | {1710}      | {Update}                      | {18332}    
 (0,4)      |      3 | t     | {1708,1710} | {"Key Share","No Key Update"} | {18172,18332}    
(2 rows)    
    
-- 注意會話B獲取了1個No Key Update鎖, 就是digoal=# update a set col1=22 where aid=2;這條產生的.    
因為更新的不是鍵值相關的.    
    
digoal=# select * from pgrowlocks(`b`);    
 locked_row | locker | multi |  xids  |  modes   |  pids       
------------+--------+-------+--------+----------+---------    
 (0,1)      |   1708 | f     | {1708} | {Update} | {18172}    
 (0,2)      |   1708 | f     | {1708} | {Update} | {18172}    
(2 rows)    

4. 使用者鎖

使用者輕量鎖,鎖ID值,支援事務級或者會話級持鎖粒度,支援touch鎖操作(touch不到鎖,則返回false)。

常用於長時間持鎖的場景,或者用於秒殺場景。

SELECT pg_advisory_lock(id) FROM foo WHERE id = 12345; -- ok  
SELECT pg_advisory_lock(id) FROM foo WHERE id > 12345 LIMIT 100; -- danger!  
SELECT pg_advisory_lock(q.id) FROM  
(  
  SELECT id FROM foo WHERE id > 12345 LIMIT 100  
) q; -- ok  

秒殺例子

create table test(id int primary key, crt_time timestamp);  
insert into test values (1);  
vi test.sql  
update test set crt_time=now() where id=1 and pg_try_advisory_xact_lock(1);  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 100 s  
number of transactions actually processed: 39104368  
latency average = 0.163 ms  
latency stddev = 0.216 ms  
tps = 391012.743072 (including connections establishing)  
tps = 391175.983419 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.163  update test set crt_time=now() where id=1 and pg_try_advisory_xact_lock(1);  

5. 死鎖

當發生會話之間相互等待時,出現死鎖

例子

會話 1

postgres=# create table d_lock(id int, info text);  
CREATE TABLE  
postgres=# insert into d_lock values (1,`test`);  
INSERT 0 1  
postgres=# insert into d_lock values (2,`test`);  
INSERT 0 1  
  
postgres=# begin;  
BEGIN  
postgres=# update d_lock set info=`a` where id=1;  
UPDATE 1  

會話 2

postgres=# begin;  
BEGIN  
postgres=# update d_lock set info=`b` where id=2;  
UPDATE 1  
postgres=# update d_lock set info=`b` where id=1;  
等待  

會話 1

postgres=# update d_lock set info=`a` where id=2;  -- 等待,檢測到死鎖,自動回滾  
ERROR:  deadlock detected  
DETAIL:  Process 13602 waits for ShareLock on transaction 96548629; blocked by process 18060.  
Process 18060 waits for ShareLock on transaction 96548628; blocked by process 13602.  
HINT:  See server log for query details.  
CONTEXT:  while updating tuple (0,2) in relation "d_lock"  

會話 2

會話 1 自動釋放鎖後,會話2更新成功  
UPDATE 1  

死鎖檢測間隔配置

postgres=# show deadlock_timeout ;  
 deadlock_timeout   
------------------  
 1s  
(1 row)  

6. 檢視鎖狀態

with t_wait as                       
(select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,  
a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,a,  
transactionid,b.query,b.xact_start,b.query_start,b.usename,b.datname   
  from pg_locks a,pg_stat_activity b where a.pid=b.pid and not a.granted),  
t_run as   
(select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,  
a.classid,a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,  
a,transactionid,b.query,b.xact_start,b.query_start,  
b.usename,b.datname from pg_locks a,pg_stat_activity b where   
a.pid=b.pid and a.granted)   
select r.locktype,r.mode r_mode,r.usename r_user,r.datname r_db,  
r.relation::regclass,r.pid r_pid,  
r.page r_page,r.tuple r_tuple,r.xact_start r_xact_start,  
r.query_start r_query_start,  
now()-r.query_start r_locktime,r.query r_query,w.mode w_mode,  
w.pid w_pid,w.page w_page,  
w.tuple w_tuple,w.xact_start w_xact_start,w.query_start w_query_start,  
now()-w.query_start w_locktime,w.query w_query    
from t_wait w,t_run r where  
  r.locktype is not distinct from w.locktype and  
  r.database is not distinct from w.database and  
  r.relation is not distinct from w.relation and  
  r.page is not distinct from w.page and  
  r.tuple is not distinct from w.tuple and  
  r.classid is not distinct from w.classid and  
  r.objid is not distinct from w.objid and  
  r.objsubid is not distinct from w.objsubid and  
  r.transactionid is not distinct from w.transactionid and  
  r.pid <> w.pid  
  order by   
  ((  case w.mode  
    when `INVALID` then 0  
    when `AccessShareLock` then 1  
    when `RowShareLock` then 2  
    when `RowExclusiveLock` then 3  
    when `ShareUpdateExclusiveLock` then 4  
    when `ShareLock` then 5  
    when `ShareRowExclusiveLock` then 6  
    when `ExclusiveLock` then 7  
    when `AccessExclusiveLock` then 8  
    else 0  
  end  ) +   
  (  case r.mode  
    when `INVALID` then 0  
    when `AccessShareLock` then 1  
    when `RowShareLock` then 2  
    when `RowExclusiveLock` then 3  
    when `ShareUpdateExclusiveLock` then 4  
    when `ShareLock` then 5  
    when `ShareRowExclusiveLock` then 6  
    when `ExclusiveLock` then 7  
    when `AccessExclusiveLock` then 8  
    else 0  
  end  )) desc,r.xact_start;  

5. 資料恢復粒度

資料庫的PITR恢復,支援事務粒度的恢復。

PITR恢復可以選擇截止在:

1. 事務結束時(COMMIT/ABORT);

2. 或者是使用者使用pg_create_restore_point()建立的還原點位置;

recovery.conf

#recovery_target_name = ``      # e.g. `daily backup 2011-01-26`  
#  
#recovery_target_time = ``      # e.g. `2004-07-14 22:39:00 EST`  
#  
#recovery_target_xid = ``  
#  
#recovery_target_inclusive = true , 當使用recovery_target_time時,false表示恢復到第一個時間點的第一個結束點,true表示恢復到時間點的最後一個結束點。  

6. 資料年齡

PostgreSQL目前使用的是32位的xid,所以需要輪詢使用。

表的年齡,取決於這張表的最老的記錄與當前事務直接經歷了多少個事務。

計算表的年齡

postgres=# select age(relfrozenxid) , relname from pg_class;  
    age     |                    relname                      
------------+-----------------------------------------------  
    1690473 | pg_type  
    1690304 | pg_toast_187550  
 2147483647 | new_type  
 2147483647 | pg_toast_187550_index  
    1032229 | pg_toast_187556  

7. 事務凍結

由於xid需要輪詢使用,為了防止出現”未來”的事務,一條記錄的事務號,必須在20億(約)個事務內設定為凍結事務號。

凍結的事務號=2,可以被所有事務可見。

例子

postgres=# set vacuum_freeze_min_age =0;  
SET  
postgres=# vacuum freeze test;  
VACUUM  

強制凍結配置

autovacuum_freeze_max_age = 150000000 # 年齡超過這個值,會強制執行凍結  

8. 垃圾回收

PostgreSQL 使用多版本的技術支援ACID,所以更新,刪除後,會產生垃圾。

autovacuum被用於自動的垃圾回收,當表中的垃圾版本超過一定的比例(可配置)後,會自動的觸發垃圾回收。

自動垃圾回收的引數設定

autovacuum = on  # 開啟自動垃圾回收  
autovacuum_naptime = 1min  # 垃圾回收程式喚醒間隔  
autovacuum_vacuum_scale_factor = 0.05  # 超過垃圾記錄佔比時,觸發垃圾回收  

手動垃圾回收

vacuum 表名;  

9. flash back query

閃回查詢

create table public."TBL" (    
  c1 int,    
  c2 int,    
  "C3" text,    
  c4 text,    
  c5 text,    
  c6 text,    
  c7 int,    
  crt_time timestamp,    
  primary key (c1,"C3",c6,c4)    
);    

建立記錄表, 跟蹤表的DML和truncate. 可以增加一列txid_snapshot型別儲存txid_current_snapshot(), 這樣就能回退到一個一致的點了.

CREATE TABLE public.undo_t (    
  id serial8 primary key,    
  xid int8,    
  relid oid,    
  table_schema text,    
  table_name text,    
  when_tg text,    
  level text,    
  op text,    
  encoding name,    
  old_rec public."TBL",    
  new_rec public."TBL",    
  crt_time timestamp without time zone DEFAULT now(),    
  username text,    
  client_addr inet,    
  client_port int    
);    

建立觸發器函式, 將DML, TRUNCATE的資料插入跟蹤表

CREATE OR REPLACE FUNCTION public.undo_t_trace()    
RETURNS trigger    
LANGUAGE plpgsql    
AS $BODY$    
DECLARE    
  v_username text := session_user;    
  v_client_addr inet := inet_client_addr();    
  v_client_port int := inet_client_port();    
  v_xid bigint := txid_current();  -- 記錄事務號, 回退時以事務號為界限.    
  v_encoding name := pg_client_encoding();    
BEGIN    
  case TG_OP    
  when `DELETE` then     
    insert into public.undo_t (xid, relid, table_schema, table_name, when_tg, level, op, encoding, old_rec, username, client_addr, client_port)    
      values (v_xid, tg_relid, tg_table_schema, tg_table_name, tg_when, tg_level, tg_op, v_encoding, OLD, v_username, v_client_addr, v_client_port);    
  when `INSERT` then     
    insert into public.undo_t (xid, relid, table_schema, table_name, when_tg, level, op, encoding, new_rec, username, client_addr, client_port)    
      values (v_xid, tg_relid, tg_table_schema, tg_table_name, tg_when, tg_level, tg_op, v_encoding, NEW, v_username, v_client_addr, v_client_port);    
  when `UPDATE` then     
    insert into public.undo_t (xid, relid, table_schema, table_name, when_tg, level, op, encoding, old_rec, new_rec, username, client_addr, client_port)    
      values (v_xid, tg_relid, tg_table_schema, tg_table_name, tg_when, tg_level, tg_op, v_encoding, OLD, NEW, v_username, v_client_addr, v_client_port);    
  when `TRUNCATE` then     
    insert into public.undo_t (xid, relid, table_schema, table_name, when_tg, level, op, encoding, old_rec, username, client_addr, client_port)    
      select v_xid, tg_relid, tg_table_schema, tg_table_name, tg_when, tg_level, tg_op, v_encoding, t, v_username, v_client_addr, v_client_port from public."TBL" AS t;    
  else    
    return null;    
  end case;    
  RETURN null;    
END;    
$BODY$ strict volatile;    

新增觸發器, 記錄表的dml和truncate.

CREATE TRIGGER tg1 AFTER DELETE or INSERT or UPDATE ON public."TBL" FOR EACH ROW EXECUTE PROCEDURE public.undo_t_trace();    
CREATE TRIGGER tg2 BEFORE TRUNCATE ON public."TBL" FOR EACH STATEMENT EXECUTE PROCEDURE public.undo_t_trace();    

插入測試資料, 為了增加難度, 我們使用了轉義字元. 確保前後資料一致.

insert into "TBL" values (1,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    
insert into "TBL" values (2,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    
insert into "TBL" values (3,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    
insert into "TBL" values (4,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    
insert into "TBL" values (5,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    
insert into "TBL" values (6,1,`te\s	`,`c4`,`c5`,`c6`,1,now());    

插入後, 可以看到 INSERT被跟蹤了, 並且我們儲存了插入資料時的客戶端編碼. 方便解決編碼問題.

postgres=# select * from undo_t;    
 id |   xid   |  relid   | table_schema | table_name | when_tg | level |   op   | encoding | old_rec |                          new_    
rec                           |          crt_time          | username | client_addr | client_port     
----+---------+----------+--------------+------------+---------+-------+--------+----------+---------+------------------------------    
------------------------------+----------------------------+----------+-------------+-------------    
  1 | 1301665 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (1,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:09.790227") | 2014-08-28 23:06:09.790227 | postgres |             |                
  2 | 1301666 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (2,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:09.79597")  | 2014-08-28 23:06:09.79597  | postgres |             |                
  3 | 1301667 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (3,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:09.80206")  | 2014-08-28 23:06:09.80206  | postgres |             |                
  4 | 1301668 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (4,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:09.80903")  | 2014-08-28 23:06:09.80903  | postgres |             |                
  5 | 1301669 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (5,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:09.819092") | 2014-08-28 23:06:09.819092 | postgres |             |                
  6 | 1301670 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |         | (6,1,"te\\s\t",c4,c5,c6,1,    
"2014-08-28 23:06:10.228624") | 2014-08-28 23:06:10.228624 | postgres |             |                
(6 rows)    
    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time              
----+----+---------+----+----+----+----+----------------------------    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227    
  2 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.79597    
  3 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80206    
  4 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80903    
  5 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.819092    
  6 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:10.228624    
(6 rows)    

回退操作我們這裡用一個inline plpgsql 程式碼來處理, 如果你要寫成函式也可以, 只需要傳入一個XID即可.

回退最後一個事務, 即c1=6的那條記錄. 以事務號1301670為界限.

注意變數使用標量, 因為在for 和 cursor fetch到一個變數時, 變數必須是標量.

參考程式碼

src/pl/plpgsql/src/pl_gram.y

do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();     
  v_encoding_tmp text;    
  v_old text;  -- 本來這裡打算用public."TBL"來作為變數型別, 不過for, cursor都不允許儲存非標量型別, 所以還是選擇了標量text, 使用時轉換.    
  v_new text;    
  v_xid int8 := 1301670;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;    
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    

回退成功

postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time              
----+----+---------+----+----+----+----+----------------------------    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227    
  2 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.79597    
  3 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80206    
  4 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80903    
  5 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.819092    
(5 rows)    

回退操作同樣會產生undo記錄.

postgres=# select * from undo_t;    
 id |   xid   |  relid   | table_schema | table_name | when_tg | level |   op   | encoding |                          old_rec           
                    |                          new_rec                           |          crt_time          | username | client_ad    
dr | client_port     
----+---------+----------+--------------+------------+---------+-------+--------+----------+----------------------------------------    
--------------------+------------------------------------------------------------+----------------------------+----------+----------    
---+-------------    
  1 | 1301665 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.790227") | 2014-08-28 23:06:09.790227 | postgres |              
   |                
  2 | 1301666 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.79597")  | 2014-08-28 23:06:09.79597  | postgres |              
   |                
  3 | 1301667 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80206")  | 2014-08-28 23:06:09.80206  | postgres |              
   |                
  4 | 1301668 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80903")  | 2014-08-28 23:06:09.80903  | postgres |              
   |                
  5 | 1301669 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.819092") | 2014-08-28 23:06:09.819092 | postgres |              
   |                
  6 | 1301670 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:10.228624") | 2014-08-28 23:06:10.228624 | postgres |              
   |                
  7 | 1301671 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:10.228624") |                                                            | 2014-08-28 23:07:07.750644 | postgres |              
   |                
(7 rows)    

現在執行一個UPDATE, 把所有的記錄更新掉.

postgres=# update "TBL" set c7=100;    
UPDATE 5    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7  |          crt_time              
----+----+---------+----+----+----+-----+----------------------------    
  1 |  1 | te\s	 | c4 | c5 | c6 | 100 | 2014-08-28 23:06:09.790227    
  2 |  1 | te\s	 | c4 | c5 | c6 | 100 | 2014-08-28 23:06:09.79597    
  3 |  1 | te\s	 | c4 | c5 | c6 | 100 | 2014-08-28 23:06:09.80206    
  4 |  1 | te\s	 | c4 | c5 | c6 | 100 | 2014-08-28 23:06:09.80903    
  5 |  1 | te\s	 | c4 | c5 | c6 | 100 | 2014-08-28 23:06:09.819092    
(5 rows)    
    
postgres=# select * from undo_t;    
 id |   xid   |  relid   | table_schema | table_name | when_tg | level |   op   | encoding |                          old_rec           
                    |                           new_rec                            |          crt_time          | username | client_    
addr | client_port     
----+---------+----------+--------------+------------+---------+-------+--------+----------+----------------------------------------    
--------------------+--------------------------------------------------------------+----------------------------+----------+--------    
-----+-------------    
  1 | 1301665 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.790227")   | 2014-08-28 23:06:09.790227 | postgres |            
     |                
  2 | 1301666 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.79597")    | 2014-08-28 23:06:09.79597  | postgres |            
     |                
  3 | 1301667 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80206")    | 2014-08-28 23:06:09.80206  | postgres |            
     |                
  4 | 1301668 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80903")    | 2014-08-28 23:06:09.80903  | postgres |            
     |                
  5 | 1301669 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.819092")   | 2014-08-28 23:06:09.819092 | postgres |            
     |                
  6 | 1301670 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                    | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:10.228624")   | 2014-08-28 23:06:10.228624 | postgres |            
     |                
  7 | 1301671 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:10.228624") |                                                              | 2014-08-28 23:07:07.750644 | postgres |            
     |                
  8 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.790227") | (1,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.790227") | 2014-08-28 23:08:52.887568 | postgres |            
     |                
  9 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.79597")  | (2,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.79597")  | 2014-08-28 23:08:52.887568 | postgres |            
     |                
 10 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80206")  | (3,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.80206")  | 2014-08-28 23:08:52.887568 | postgres |            
     |                
 11 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80903")  | (4,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.80903")  | 2014-08-28 23:08:52.887568 | postgres |        
     |                
 12 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.819092") | (5,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.819092") | 2014-08-28 23:08:52.887568 | postgres |            
     |                
(12 rows)    

回退到更新前, 即1301672 這個XID需要回退掉.

do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();    
  v_encoding_tmp text;    
  v_old text;    
  v_new text;    
  v_xid int8 := 1301672;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;     
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    
    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time              
----+----+---------+----+----+----+----+----------------------------    
  5 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.819092    
  4 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80903    
  3 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80206    
  2 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.79597    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227    
(5 rows)    

現在把所有記錄刪除掉

postgres=# delete from "TBL";    
DELETE 5    
postgres=# select * from undo_t;    
 id |   xid   |  relid   | table_schema | table_name | when_tg | level |   op   | encoding |                           old_rec          
                      |                           new_rec                            |          crt_time          | username | clien    
t_addr | client_port     
----+---------+----------+--------------+------------+---------+-------+--------+----------+----------------------------------------    
----------------------+--------------------------------------------------------------+----------------------------+----------+------    
-------+-------------    
  1 | 1301665 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.790227")   | 2014-08-28 23:06:09.790227 | postgres |          
       |                
  2 | 1301666 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.79597")    | 2014-08-28 23:06:09.79597  | postgres |          
       |                
  3 | 1301667 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80206")    | 2014-08-28 23:06:09.80206  | postgres |          
       |                
  4 | 1301668 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80903")    | 2014-08-28 23:06:09.80903  | postgres |          
       |                
  5 | 1301669 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.819092")   | 2014-08-28 23:06:09.819092 | postgres |          
       |                
  6 | 1301670 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:10.228624")   | 2014-08-28 23:06:10.228624 | postgres |          
       |                
  7 | 1301671 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (6,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:10.228624")   |                                                              | 2014-08-28 23:07:07.750644 | postgres |          
       |                
  8 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.790227")   | (1,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.790227") | 2014-08-28 23:08:52.887568 | postgres |          
       |                
  9 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.79597")    | (2,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.79597")  | 2014-08-28 23:08:52.887568 | postgres |          
       |                
 10 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80206")    | (3,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.80206")  | 2014-08-28 23:08:52.887568 | postgres |          
       |                
 11 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80903")    | (4,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.80903")  | 2014-08-28 23:08:52.887568 | postgres |          
       |                
 12 | 1301672 | 50534894 | public       | TBL        | AFTER   | ROW   | UPDATE | UTF8     | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.819092")   | (5,1,"te\\s\t",c4,c5,c6,100,"2014-08-28 23:06:09.819092") | 2014-08-28 23:08:52.887568 | postgres |          
       |                
 13 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (5,1,"te\\s\t",c4,c5,c6,100,"2014-08    
-28 23:06:09.819092") |                                                              | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 14 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.819092")   | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 15 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (4,1,"te\\s\t",c4,c5,c6,100,"2014-08    
-28 23:06:09.80903")  |                                                              | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 16 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80903")    | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 17 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (3,1,"te\\s\t",c4,c5,c6,100,"2014-08    
-28 23:06:09.80206")  |                                                              | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 18 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.80206")    | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 19 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (2,1,"te\\s\t",c4,c5,c6,100,"2014-08    
-28 23:06:09.79597")  |                                                              | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 20 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.79597")    | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 21 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (1,1,"te\\s\t",c4,c5,c6,100,"2014-08    
-28 23:06:09.790227") |                                                              | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 22 | 1301673 | 50534894 | public       | TBL        | AFTER   | ROW   | INSERT | UTF8     |                                            
                      | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-28 23:06:09.790227")   | 2014-08-28 23:09:50.590689 | postgres |          
       |                
 23 | 1301674 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (5,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.819092")   |                                                              | 2014-08-28 23:10:17.32766  | postgres |          
       |                
 24 | 1301674 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (4,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80903")    |                                                              | 2014-08-28 23:10:17.32766  | postgres |          
       |                
 25 | 1301674 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (3,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.80206")    |                                                              | 2014-08-28 23:10:17.32766  | postgres |          
       |                
 26 | 1301674 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (2,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.79597")    |                                                              | 2014-08-28 23:10:17.32766  | postgres |          
       |                
 27 | 1301674 | 50534894 | public       | TBL        | AFTER   | ROW   | DELETE | UTF8     | (1,1,"te\\s\t",c4,c5,c6,1,"2014-08-2    
8 23:06:09.790227")   |                                                              | 2014-08-28 23:10:17.32766  | postgres |          
       |                
(27 rows)    

回退到刪除前, 即1301674回退掉.

do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();    
  v_encoding_tmp text;    
  v_old text;    
  v_new text;    
  v_xid int8 := 1301674;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;     
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    
    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time              
----+----+---------+----+----+----+----+----------------------------    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227    
  2 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.79597    
  3 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80206    
  4 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.80903    
  5 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.819092    
(5 rows)    

現在回退到只有一條記錄的時候. 即1301666

postgres=#  do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();    
  v_encoding_tmp text;    
  v_old text;    
  v_new text;    
  v_xid int8 := 1301666;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;     
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    
DO    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time              
----+----+---------+----+----+----+----+----------------------------    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227    
(1 row)    

接下來測試一下新增欄位後的回退.

postgres=# alter table "TBL" add column c8 text;    
ALTER TABLE    
postgres=# insert into "TBL" values (2,1,`test`,`c4`,`c5`,`c6`,1,now(),`c8`);    
INSERT 0 1    
postgres=# insert into "TBL" values (3,1,`test`,`c4`,`c5`,`c6`,1,now(),`c8`);    
INSERT 0 1    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time          | c8     
----+----+---------+----+----+----+----+----------------------------+----    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227 |     
  2 |  1 | test    | c4 | c5 | c6 |  1 | 2014-08-28 23:14:00.235677 | c8    
  3 |  1 | test    | c4 | c5 | c6 |  1 | 2014-08-28 23:14:35.012675 | c8    

回退到新增欄位前1301666.

postgres=#  do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();    
  v_encoding_tmp text;    
  v_old text;    
  v_new text;    
  v_xid int8 := 1301666;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;     
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    
DO    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c5 | c6 | c7 |          crt_time          | c8     
----+----+---------+----+----+----+----+----------------------------+----    
  1 |  1 | te\s	 | c4 | c5 | c6 |  1 | 2014-08-28 23:06:09.790227 |     
(1 row)    

接下來刪除欄位測試

postgres=# alter table "TBL" drop column c5;    
ALTER TABLE    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c6 | c7 |          crt_time          | c8     
----+----+---------+----+----+----+----------------------------+----    
  1 |  1 | te\s	 | c4 | c6 |  1 | 2014-08-28 23:06:09.790227 |     
(1 row)    
    
postgres=# insert into "TBL" values (3,1,`test`,`c4`,`c6`,1,now(),`c8`);    
INSERT 0 1    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c6 | c7 |          crt_time          | c8     
----+----+---------+----+----+----+----------------------------+----    
  1 |  1 | te\s	 | c4 | c6 |  1 | 2014-08-28 23:06:09.790227 |     
  3 |  1 | test    | c4 | c6 |  1 | 2014-08-28 23:17:24.722663 | c8    
(2 rows)    

回退到1301666

postgres=#  do language plpgsql $$    
declare    
  v_op text;    
  v_encoding_curr text := pg_client_encoding();    
  v_encoding_tmp text;    
  v_old text;    
  v_new text;    
  v_xid int8 := 1301666;     
begin    
  for v_op, v_encoding_tmp, v_old, v_new in     
    select op,encoding,old_rec::text,new_rec::text from undo_t where xid>=v_xid order by xid desc,id desc    
  LOOP    
    execute `set client_encoding=```||v_encoding_tmp||````;     
    case v_op     
    when `INSERT` then     
      delete from public."TBL" t where t=v_new::public."TBL";     
    when `DELETE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `TRUNCATE` then    
      insert into public."TBL" values ((v_old::public."TBL").*);     
    when `UPDATE` then    
      delete from public."TBL" t where t=v_new::public."TBL";     
      insert into public."TBL" values ((v_old::public."TBL").*);     
    else    
    end case;     
  end loop;     
  execute `set client_encoding=```||v_encoding_curr||````;     
end;     
$$;    
DO    
postgres=# select * from "TBL";    
 c1 | c2 |   C3    | c4 | c6 | c7 |          crt_time          | c8     
----+----+---------+----+----+----+----------------------------+----    
  1 |  1 | te\s	 | c4 | c6 |  1 | 2014-08-28 23:06:09.790227 |     
(1 row)    


相關文章