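PostgreSQL 10.0 preview, feature enhancement: fine-grained control of rollback scope (transaction or statement level)

Posted by 德哥 on 2017-03-24

Tags

PostgreSQL , 10.0 , transaction rollback , statement rollback , server-side feature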


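Background

The atomic unit of work in a database is the transaction, so under normal circumstances the SQL statements in one transaction are either all committed or all rolled back.

To guarantee durability, a database also keeps a transaction log: every commit must make sure its REDO records have been persisted. (REDO generates a lot of write I/O, and I/O latency directly limits TPS, especially for small transactions.)

For this reason, some applications commit in batches to raise overall throughput (insert throughput, for example), such as wrapping 1000 INSERT statements in a single transaction.

The catch: if any one statement fails, everything the transaction has executed so far is rolled back. If the user only wants to roll back the statement that failed, the current options are as follows.

1. Issue a SAVEPOINT between statements; when a statement fails, roll back to the most recent savepoint.

psql implements this on the client side; it is enabled by setting the following variable.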

ON_ERROR_ROLLBACK  
  
When set to on, if a statement in a transaction block generates an error, the error is ignored and the transaction continues. When set to interactive, such errors are only ignored in interactive  
sessions, and not when reading script files. When unset or set to off, a statement in a transaction block that generates an error aborts the entire transaction. The error rollback mode works by  
issuing an implicit SAVEPOINT for you, just before each command that is in a transaction block, and then rolling back to the savepoint if the command fails.  
  
  
psql   
postgres=# \set ON_ERROR_ROLLBACK on
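
The savepoint-per-statement approach from item 1 can also be issued by hand. The sketch below is a minimal illustration, assuming SQLite (via Python's standard sqlite3 module) as a stand-in so that it runs without a server. The SAVEPOINT, ROLLBACK TO and RELEASE commands are the same ones a PostgreSQL client would send, but note the difference: on PostgreSQL the ROLLBACK TO is what clears the aborted transaction state, whereas SQLite would not have aborted the transaction in the first place. The table t, the savepoint name per_stmt and the function insert_batch are made up for this example.

```python
import sqlite3


def insert_batch(conn, values):
    """Insert all values in one transaction, skipping statements that fail."""
    failed = []
    conn.execute("BEGIN")
    for v in values:
        conn.execute("SAVEPOINT per_stmt")        # taken before the statement
        try:
            conn.execute("INSERT INTO t(id) VALUES (?)", (v,))
        except sqlite3.IntegrityError:
            conn.execute("ROLLBACK TO per_stmt")  # undo only the failed statement
            failed.append(v)
        conn.execute("RELEASE per_stmt")          # drop the savepoint either way
    conn.execute("COMMIT")
    return failed


# isolation_level=None stops the sqlite3 module from opening transactions itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY)")
failed = insert_batch(conn, [1, 2, 2, 3])  # the second 2 violates the primary key
rows = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
print(rows, failed)
```

The duplicate key is reported in failed while the other three rows are committed together. The price is two or three extra commands per statement, which is the overhead the rest of this article is concerned with.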

Relevant code: when ON_ERROR_ROLLBACK is set, psql automatically issues a SAVEPOINT before executing each SQL statement.

src/bin/psql/common.c

        if (transaction_status == PQTRANS_INTRANS &&  
                pset.on_error_rollback != PSQL_ERROR_ROLLBACK_OFF &&  
                (pset.cur_cmd_interactive ||  
                 pset.on_error_rollback == PSQL_ERROR_ROLLBACK_ON))  
        {  
                if (on_error_rollback_warning == false && pset.sversion < 80000)  
                {  
                        char            sverbuf[32];  
  
                        psql_error("The server (version %s) does not support savepoints for ON_ERROR_ROLLBACK.\n",  
                                           formatPGVersionNumber(pset.sversion, false,  
                                                                                         sverbuf, sizeof(sverbuf)));  
                        on_error_rollback_warning = true;  
                }  
                else  
                {  
                        results = PQexec(pset.db, "SAVEPOINT pg_psql_temporary_savepoint");  
                        if (PQresultStatus(results) != PGRES_COMMAND_OK)  
                        {  
                                psql_error("%s", PQerrorMessage(pset.db));  
                                ClearOrSaveResult(results);  
                                ResetCancelConn();  
                                goto sendquery_cleanup;  
                        }  
                        ClearOrSaveResult(results);  
                        on_error_rollback_savepoint = true;  
                }  
        }  

If the statement fails, psql automatically rolls back to the savepoint:

                switch (transaction_status)  
                {  
                        case PQTRANS_INERROR:  
                                /* We always rollback on an error */  
                                svptcmd = "ROLLBACK TO pg_psql_temporary_savepoint";  
                                break;  

If the statement succeeds, psql automatically releases the savepoint:

                                else  
                                        svptcmd = "RELEASE pg_psql_temporary_savepoint";  
                                break;  
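
The two decisions psql makes in the excerpts above (whether to set the temporary savepoint, and what to send once the statement has finished) can be condensed into a small sketch. This is a paraphrase in Python, not psql code: the function names and the string status values are invented here, and the branches for user-issued SAVEPOINT / RELEASE / ROLLBACK commands and for an ended transaction are not in the excerpts; they are added from memory of the surrounding psql source and should be treated as an assumption.

```python
from typing import Optional

SAVEPOINT_NAME = "pg_psql_temporary_savepoint"


def needs_savepoint(in_transaction: bool, mode: str, interactive: bool) -> bool:
    """Mirror the first condition; mode is 'off', 'on' or 'interactive'."""
    return in_transaction and mode != "off" and (interactive or mode == "on")


def after_statement(status: str, command_tag: str = "") -> Optional[str]:
    """Pick the follow-up command; status is 'inerror', 'intrans' or 'idle'."""
    if status == "inerror":
        # The statement failed: return to the state before it ran.
        return f"ROLLBACK TO {SAVEPOINT_NAME}"
    if status == "intrans":
        # Assumption (not in the excerpt): leave things alone when the user
        # is manipulating savepoints, since the temporary one may be gone.
        if command_tag in ("SAVEPOINT", "RELEASE", "ROLLBACK"):
            return None
        return f"RELEASE {SAVEPOINT_NAME}"
    # Assumption: the transaction has ended, so the savepoint went with it.
    return None


print(needs_savepoint(True, "interactive", False))
print(after_statement("inerror"))
```

With mode set to interactive, a script file (interactive=False) gets no savepoint, which matches the documentation quoted earlier.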

Users of the JDBC driver get the same capability without any intervention on their part.

https://github.com/pgjdbc/pgjdbc/commit/adc08d57d2a9726309ea80d574b1db835396c1c8

1) If "DEALLOCATE" or "DISCARD" command status is observed, the driver would invalidate cached statements,  
and subsequent executions would go through parse, describe, etc.  
  
This feature is enabled by default.  
  
2) If a statement fails with "cached plan must not change result type", then re-parse might solve the problem.  
However, if there is a pending transaction, then the error would kill the transaction.  
For that purpose, the driver sets a savepoint before each statement.  
  
Automatic savepoint is configured via autosave property that can take the following values:  
 * conservative (default) -- rollback to savepoint only in case of "prepared statement does not exist" and  
   "cached plan must not change result type". Then the driver would re-execute the statement and it would pass through  
 * never -- never set automatic savepoint. Note: in this mode statements might still fail with "cached plan must not change result type"  
   in autoCommit=FALSE mode  
 * always -- always rollback to "before statement execution" state in case of failure. This mode prevents "current transaction aborted" errors.  
   It is similar to psql's ON_ERROR_ROLLBACK.  
  
The overhead of additional savepoint is like 3us (see #477).  
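
The three autosave values in the commit message amount to a small decision table. The sketch below only restates that table as it is worded above. It is not pgjdbc code: the function name, the returned action strings and the error labels are hypothetical, and the real driver identifies these failures in its own way.

```python
# Hypothetical labels for the two failures named in the commit message.
STALE_STATEMENT_ERRORS = {
    "prepared_statement_does_not_exist",
    "cached_plan_must_not_change_result_type",
}


def autosave_action(mode, error_kind):
    """Return the reaction to a statement that failed inside a transaction."""
    if mode == "never":
        return "raise"                  # no savepoint exists to return to
    if mode == "always":
        return "rollback_to_savepoint"  # the transaction stays usable
    if mode == "conservative":
        if error_kind in STALE_STATEMENT_ERRORS:
            return "rollback_to_savepoint_and_retry"
        return "raise"                  # ordinary errors still abort the transaction
    raise ValueError(f"unknown autosave mode: {mode}")


print(autosave_action("conservative", "cached_plan_must_not_change_result_type"))
print(autosave_action("conservative", "unique_violation"))
```

Only the always mode gives the statement-level behavior this article is about; conservative protects against stale cached statements and nothing else.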

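Savepoints carry some overhead, so PostgreSQL has other ways to improve the performance of highly concurrent small transactions, such as asynchronous commit and group commit.

1. Asynchronous commit

COMMIT returns without waiting for the REDO records to reach disk, which raises small-transaction throughput. Asynchronous commit in PostgreSQL does not cause data inconsistency, because before a dirty page in shared buffers is written out, the REDO pages that describe it are guaranteed to be flushed first.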

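Asynchronous commit does carry a risk, though: if the database crashes, transactions whose records are still in the WAL buffer and not yet on disk are lost, even though they were reported as committed. Fortunately the WAL writer process wakes up frequently: it flushes the WAL buffer every wal_writer_delay, which is 200 ms by default and can be lowered (to 10 ms, for example), so the window of possible loss is short.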

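2. Group commit

Group commit is another common technique: the REDO I/O requests of transactions that commit at the same time are merged into a single request, which cuts the number of REDO writes issued by highly concurrent small transactions and raises their throughput.

Group commit only pays off under high concurrency, whereas asynchronous commit helps in any scenario.

The advantage of group commit over asynchronous commit is that it cannot lose data.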
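
Why group commit needs concurrency can be seen in a toy model. The sketch below is not how PostgreSQL implements group commit. It assumes a single flusher, a fixed flush duration, and that each flush covers every commit already waiting when it starts. The function name count_flushes and all the numbers are invented for illustration.

```python
def count_flushes(arrivals_us, flush_us):
    """Count WAL flushes when each flush covers all commits already waiting."""
    pending = sorted(arrivals_us)
    flushes = 0
    i = 0
    busy_until = 0
    while i < len(pending):
        # An idle flusher waits for the next commit; a busy one finishes first.
        start = max(busy_until, pending[i])
        # Every commit that has arrived by `start` rides on this flush.
        while i < len(pending) and pending[i] <= start:
            i += 1
        flushes += 1
        busy_until = start + flush_us
    return flushes


# 4 commits spaced 10 ms apart, then 100 commits spaced 100 us apart,
# with each flush taking 2 ms in both cases.
sparse = count_flushes(range(0, 40_000, 10_000), flush_us=2_000)
dense = count_flushes(range(0, 10_000, 100), flush_us=2_000)
print(sparse, dense)
```

In this model the sparse workload needs one flush per commit, so nothing is saved, while the dense workload shares 6 flushes among 100 commits.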

See also: "PostgreSQL reliability analysis: atomic writes of redo blocks"

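With that out of the way, on to the main topic. Savepoints are a client-side behavior, not a server-side one, because the client has to set and release a SAVEPOINT around every query, even if some drivers wrap this up for you.

So can the database itself provide this capability?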

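PostgreSQL 10.0: automatic server-side savepoints

A patch proposed for 10.0 adds syntax that lets you specify, when starting a transaction, whether an error should trigger a statement-level rollback or a transaction-level rollback.

With statement-level rollback, when a submitted statement fails the transaction can carry on with the following statements; otherwise the whole transaction must be rolled back.

The syntax is: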

START TRANSACTION ROLLBACK SCOPE { TRANSACTION | STATEMENT }  
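
The difference between the two scopes can be shown with a toy model. The sketch below is plain Python, not PostgreSQL code and not the patch. The class MiniTxn and its methods are invented for illustration, and a "statement" is just a value plus a flag saying whether it fails. The transaction-scope branch follows the existing PostgreSQL behavior: after an error, further commands are rejected until the block ends, and committing the block rolls it back.

```python
class MiniTxn:
    """Toy model of one transaction block with a configurable rollback scope."""

    def __init__(self, rollback_scope="transaction"):
        self.scope = rollback_scope
        self.applied = []     # effects of the statements that succeeded
        self.aborted = False

    def execute(self, value, fails=False):
        if self.aborted:
            return "ignored"  # rejected until the transaction block ends
        if fails:
            if self.scope != "statement":
                self.aborted = True  # one failure poisons the whole block
            return "error"           # the failed statement leaves no effects
        self.applied.append(value)
        return "ok"

    def commit(self):
        """Return what becomes durable; an aborted block commits nothing."""
        return [] if self.aborted else list(self.applied)


def run(scope):
    txn = MiniTxn(scope)
    for value, fails in [(1, False), (2, True), (3, False)]:
        txn.execute(value, fails)
    return txn.commit()


print(run("transaction"), run("statement"))
```

With scope transaction nothing survives the failure of the second statement; with scope statement the first and third statements are committed, which is what the client-side savepoint tricks earlier in this article achieve at extra cost.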

Discussion details

Hello,  
  
As I stated here and at the PGConf.ASIA developer meeting last year, I'd like to propose statement-level rollback feature.  To repeat myself, this is requested for users to migrate from other DBMSs to PostgreSQL.  They expect that a failure of one SQL statement should not abort the entire transaction and their apps (client programs and stored procedures) can continue the transaction with a different SQL statement.  
  
  
SPECIFICATION  
==================================================  
  
START TRANSACTION ROLLBACK SCOPE { TRANSACTION | STATEMENT };  
  
This syntax controls the behavior of the transaction when an SQL statement fails.  TRANSACTION (default) is the traditional behavior (i.e. rolls back the entire transaction or subtransaction).  STATEMENT rolls back the failed SQL statement.  
  
Just like the isolation level and access mode, default_transaction_rollback_scope GUC variable is also available.  
  
  
DESIGN  
==================================================  
  
Nothing much to talk about... it merely creates a savepoint before each statement execution and destroys it after the statement finishes.  This is done in postgres.c for top-level SQL statements.  
  
The stored function hasn't been handled yet; I'll submit the revised patch soon.  
  
  
CONSIDERATIONS AND REQUESTS  
==================================================  
  
The code for stored functions is not written yet, but I'd like your feedback for the specification and design based on the current patch.  I'll add this patch to CommitFest 2017-3.  
  
The patch creates and destroys a savepoint for each message of the extended query protocol (Parse, Bind, Execute and Describe).  I'm afraid this will add significant overhead, but I don't find a better way, because those messages could be sent arbitrarily for different statements, e.g. Parse stmt1, Parse stmt2, Bind stmt1, Execute stmt1, Bind stmt2, Execute stmt2.  
  
  
Regards  
Takayuki Tsunakawa  

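For the discussion of this patch, see the mailing list thread; the URL is at the end of this article.

The PostgreSQL community works very rigorously: a patch may be discussed on the mailing list for months or even years and revised repeatedly based on feedback, so by the time it is merged into master it is already mature. This is why PostgreSQL is well known for its stability.

References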

https://commitfest.postgresql.org/14/1050/

https://www.postgresql.org/message-id/flat/0A3221C70F24FB45833433255569204D1F6A9286@G01JPEXMBYT05#0A3221C70F24FB45833433255569204D1F6A9286@G01JPEXMBYT05

https://github.com/pgjdbc/pgjdbc/commit/adc08d57d2a9726309ea80d574b1db835396c1c8

