Greenplummax_prepared_transactions設定不正確時的症狀-toomanyclientsalready

德哥發表於2018-06-21

背景

Greenplum分散式事務使用2PC提交,因此需要依賴max_prepared_transactions引數的配置。

原文

https://www.cnblogs.com/chenminklutz/p/8946269.html

max_prepared_transactions 應該至少設定為 max_connections 一樣大。

正文

max_prepared_transactions是greenplum的一個引數,以下是官方文件

Sets the maximum number of transactions that can be in the prepared state simultaneously. Greenplum uses prepared transactions internally to ensure data integrity across the segments. This value must be at least as large as the value of max_connections on the master. Segment instances should be set to the same value as the master.

Value Range Default Set Classifications
integer 250 on master 
250 on segments
local 
system 
restart

引數型別:本地,系統級,需要重啟生效。

當這個引數設定的足夠小時,如果max_prepared_transactions使用耗盡,會報錯如下:

FATAL: sorry, too many clients already.  
DETAIL: There are no more available slots in the sharedSnapshotArray.  
HINT: Another piece of code should have detected that we have too many clients.this probably means that someone isn`t releasing their slot properly.  

然後其他人就連不上資料庫。

開始解決問題:

檢視資料庫狀態,gpstate -s

正常,segment沒有掛

但是發現master current role變成了utility,正常應該是是dispatch啊。

肯定有問題,繼續找。

手動pg_terminate_backend()最早的30個程式,正常啦,master current role也回去了,其他人也能連上了。

過一會兒又不行啦。

上原始碼搜sharedSnapshotArray,發現如下判斷就會報這個錯

    if (arrayP->numSlots >= arrayP->maxSlots || arrayP->nextSlot == -1)  
    {  
        /*  
         * Ooops, no room.  this shouldn`t happen as something else should have  
         * complained if we go over MaxBackends.  
         */  
        LWLockRelease(SharedSnapshotLock);  
        ereport(FATAL,  
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),  
                 errmsg("sorry, too many clients already."),  
                 errdetail("There are no more available slots in the sharedSnapshotArray."),  
                 errhint("Another piece of code should have detected that we have too many clients."  
                         " this probably means that someone isn`t releasing their slot properly.")));  
    }  

繼續順藤摸瓜搜maxSlots,看是怎麼確定的值。

        /*  
         * We`re the first - initialize.  
         */  
        sharedSnapshotArray->numSlots = 0;  
  
        /* TODO:  MaxBackends is only somewhat right.  What we really want here  
         *        is the MaxBackends value from the QD.  But this is at least  
         *          safe since we know we dont need *MORE* than MaxBackends.  But  
         *        in general MaxBackends on a QE is going to be bigger than on a  
         *          QE by a good bit.  or at least it should be.  
         *  
         * But really, max_prepared_transactions *is* what we want (it  
         * corresponds to the number of connections allowed on the  
         * master).  
         *  
         * slotCount is initialized in SharedSnapshotShmemSize().  
         */  
        sharedSnapshotArray->maxSlots = slotCount;  
        sharedSnapshotArray->nextSlot = 0;  
  
        sharedSnapshotArray->slots = (SharedSnapshotSlot *)&sharedSnapshotArray->xips;  
  
        /* xips start just after the last slot structure */  
        xip_base = (TransactionId *)&sharedSnapshotArray->slots[sharedSnapshotArray->maxSlots];  

繼續slotCount,上邊有註釋說slotCount在SharedSnapshotShmemSize裡被初始化

/*  
 * Report shared-memory space needed by CreateSharedSnapshot.  
 */  
Size  
SharedSnapshotShmemSize(void)  
{  
    Size        size;  
  
    xipEntryCount = MaxBackends + max_prepared_xacts;  
  
    slotSize = sizeof(SharedSnapshotSlot);  
    slotSize += mul_size(sizeof(TransactionId), (xipEntryCount));  
    slotSize = MAXALIGN(slotSize);  
  
    /*  
     * We only really need max_prepared_xacts; but for safety we  
     * multiply that by two (to account for slow de-allocation on  
     * cleanup, for instance).  
     */  
    slotCount = NUM_SHARED_SNAPSHOT_SLOTS;  
  
    size = offsetof(SharedSnapshotStruct, xips);  
    size = add_size(size, mul_size(slotSize, slotCount));  
  
    return MAXALIGN(size);  
}  

全域性變數NUM_SHARED_SNAPSHOT_SLOTS

#define NUM_SHARED_SNAPSHOT_SLOTS (2 * max_prepared_xacts)  

二倍的max_prepared_transactions引數值。其實上邊的英文註釋也說啦:我們實際上只需要max_prepared_transactions,但是為了安全我們把他乘二,比如清除時的緩慢反分配。

所以檢視系統的max_prepared_transactions的值,發現只有50,此處笑cry,捂臉哭。

修改max_prepared_transactions等於master的max_connections,gpconfig -c max_prepared_transactions -v 1500,重啟資料庫gpstop -a -M fast

檢視max_prepared_transactions,已經生效。

OK問題解決!!

問題模擬

gpconfig -c max_prepared_transactions -v 1
gpstop -M fast -a
gpstart -a

postgres=# show max_prepared_transactions ;
 max_prepared_transactions 
---------------------------
 1
(1 row)


postgres=# create table test(id int, info text);
NOTICE:  Table doesn`t have `DISTRIBUTED BY` clause -- Using column named `id` as the Greenplum Database data distribution key for this table.
HINT:  The `DISTRIBUTED BY` clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# begin;
BEGIN
postgres=# insert into test select generate_series(1,1000),`test`;
INSERT 0 1000


-- 會話2
postgres=# set VERBOSITY verbose
postgres=# insert into test select generate_series(1,1000),`test`;
FATAL:  XX000: the limit of 1 distributed transactions has been reached. (cdbtm.c:2569)
DETAIL:  The global user configuration (GUC) server parameter max_prepared_transactions controls this limit.
LOCATION:  createDtx, cdbtm.c:2569
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.


相關文章