Greenplummax_prepared_transactions設定不正確時的症狀-toomanyclientsalready

德哥發表於2018-06-21

原文網址 : https://flycode.co/archives/185661

OOMclient

背景

Greenplum分散式事務使用2PC提交，因此需要依賴max_prepared_transactions引數的配置。

原文

https://www.cnblogs.com/chenminklutz/p/8946269.html

max_prepared_transactions 應該至少設定為 max_connections 一樣大。

正文

max_prepared_transactions是greenplum的一個引數，以下是官方文件

Sets the maximum number of transactions that can be in the prepared state simultaneously. Greenplum uses prepared transactions internally to ensure data integrity across the segments. This value must be at least as large as the value of max_connections on the master. Segment instances should be set to the same value as the master.

Value Range	Default	Set Classifications
integer	250 on master 250 on segments	local system restart

引數型別：本地，系統級，需要重啟生效。

當這個引數設定的足夠小時，如果max_prepared_transactions使用耗盡，會報錯如下：

FATAL: sorry, too many clients already.  
DETAIL: There are no more available slots in the sharedSnapshotArray.  
HINT: Another piece of code should have detected that we have too many clients.this probably means that someone isn`t releasing their slot properly.

然後其他人就連不上資料庫。

開始解決問題：

檢視資料庫狀態，gpstate -s

正常，segment沒有掛

但是發現master current role變成了utility，正常應該是是dispatch啊。

肯定有問題，繼續找。

手動pg_terminate_backend()最早的30個程式，正常啦，master current role也回去了，其他人也能連上了。

過一會兒又不行啦。

上原始碼搜sharedSnapshotArray，發現如下判斷就會報這個錯

    if (arrayP->numSlots >= arrayP->maxSlots || arrayP->nextSlot == -1)  
    {  
        /*  
         * Ooops, no room.  this shouldn`t happen as something else should have  
         * complained if we go over MaxBackends.  
         */  
        LWLockRelease(SharedSnapshotLock);  
        ereport(FATAL,  
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),  
                 errmsg("sorry, too many clients already."),  
                 errdetail("There are no more available slots in the sharedSnapshotArray."),  
                 errhint("Another piece of code should have detected that we have too many clients."  
                         " this probably means that someone isn`t releasing their slot properly.")));  
    }

繼續順藤摸瓜搜maxSlots，看是怎麼確定的值。

        /*  
         * We`re the first - initialize.  
         */  
        sharedSnapshotArray->numSlots = 0;  
  
        /* TODO:  MaxBackends is only somewhat right.  What we really want here  
         *        is the MaxBackends value from the QD.  But this is at least  
         *          safe since we know we dont need *MORE* than MaxBackends.  But  
         *        in general MaxBackends on a QE is going to be bigger than on a  
         *          QE by a good bit.  or at least it should be.  
         *  
         * But really, max_prepared_transactions *is* what we want (it  
         * corresponds to the number of connections allowed on the  
         * master).  
         *  
         * slotCount is initialized in SharedSnapshotShmemSize().  
         */  
        sharedSnapshotArray->maxSlots = slotCount;  
        sharedSnapshotArray->nextSlot = 0;  
  
        sharedSnapshotArray->slots = (SharedSnapshotSlot *)&sharedSnapshotArray->xips;  
  
        /* xips start just after the last slot structure */  
        xip_base = (TransactionId *)&sharedSnapshotArray->slots[sharedSnapshotArray->maxSlots];

繼續slotCount，上邊有註釋說slotCount在SharedSnapshotShmemSize裡被初始化

/*  
 * Report shared-memory space needed by CreateSharedSnapshot.  
 */  
Size  
SharedSnapshotShmemSize(void)  
{  
    Size        size;  
  
    xipEntryCount = MaxBackends + max_prepared_xacts;  
  
    slotSize = sizeof(SharedSnapshotSlot);  
    slotSize += mul_size(sizeof(TransactionId), (xipEntryCount));  
    slotSize = MAXALIGN(slotSize);  
  
    /*  
     * We only really need max_prepared_xacts; but for safety we  
     * multiply that by two (to account for slow de-allocation on  
     * cleanup, for instance).  
     */  
    slotCount = NUM_SHARED_SNAPSHOT_SLOTS;  
  
    size = offsetof(SharedSnapshotStruct, xips);  
    size = add_size(size, mul_size(slotSize, slotCount));  
  
    return MAXALIGN(size);  
}

全域性變數NUM_SHARED_SNAPSHOT_SLOTS

#define NUM_SHARED_SNAPSHOT_SLOTS (2 * max_prepared_xacts)

二倍的max_prepared_transactions引數值。其實上邊的英文註釋也說啦：我們實際上只需要max_prepared_transactions，但是為了安全我們把他乘二，比如清除時的緩慢反分配。

所以檢視系統的max_prepared_transactions的值，發現只有50，此處笑cry，捂臉哭。

修改max_prepared_transactions等於master的max_connections，gpconfig -c max_prepared_transactions -v 1500，重啟資料庫gpstop -a -M fast。

檢視max_prepared_transactions，已經生效。

OK問題解決！！

問題模擬

gpconfig -c max_prepared_transactions -v 1
gpstop -M fast -a
gpstart -a

postgres=# show max_prepared_transactions ;
 max_prepared_transactions 
---------------------------
 1
(1 row)


postgres=# create table test(id int, info text);
NOTICE:  Table doesn`t have `DISTRIBUTED BY` clause -- Using column named `id` as the Greenplum Database data distribution key for this table.
HINT:  The `DISTRIBUTED BY` clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# begin;
BEGIN
postgres=# insert into test select generate_series(1,1000),`test`;
INSERT 0 1000


-- 會話2
postgres=# set VERBOSITY verbose
postgres=# insert into test select generate_series(1,1000),`test`;
FATAL:  XX000: the limit of 1 distributed transactions has been reached. (cdbtm.c:2569)
DETAIL:  The global user configuration (GUC) server parameter max_prepared_transactions controls this limit.
LOCATION:  createDtx, cdbtm.c:2569
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

keycloak~時間不正確的問題
2023-04-12
mysql怎麼設定密碼都不正確的一個解決方法
2018-05-19
MySql密碼
繫結域名時域名解析狀態顯示解析失敗或不正確的解決方法
2024-11-02
髕骨軟化症的症狀和治療方法
2020-10-29
數字型別的不正確轉換漏洞
2021-09-07
型別
MySQL order by 排序結果不正確
2020-04-01
MySql排序
網站設計時，網站主題怎樣確定？
2021-01-08
網站
Jni CallVoidMethod回撥傳值不正確
2020-09-30
通過 VGA 介面連線顯示器時解析度不正確
2020-04-10
用 NTP 設定 CentOS 和 Ubuntu 伺服器的正確時區
2022-05-23
CentOSUbuntu伺服器
RxJava 異常時堆疊顯示不正確？解決方法都在這裡
2023-02-20
RxJava
瀏覽器證書機構不正確
2024-10-10
瀏覽器
相對論為什麼不正確呢？
2018-03-19
這是一個不確定的時代
2022-07-04
996icu的症狀-展望Swift5
2019-03-28
996Swift
java定時任務巢狀
2020-10-09
Java巢狀
win10 office元件安裝不正確怎麼辦_win10電腦中office元件安裝不正確的解決教程
2020-06-11
Win10元件
javascript中如何設定滑鼠的形狀
2020-04-06
JavaScript
封裝定時任務框架的正確方式
2019-02-28
封裝框架
如何正確設定動態TextView的textSize
2018-03-28
TextView
碰見一個BLOCKING_SESSION顯示不正確的BUG
2019-01-05
BloCSession
DKP 駭客分析——不正確的代幣對比率計算
2023-04-06
協作機器人案例：在不確定的時期提供確定性
2020-10-10
機器人
咳兩聲就能鎖定新冠！MIT收集20萬咳嗽樣本，用AI辨別無症狀感染者，準確率100%
2020-11-05
MITAI
Jenkins定時設定
2024-04-07
Jenkins
開機時的設定
2020-09-21
GPUDepthStencilState物件設定depthStencil狀態
2022-02-28
GPU物件
easycode生成程式碼報錯：配置資訊不正確
2024-08-23
設定openjdkdocker容器的時區
2019-01-21
JDKDocker
MySQL中的時區設定
2020-12-18
MySql
MUI——設定沉浸式狀態列
2018-11-08
UI
CSS設定滑鼠指標形狀
2018-08-06
CSS指標
echarts 設定柱狀圖寬度
2024-03-20
Echarts
UI設計培訓分享：ui的字型怎麼正確設定?
2021-11-03
UI
Swift 的坑：static var 的初始化時機並不確定
2019-02-08
Swift
linux定時任務的設定 crontab 配置指南
2018-05-16
Linux
Jenkins+Gitlab 執行 selenium 為何報版本不正確的問題
2020-10-28
JenkinsGitlab
Could not load file or assembly or one of its dependencies 試圖載入格式不正確的程式
2019-05-11

Greenplummax_prepared_transactions設定不正確時的症狀-toomanyclientsalready

背景

原文

正文

問題模擬

相關文章