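Analyzing a PostgreSQL backend that cannot be killed (pg_terminate_backend, pg_cancel_backend): inspecting a hung process with strace and pstack

Posted by 德哥 on 2018-10-05

Background

When a PostgreSQL backend process cannot be canceled or terminated, what state is it in, and why does it not exit?

Example

1. A process that cannot be killed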

Type "help" for help.  
  
postgres=# select pg_cancel_backend(60827);  
 pg_cancel_backend   
-------------------  
 t  
(1 row)  
  
postgres=# select pg_terminate_backend(60827);  
 pg_terminate_backend   
----------------------  
 t  
(1 row)  
  
postgres=# select pg_terminate_backend(60827);  
 pg_terminate_backend   
----------------------  
 t  
(1 row)
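Both functions returned t, yet backend 60827 kept running. As far as I understand it, pg_cancel_backend sends SIGINT and pg_terminate_backend sends SIGTERM to the backend, and the boolean result only reports that the signal was sent; the backend still has to reach a point where it acts on the request. The Python sketch below illustrates that idea under this assumption. It is not PostgreSQL code: the child process, `CHILD_SOURCE` and `send_and_check` are hypothetical stand-ins. The child's handler only records the request and the main loop never checks it, so sending the signal succeeds while the process stays alive.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical stand-in for a backend: the handler only notes the request,
# and the main loop never looks at it, so the process keeps running.
CHILD_SOURCE = """
import signal, time
pending = []
signal.signal(signal.SIGTERM, lambda signum, frame: pending.append(signum))
print("ready", flush=True)
while True:
    time.sleep(0.05)
"""


def send_and_check(sig=signal.SIGTERM, grace=0.3):
    """Send `sig` to the stand-in child; return (signal_sent, still_alive)."""
    child = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                             stdout=subprocess.PIPE, text=True)
    try:
        child.stdout.readline()      # wait until the handler is installed
        os.kill(child.pid, sig)      # raises on failure; success is like "t"
        signal_sent = True
        time.sleep(grace)            # give the child time to exit, if it would
        still_alive = child.poll() is None
        return signal_sent, still_alive
    finally:
        child.kill()                 # SIGKILL cannot be caught; clean up
        child.wait()
        child.stdout.close()
```

Calling `send_and_check()` reports that the signal was sent and that the child is still alive. A process in this state has received the request but is not getting to the code that would honor it, which is why the next steps look at where the backend is actually waiting.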

2. Inspect the process's stack at that moment: it is stuck in __epoll_wait_nocancel

$pstack 60827  
  
#0  0x00007f4bced78f13 in __epoll_wait_nocancel () from /lib64/libc.so.6  
#1  0x0000000000753c35 in WaitEventSetWait ()  
#2  0x000000000076d103 in ConditionVariableSleep ()  
#3  0x00000000004cc4e1 in _bt_parallel_seize ()  
#4  0x00000000004ce433 in ?? ()  
#5  0x00000000004ce72e in ?? ()  
#6  0x00000000004cf071 in _bt_first ()  
#7  0x00000000004ccc2d in btgettuple ()  
#8  0x00000000004c617a in index_getnext_tid ()  
#9  0x0000000000650f87 in ?? ()  
#10 0x000000000063efa1 in ExecScan ()  
#11 0x000000000063d7c7 in ?? ()  
#12 0x000000000064719e in ?? ()  
#13 0x000000000064903c in ?? ()  
#14 0x000000000063d7c7 in ?? ()  
#15 0x000000000064c0c1 in ?? ()  
#16 0x000000000063d7c7 in ?? ()  
#17 0x000000000064719e in ?? ()  
#18 0x000000000064903c in ?? ()  
#19 0x000000000063d7c7 in ?? ()  
#20 0x000000000063c4f0 in standard_ExecutorRun ()  
#21 0x00007f4bc4cd7288 in ?? () from pg_stat_statements.so  
#22 0x00007f4bc48cf87f in ?? () from auto_explain.so  
#23 0x000000000077ed0b in ?? ()  
#24 0x00000000007800d0 in PortalRun ()  
#25 0x000000000077dc88 in PostgresMain ()  
#26 0x000000000070782c in PostmasterMain ()  
#27 0x000000000067d060 in main ()

3. Inspect the process with strace

$strace -e trace=all -T -tt -p 60827  
Process 60827 attached - interrupt to quit  
19:21:14.881369 epoll_wait(270,   
  
  
^C <unfinished ...>  
Process 60827 detached

4. Read the description of this system call: it waits for an I/O event on a file descriptor

$man epoll_wait  
EPOLL_WAIT(2)              Linux Programmer’s Manual             EPOLL_WAIT(2)  
  
NAME  
       epoll_wait, epoll_pwait - wait for an I/O event on an epoll file descriptor  
  
SYNOPSIS  
       #include <sys/epoll.h>  
  
       int epoll_wait(int epfd, struct epoll_event *events,  
                      int maxevents, int timeout);  
       int epoll_pwait(int epfd, struct epoll_event *events,  
                      int maxevents, int timeout,  
                      const sigset_t *sigmask);

5. Check what FD 270 (the epfd argument of the epoll_wait call) refers to

#cd /proc/60827/fd  
  
#ll 270  
lrwx------ 1 xxxxxx xxxxxxxxxxx 64 Jul 19 15:01 270 -> anon_inode:[eventpoll]
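The same lookup can be scripted. The sketch below reads the symlink that `ll` displays; `describe_fd`, `is_epoll_fd` and the `proc_root` parameter are names made up for this illustration, and it assumes a Linux /proc filesystem.

```python
import os


def describe_fd(pid, fd, proc_root="/proc"):
    """Return what file descriptor `fd` of process `pid` points to.

    Reads the symlink <proc_root>/<pid>/fd/<fd>, the same link `ll` shows.
    `proc_root` is a parameter of this sketch so it can be pointed elsewhere.
    """
    return os.readlink(os.path.join(proc_root, str(pid), "fd", str(fd)))


def is_epoll_fd(target):
    """True when the link target is the kernel's anonymous epoll inode."""
    return target == "anon_inode:[eventpoll]"
```

On the affected host this would be called as `describe_fd(60827, 270)`, run as the process owner or root. An epoll descriptor only says that the process is waiting on a set of other descriptors, so it does not by itself reveal which event the backend is waiting for.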

6. The PostgreSQL function that leads to epoll_wait is WaitEventSetWait

src/backend/storage/ipc/latch.c

/*  
 * Wait for events added to the set to happen, or until the timeout is  
 * reached.  At most nevents occurred events are returned.  
 *  
 * If timeout = -1, block until an event occurs; if 0, check sockets for  
 * readiness, but don't block; if > 0, block for at most timeout milliseconds.  
 *  
 * Returns the number of events occurred, or 0 if the timeout was reached.  
 *  
 * Returned events will have the fd, pos, user_data fields set to the  
 * values associated with the registered event.  
 */  
int  
WaitEventSetWait(WaitEventSet *set, long timeout,  
                                 WaitEvent *occurred_events, int nevents,  
                                 uint32 wait_event_info)  
{  
        int                     returned_events = 0;  
        instr_time      start_time;  
        instr_time      cur_time;  
        long            cur_timeout = -1;  
  
        Assert(nevents > 0);  
  
        /*  
         * Initialize timeout if requested.  We must record the current time so  
         * that we can determine the remaining timeout if interrupted.  
         */  
        if (timeout >= 0)  
        {  
                INSTR_TIME_SET_CURRENT(start_time);  
                Assert(timeout >= 0 && timeout <= INT_MAX);  
                cur_timeout = timeout;  
        }  
  
        pgstat_report_wait_start(wait_event_info);  
  
#ifndef WIN32  
        waiting = true;  
#else  
        /* Ensure that signals are serviced even if latch is already set */  
        pgwin32_dispatch_queued_signals();  
#endif  
        while (returned_events == 0)  
        {  
                int                     rc;  
  
                /*  
                 * Check if the latch is set already. If so, leave the loop  
                 * immediately, avoid blocking again. We don't attempt to report any  
                 * other events that might also be satisfied.  
                 *  
                 * If someone sets the latch between this and the  
                 * WaitEventSetWaitBlock() below, the setter will write a byte to the  
                 * pipe (or signal us and the signal handler will do that), and the  
                 * readiness routine will return immediately.  
                 *  
                 * On unix, If there's a pending byte in the self pipe, we'll notice  
                 * whenever blocking. Only clearing the pipe in that case avoids  
                 * having to drain it every time WaitLatchOrSocket() is used. Should  
                 * the pipe-buffer fill up we're still ok, because the pipe is in  
                 * nonblocking mode. It's unlikely for that to happen, because the  
                 * self pipe isn't filled unless we're blocking (waiting = true), or  
                 * from inside a signal handler in latch_sigusr1_handler().  
                 *  
                 * On windows, we'll also notice if there's a pending event for the  
                 * latch when blocking, but there's no danger of anything filling up,  
                 * as "Setting an event that is already set has no effect.".  
                 *  
                 * Note: we assume that the kernel calls involved in latch management  
                 * will provide adequate synchronization on machines with weak memory  
                 * ordering, so that we cannot miss seeing is_set if a notification  
                 * has already been queued.  
                 */  
                if (set->latch && set->latch->is_set)  
                {  
                        occurred_events->fd = PGINVALID_SOCKET;  
                        occurred_events->pos = set->latch_pos;  
                        occurred_events->user_data =  
                                set->events[set->latch_pos].user_data;  
                        occurred_events->events = WL_LATCH_SET;  
                        occurred_events++;  
                        returned_events++;  
  
                        break;  
                }  
  
                /*  
                 * Wait for events using the readiness primitive chosen at the top of  
                 * this file. If -1 is returned, a timeout has occurred, if 0 we have  
                 * to retry, everything >= 1 is the number of returned events.  
                 */  
                rc = WaitEventSetWaitBlock(set, cur_timeout,  
                                                                   occurred_events, nevents);  
  
                if (rc == -1)  
                        break;                          /* timeout occurred */  
                else  
                        returned_events = rc;  
  
                /* If we're not done, update cur_timeout for next iteration */  
                if (returned_events == 0 && timeout >= 0)  
                {  
                        INSTR_TIME_SET_CURRENT(cur_time);  
                        INSTR_TIME_SUBTRACT(cur_time, start_time);  
                        cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);  
                        if (cur_timeout <= 0)  
                                break;  
                }  
        }  
#ifndef WIN32  
        waiting = false;  
#endif  
  
        pgstat_report_wait_end();  
  
        return returned_events;  
}
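The comments in WaitEventSetWait describe two ideas: the latch flag is checked before blocking, and whoever sets the latch writes a byte to a self-pipe so that a blocked wait wakes up. The Python sketch below is a toy model of those two ideas only, not a port of latch.c; the `Latch` class and its methods are hypothetical, and it uses the standard `selectors` module where PostgreSQL uses epoll.

```python
import os
import selectors
import time


class Latch:
    """Toy latch: a flag plus a self-pipe that wakes a blocked waiter."""

    def __init__(self):
        self.is_set = False
        self._read_fd, self._write_fd = os.pipe()
        os.set_blocking(self._read_fd, False)
        os.set_blocking(self._write_fd, False)
        self._selector = selectors.DefaultSelector()
        self._selector.register(self._read_fd, selectors.EVENT_READ)

    def set(self):
        """Mark the latch set and write one byte so a blocked wait wakes up."""
        self.is_set = True
        try:
            os.write(self._write_fd, b"x")
        except BlockingIOError:
            pass  # pipe already full: a wakeup is pending anyway

    def reset(self):
        self.is_set = False

    def wait(self, timeout_ms=-1):
        """Return True once the latch is set, False if the timeout expires.

        timeout_ms = -1 blocks until the latch is set, mirroring the
        documented meaning of timeout = -1 in WaitEventSetWait.
        """
        start = time.monotonic()
        remaining = None if timeout_ms < 0 else timeout_ms / 1000.0
        while True:
            if self.is_set:                  # checked before blocking
                return True
            if self._selector.select(remaining):
                self._drain()                # woken via the pipe; re-check flag
            if timeout_ms >= 0:              # recompute the remaining timeout
                remaining = timeout_ms / 1000.0 - (time.monotonic() - start)
                if remaining <= 0:
                    return self.is_set

    def _drain(self):
        """Empty the self-pipe so the next wait can block again."""
        try:
            while os.read(self._read_fd, 4096):
                pass
        except BlockingIOError:
            pass

    def close(self):
        self._selector.close()
        os.close(self._read_fd)
        os.close(self._write_fd)
```

In this model, `wait(-1)` never returns if nothing calls `set()`. That is the same shape as the epoll_wait call that strace showed as unfinished: the wait itself is behaving as designed, and the open question is why the event it is waiting for never arrives.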

A tutorial on using strace and pstack (reposted)

How to analyze program performance with strace and pstack

http://www.cnblogs.com/bangerlee/archive/2012/04/30/2476190.html

http://www.cnblogs.com/bangerlee/archive/2012/02/20/2356818.html

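Introduction

Sometimes we need to optimize a program and reduce its response time. Besides analyzing the time complexity of the code section by section, is there a more convenient approach?

If we can directly locate the function calls that dominate the program's execution time, and then analyze and optimize those functions specifically, that is far more efficient than reading the code aimlessly.

Combining strace and pstack achieves this. strace traces the low-level system calls a process makes and can print the time at which each call was issued and how long each one took; pstack prints the function call stack of the process with the given PID.

Below, we use a simple message send/receive program to show concretely how to analyze a program with strace and pstack.

About the program

The program is a simple socket application consisting of a server and a client. The server listens on a port and waits for the client to connect. Once connected, the client periodically sends messages to the server, and the server sends a response to the client after receiving each message. The interaction between server and client is shown in the figure below:

[Figure: interaction between server and client]

After starting the program, we find that the server takes a long time to send its resp response after receiving the client's submit message. A tcpdump capture shows that the interval between time1 and time2 is about 1s:

[Figure: tcpdump capture]

This initial analysis indicates that the slow response is a problem in the server program. Next we look at how to use strace and pstack to find out why the server responds slowly.

Viewing system calls with strace

First we start the server and client programs, then trace the server process with strace: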

# ps -elf | grep server | grep -v grep  
0 S root 16739 22642 0 76 0 - 634 1024 14:26 pts/2 00:00:00 ./server  
# strace -o server.strace -Ttt -p 16739  
Process 16739 attached - interrupt to quit

After waiting a while, we stop strace. The server.strace file contains the following output:

14:46:39.741366 select(8, [3 4], NULL, NULL, {1, 0}) = 1 (in [4], left {0, 1648}) <0.998415>  
14:46:40.739965 recvfrom(4, "hello", 6, 0, NULL, NULL) = 5 <0.000068>  
14:46:40.740241 write(1, "hello\n", 6)  = 6 <0.000066>  
14:46:40.740414 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 <0.000046>  
14:46:40.740565 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 <0.000048>  
14:46:40.740715 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000046>  
14:46:40.740853 nanosleep({1, 0}, {1, 0}) = 0 <1.000276>  
14:46:41.741284 sendto(4, "hello ", 6, 0, NULL, 0) = 6 <0.000111>

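We can see that after the server receives the data (the recvfrom call), about 1s passes before it sends the message (the sendto call), which matches the packet capture. We can also see that the nanosleep system call took 1s.

We can therefore conclude that the response delay is caused by the function call behind nanosleep.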
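Picking out the slow calls by eye works for eight lines, but not for a long trace. The sketch below filters `strace -T -tt` output by the elapsed time in angle brackets at the end of each line; `LINE_RE`, `slow_syscalls` and the 0.5s default threshold are choices made for this illustration.

```python
import re

# A line produced by `strace -T -tt`: timestamp, call text, then <seconds>.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<name>\w+)\(.*<(?P<elapsed>\d+\.\d+)>\s*$"
)


def slow_syscalls(lines, threshold=0.5):
    """Return (timestamp, syscall, seconds) for calls slower than threshold."""
    slow = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # unfinished calls, signals, and other lines
        elapsed = float(match.group("elapsed"))
        if elapsed >= threshold:
            slow.append((match.group("time"), match.group("name"), elapsed))
    return slow
```

Run against the trace above, this reports both select and nanosleep. The select call comes before recvfrom, so its time is spent waiting for the client's next message; only nanosleep falls between recvfrom and sendto, which is why it is the call that explains the response delay.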

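Which function call is it, exactly? The strace output cannot answer that, because it only shows system calls. To see the function call stack inside the program, we turn to pstack.

Viewing the function stack with pstack

pstack is a script whose core is simply gdb with the thread apply all bt command. Below we use pstack to view the server process's function stack: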

# sh pstack.sh 16739  
#0 0x00002ba1f8152650 in __nanosleep_nocancel () from /lib64/libc.so.6  
#1 0x00002ba1f8152489 in sleep () from /lib64/libc.so.6  
#2 0x00000000004007bb in ha_ha ()  
#3 0x0000000000400a53 in main ()

The output shows that the call chain is main->ha_ha->sleep, so we can go to the ha_ha function to analyze and optimize it.
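Reading the chain off a four-frame stack is easy; for deeper stacks, such as the 28-frame PostgreSQL stack earlier, a small parser helps. This sketch assumes single-threaded pstack output in the frame format shown above; `FRAME_RE` and `call_chain` are names introduced here.

```python
import re

# One frame as printed by pstack/gdb: "#2 0x00000000004007bb in ha_ha ()".
FRAME_RE = re.compile(
    r"^#(?P<depth>\d+)\s+0x[0-9a-fA-F]+\s+in\s+(?P<func>\S+)\s+\("
)


def call_chain(pstack_output):
    """Return function names from outermost caller to innermost frame."""
    frames = []
    for line in pstack_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group("depth")), match.group("func")))
    # Frame #0 is the innermost call, so list the highest-numbered frame first.
    return [func for _, func in sorted(frames, reverse=True)]
```

For the output above, `" -> ".join(call_chain(text))` gives `main -> ha_ha -> sleep -> __nanosleep_nocancel`. Frames that gdb could not resolve appear as `??`, as in the PostgreSQL stack, and are kept in place so the depth of the chain stays visible.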

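Summary

This article used a server/client program as an example to show how to analyze response latency with strace and pstack.

Starting from the symptom of slow server responses, we used strace to find the specific time-consuming system call, then pstack to find the specific time-consuming function in the program, locating step by step the code that affects execution time.

Understanding more about the lower layers and approaching the problem from the operating-system level makes program performance analysis and optimization easier.

The server/client program and the pstack script used in this article are linked for download from the original post.

A general, complete strace invocation: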

strace -o output.txt -T -tt -e trace=all -p 10423

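The command above traces all system calls (-e trace=all) of process 10423, records the time spent in each system call (-T) and its start time (-tt, printed in a readable hours:minutes:seconds format), and saves the results in the file output.txt.

Restricting strace to specific system calls:

If you already know what you are looking for, you can have strace trace only certain kinds of system calls. For example, while an nginx process is running, you may want to watch the epoll_wait system call.

To have strace record only epoll_wait calls, use this command: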

strace -f -o epoll-strace.txt -e epoll_wait -p 10423

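strace traces system calls and cannot give clearer information about nginx's own function call relationships. If we find that nginx is misbehaving and want to know which function it is currently executing internally, pstack is a very convenient and practical tool. Using pstack is also simple: just follow it with the process ID. For example, when there are no client requests, nginx blocks in the epoll_wait system call, and pstack shows the nginx call stack at that point.

In that output, the call chain from main() to epoll_wait() is clear at a glance and matches the stack seen inside gdb. We can use it for analysis and optimization.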

References

"PostgreSQL cancel: communication protocol, signals, and code"

"PostgreSQL cancel security vulnerability"

https://blog.csdn.net/tycoon1988/article/details/39030985
