Greenplumsegment級問題的排查-gp_session_role=utility(含csvlog日誌格式解讀)
標籤
PostgreSQL , Greenplum , segment , utility
背景
有些時候,我們在使用Greenplum時,你會發現明明連線數是夠的,但是報錯的時候segment連線滿無法正確連線,因為對外的介面都是MASTER,如果發現這裡問題如何排查?
00000 LOG: Failed connection to seg0
00000 STATEMENT: INSERT INTO gps_track(
58M01 LOG: (58M01) Master unable to connect to seg0 with options : FATAL: Sorry, too many clients already
58M01 STATEMENT: INSERT INTO gps_track(
00000 LOG: Failed connection to seg1
00000 STATEMENT: INSERT INTO gps_track(
58M01 LOG: (58M01) Master unable to connect to seg1 with options : FATAL: Sorry, too many clients already
58M01 STATEMENT: INSERT INTO gps_track(
00000 LOG: The previous session was reset because its gang was disconnected (session id = 318465). The new session id = 318471
00000 STATEMENT: INSERT INTO gps_track(
00000 LOG: FTS rescanned, get new component databases info.
00000 STATEMENT: INSERT INTO gps_track(
00000 LOG: Failed connection to seg0
00000 STATEMENT: INSERT INTO gps_track(
58M01 LOG: (58M01) Master unable to connect to seg0 with options : FATAL: Sorry, too many clients already
58M01 STATEMENT: INSERT INTO gps_track(
00000 LOG: Failed connection to seg1
00000 STATEMENT: INSERT INTO gps_track(
58M01 LOG: (58M01) Master unable to connect to seg1 with options : FATAL: Sorry, too many clients already
58M01 STATEMENT: INSERT INTO gps_track(
00000 LOG: The previous session was reset because its gang was disconnected (session id = 318471). The new session id = 318482
00000 STATEMENT: INSERT INTO gps_track(
00000 LOG: no primary segworker group allocated
00000 STATEMENT: INSERT INTO gps_track(
XX000 ERROR: (XX000) Unexpected internal error (cdbgang.c:1636)
XX000 LOCATION: allocateWriterGang, cdbgang.c:1636
XX000 STATEMENT: INSERT INTO gps_track(
XX000 Traceback 0: postgres: port INSERT() [0x9099af]
XX000 Traceback 1: postgres: port INSERT(elog_internalerror+0x20) [0x90cdc0]
XX000 Traceback 2: postgres: port INSERT(allocateWriterGang+0x248) [0x9d6cb8]
XX000 Traceback 3: postgres: port INSERT(AssignGangs+0x28b) [0x6b689b]
XX000 Traceback 4: postgres: port INSERT(ExecutorStart+0xe5f) [0x69e28f]
XX000 Traceback 5: postgres: port INSERT() [0x8304b7]
XX000 Traceback 6: postgres: port INSERT() [0x830c54]
XX000 Traceback 7: postgres: port INSERT(PortalRun+0x322) [0x8311c2]
XX000 Traceback 8: postgres: port INSERT() [0x82abc1]
XX000 Traceback 9: postgres: port INSERT(PostgresMain+0xbd8) [0x82d418]
XX000 Traceback 10: postgres: port INSERT() [0x7d3637]
XX000 Traceback 11: postgres: port INSERT(PostmasterMain+0xd2e) [0x7d514e]
XX000 Traceback 12: postgres: port INSERT(main+0x193) [0x730363]
XX000 Traceback 13: /lib64/libc.so.6(__libc_start_main+0xfd) [0x2b6682187d5d]
XX000 Traceback 14: postgres: port INSERT() [0x4d6389]
備庫到底是什麼狀態?
例子
方法1
通過utility模式連線(前提是預留了足夠多的super user connection = superuser_reserved_connections),允許直接連到SEGMENT節點。
PGOPTIONS=`-c gp_session_role=utility` psql -h xxx -p xxx -U xxx xxx
到segment節點一探究竟。
1、當前系統中的會話,評判標準:不能超過當前segment節點配置的max_connections 否則就會報連線不足。
select * from pg_stat_activity;
2、2PC會話,評判標準:不能超過當前segment節點配置的max_prepared_transactions 否則就會報連線不足。
postgres=# select * from pg_prepared_xacts ;
transaction | gid | prepared | owner | database
-------------+-----+----------+-------+----------
(0 rows)
3、segment級的鎖等待狀態
《Greenplum segment級鎖問題排查方法 – 阿里雲HybridDB for PostgreSQL最佳實踐》
4、segment級日誌排查。
cd $PGDATA/pg_log
檢視日誌
5、檢視日誌時,注意對應session id,可以找出它在master, primary segment上的所有相關日誌。
日誌格式如下:
src/backend/postmaster/syslogger.c
/*
* syslogger_write_errordata
* Write the GpErrorData to the log.
*/
static void
syslogger_write_errordata(PipeProtoHeader *chunkHeader, GpErrorData *errorData, bool csv)
{
syslogger_append_current_timestamp(true);
/* username */
syslogger_write_str_with_comma(errorData->username, true, csv, true);
/* databasename */
syslogger_write_str_with_comma(errorData->databasename, true, csv, true);
/* Process id, thread id */
syslogger_write_int32(false, "p", chunkHeader->pid, true, true);
syslogger_write_int32(false, "th", chunkHeader->thid, true, true);
/* Remote host */
syslogger_write_str_with_comma(errorData->remote_host, true, csv, true);
/* Remote port */
syslogger_write_str_with_comma(errorData->remote_port, true, csv, true);
/* session start timestamp */
syslogger_append_timestamp(errorData->fix_fields.session_start_time, true, true);
/* Transaction id */
syslogger_write_int32(false, "", errorData->fix_fields.top_trans_id, true, true);
/* GPDB specific options. */
syslogger_write_int32(true, "con", errorData->fix_fields.gp_session_id, true, true);
syslogger_write_int32(true, "cmd", errorData->fix_fields.gp_command_count, true, true);
syslogger_write_int32(false, errorData->fix_fields.gp_is_primary == `t`? "seg" : "mir", errorData->fix_fields.gp_segment_id,
true, true);
syslogger_write_int32(true, "slice", errorData->fix_fields.slice_id, true, true);
syslogger_write_int32(true, "dx", errorData->fix_fields.dist_trans_id, true, true);
syslogger_write_int32(true, "x", errorData->fix_fields.local_trans_id, true, true);
syslogger_write_int32(true, "sx", errorData->fix_fields.subtrans_id, true, true);
/* error severity */
syslogger_write_str_with_comma(errorData->error_severity, true, csv, true);
/* sql state code */
syslogger_write_str_with_comma(errorData->sql_state, true, csv, true);
/* errmsg */
syslogger_write_str_with_comma(errorData->error_message, true, csv, true);
/* errdetail */
syslogger_write_str_with_comma(errorData->error_detail, true, csv, true);
/* errhint */
syslogger_write_str_with_comma(errorData->error_hint, true, csv, true);
/* internal query */
syslogger_write_str_with_comma(errorData->internal_query, true, csv, true);
/* internal query pos */
syslogger_write_int32(true, "", errorData->fix_fields.internal_query_pos, true, true);
/* err ctxt */
syslogger_write_str_with_comma(errorData->error_context, true, csv, true);
/* user query */
syslogger_write_str_with_comma(errorData->debug_query_string, true, csv, true);
/* cursor pos */
syslogger_write_int32(false, "", errorData->fix_fields.error_cursor_pos, true, true);
/* func name */
syslogger_write_str_with_comma(errorData->error_func_name, true, csv, true);
/* file name */
syslogger_write_str_with_comma(errorData->error_filename, true, csv, true);
/* line number */
syslogger_write_int32(true, "", errorData->fix_fields.error_fileline, true, true);
/* stack trace */
if (errorData->stacktrace != NULL)
{
if (csv)
{
write_syslogger_file_binary(""", 1, LOG_DESTINATION_STDERR);
}
syslogger_write_str(errorData->stacktrace, strlen(errorData->stacktrace), true, csv);
if (csv)
{
write_syslogger_file_binary(""", 1, LOG_DESTINATION_STDERR);
}
}
/* EOL */
write_syslogger_file_binary(LOG_EOL, strlen(LOG_EOL), LOG_DESTINATION_STDERR);
/*
* Send alerts when needed. The alerts are sent only by the master.
* If the alert is failed for whatever reason, log a message and continue.
*/
if (errorData->fix_fields.send_alert == `t` &&
Gp_entry_postmaster && Gp_role == GP_ROLE_DISPATCH)
{
PG_TRY();
{
send_alert(errorData);
}
PG_CATCH();
{
elog(LOG,"Failed to send alert.");
}
PG_END_TRY();
}
}
方法2
另一種方式,我們也能夠使用gp_dist_random下發指令給SEGMENT節點,(通過SQL介面連線到master節點呼叫SQL即可)
《Greenplum通過gp_dist_random(`gp_id`) 在所有節點呼叫某個函式》
參考
《Greenplum segment級鎖問題排查方法 – 阿里雲HybridDB for PostgreSQL最佳實踐》
《Greenplum segment節點直接讀寫配置與效能》
《Use pgbouncer connect to GreenPlum`s segment node》
《Greenplum通過gp_dist_random(`gp_id`) 在所有節點呼叫某個函式》
《Greenplum & PostgreSQL UPSERT udf 實現 – 2 batch批量模式》
《Greenplum & PostgreSQL UPSERT udf 實現 – 1 單行模式》
相關文章
- 【Go語言探險】線上奇怪日誌問題的排查Go
- 記一次線上報錯日誌問題排查
- 排查log4j不輸出日誌到檔案的問題
- 解決ELK日誌被截斷的問題
- 排查錯誤日誌
- Nginx訪問日誌詳解——各個部分含義——非常簡單Nginx
- 解決生產日誌重複列印的問題
- Dev 日誌 | 一次 Segmentation Fault 和 GCC Illegal Instruction 編譯問題排查devSegmentationGCStruct編譯
- SVN解決不能檢視最近提交日誌的問題
- 如何提高後臺服務應用問題的排查效率?日誌 VS 遠端除錯除錯
- 解決 AS 3.1.0 版本 Logcat 合併多條日誌的問題GC
- Logstash 配置Java日誌格式的方法Java
- Laravel 框架的日誌許可權問題Laravel框架
- logback 日誌輸出格式
- Nginx日誌格式設定Nginx
- 自定義Nginx日誌格式Nginx
- 徹底解決SLF4J的日誌衝突的問題
- IDEA裡Tomcat Console日誌中文亂碼問題解決IdeaTomcat
- guava cache大量的WARN日誌的問題分析Guava
- 【Logback日誌級別】動態調整Logback的日誌級別
- java問題排查Java
- 框架問題排查框架
- Laravel Daily 日誌許可權問題LaravelAI
- 9 個技巧,解決 K8s 中的日誌輸出問題K8S
- Redis 大key(bigkey)問題的排查與解決方案Redis
- Redis日誌級別Redis
- 多執行緒 日誌 和截圖的問題執行緒
- 踩坑日誌--CEPH叢集常見問題解決辦法
- 【日誌亂碼】解決Tomcat啟動控制檯亂碼問題Tomcat
- 【YashanDB知識庫】windows配置ODBC跟蹤日誌, 使用日誌定位問題Windows
- 【MySQL日誌】MySQL日誌檔案初級管理MySql
- 一起分析Nginx 日誌及效能排查Nginx
- Oracle歸檔日誌暴增排查優化Oracle優化
- 一文讀懂DNS解析故障的排查思路,輕鬆解決解析常見問題DNS
- #解決opencv與matlab對於BMP格式讀出的畫素值不同的問題OpenCVMatlab
- Laravel 日誌檔案許可權問題Laravel
- kubernetesgraceperiod失效問題排查
- SDK與問題排查