MogDB/openGauss誤刪未歸檔的xlog日誌如何解決
在使用MogDB/openGauss資料庫的過程中,有時候大量業務,或者導資料會導致pg_xlog下的日誌數量持續增長,此時如果xlog的產生頻率太快,而來不及自動清理,極有可能造成pg_xlog目錄的打滿。如果對資料庫的xlog不太瞭解的時候,可能造成誤刪未歸檔的xlog日誌,或者更嚴重地,把對應操作還未寫入資料檔案的xlog也刪除了。
本文將講解了通常情況下pg_xlog下的xlog檔案所處狀態,並總結了資料已經落盤但未進行歸檔的xlog日誌被誤刪時,日誌週期產生缺失xlog日誌報錯和歸檔失敗問題的幾種解決方法。
一、pg_xlog下xlog檔案的狀態
通常情況下我們是不建議手動刪除pg_xlog下的日誌的,因為pg_xlog下的xlog有自動清理機制,可以根據需求配置引數調整清理速度。
而正常情況下,pg_xlog下應該存在如下的三種狀態的xlog檔案,在開啟歸檔的情況下,可以進行相關討論:
第一種:對應資料已經落盤,已經進行完歸檔。pg_xlog/archive_status中的狀態為.done
可以手動刪除,對資料庫無影響,但是不建議手動刪除,因為pg_xlog下的xlog有自動清理機制,可以根據需求配置引數調整清理速度
第二種:對應資料已經落盤,未進行歸檔。pg_xlog/archive_status中的狀態為.ready
資料已落盤,但是未歸檔,刪除pg_xlog下的xlog後,對當前的資料庫裡的資料無影響,但是如果想基於全量備份和連續的歸檔日誌做PITR,則會缺少日誌,而且歸檔會因為缺失被刪除的這部分xlog而失敗,後續歸檔都不成功,從而阻塞pg_xlog下xlog日誌的正常的自動清理,資料庫會列印相關報錯:
DETAIL: The failed archive command was: "cp pg_xlog/000000010000000200000071 /data/om3/data/archivedir/000000010000000200000071 " cp: cannot stat 'pg_xlog/000000010000000200000071': No such file or directory
第三種:對應資料未落盤,未進行歸檔
剛寫完xlog,但是資料還未落盤,此時刪除xlog可能會丟資料,而且資料庫可能服務出現問題,資料庫無法啟動,可能需要使用pg_resetxlog工具清理xlog,並重置pg_control檔案中的一些其他控制資訊,來保證資料庫正常啟動。pg_resetxlog將作為資料庫修復的最後手段使用。而且修復而啟動資料庫後,可能會由於部分提交的事務,導致資料庫和之前的資料不一致的情況。
二、處於第二種時,誤刪未歸檔的xlog日誌報錯如何解決
本篇測試內容使用的主要歸檔引數是archive_mode和archive_command。資料庫版本是MogDB-3.0.5。
MogDB=# show archive_mode ; archive_mode -------------- on (1 row) MogDB=# show archive_command ; archive_command ------------------------------------- cp %p /data/om3/data/archivedir/%f (1 row) MogDB=# show archive_dest ; archive_dest -------------- (1 row)
1、臨時調整archive_command
如果是使用archive_command這個引數決定歸檔行為的時候,可以從archive_command命令下手,修改這個歸檔命令,騙過資料庫說歸檔成功了。
如下環境已經模擬出了誤刪未歸檔的xlog的現象
om3@lmt0003 archive_status]$ rm ../000000010000000200000074 [om3@lmt0003 archive_status]$ cat /data/om3/log/pg_log/dn_6001/postgresql-2023-11-01_115121.log | grep 000000010000000200000074|more cp: cannot create regular file '/data/om3/data/archivedir/1/000000010000000200000074': No such file or directory2023-11-01 14:55:21.521 [unk nown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 14:55:21.521 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xl og/000000010000000200000074 /data/om3/data/archivedir/1/000000010000000200000074 " cp: cannot create regular file '/data/om3/data/archivedir/1/000000010000000200000074': No such file or directory2023-11-01 14:55:22.527 [unk nown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 14:55:22.527 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xl og/000000010000000200000074 /data/om3/data/archivedir/1/000000010000000200000074 " cp: cannot create regular file '/data/om3/data/archivedir/1/000000010000000200000074': No such file or directory2023-11-01 14:55:23.532 [unk nown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 14:55:23.532 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xl og/000000010000000200000074 /data/om3/data/archivedir/1/000000010000000200000074 " 2023-11-01 14:55:23.532 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] WARNING: xlog file "000000010000000200000074" cou ld not be archived: too many failures [om3@lmt0003 pg_xlog]$ cd archive_status/ [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.ready [om3@lmt0003 archive_status]$ gsql -r gsql ((MogDB 3.0.5 build 76182eb6) compiled at 2023-07-20 16:53:13 commit 0 last mr 1801 ) Non-SSL connection (SSL connection is recommended when requiring high-security) Type "help" for help. MogDB=# select pg_switchover_xlog(); MogDB=# select pg_switch_xlog(); pg_switch_xlog ---------------- 2/750019D8 (1 row) MogDB=# \q [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.ready 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.ready
1、修改postgresql.conf
[om3@lmt0003 archive_status]$ vi ../../postgresql.conf archive_mode = on #archive_command = 'cp %p /data/om3/data/archivedir/%f ' archive_command = 'ls -l /data/om3/data/ ' #別的命令也可以,只要執行的時候不報錯就可以。達到騙過資料庫的目的就可以。
2.重新整理配置
[om3@lmt0003 archive_status]$ gs_ctl reload
3.不產生error日誌,並且archive_status的狀態變為done
[om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.ready 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.ready [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.ready 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.ready [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.done -----歸檔的報錯之前大概每一分鐘列印一次,每次列印多行。 [om3@lmt0003 archive_status]$ cat /data/om3/log/pg_log/dn_6001/postgresql-2023-11-01_115121.log | grep 000000010000000200000074|tail -n 5 cp: cannot stat 'pg_xlog/000000010000000200000074': No such file or directory2023-11-01 15:00:17.297 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 15:00:17.297 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xlog/000000010000000200000074 /data/om3/data/archivedir/000000010000000200000074 " cp: cannot stat 'pg_xlog/000000010000000200000074': No such file or directory2023-11-01 15:00:18.302 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 15:00:18.302 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xlog/000000010000000200000074 /data/om3/data/archivedir/000000010000000200000074 " 2023-11-01 15:00:18.302 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] WARNING: xlog file "000000010000000200000074" could not be archived: too many failures [om3@lmt0003 archive_status]$ date Wed Nov 1 15:13:43 CST 2023
4、修改postgresql.conf為正常。
archive_mode = on archive_command = 'cp %p /data/om3/data/archivedir/%f ' #archive_command = 'ls -l /data/om3/data/ '
然後重新整理配置。這樣就一切恢復正常了。只是缺少了刪除的這部分xlog以及欺騙資料庫歸檔命令期間的xlog,引數調整回來的後續日誌可以繼續歸檔。也解決了持續產生日誌報錯的問題。
[om3@lmt0003 archive_status]$ gs_ctl reload
2、修改archive_status目錄下誤刪的xlog對應的xxx.ready狀態檔案
如下環境已經模擬出了誤刪未歸檔的xlog的現象
om3@lmt0003 pg_xlog]$ rm 000000010000000200000071
日誌出現相關報錯
並且後續的xlog日誌
檢視日誌列印頻率,每一分鐘列印一次,一次列印多行
手動將archive_status下日誌提示的缺少的xlog對應的狀態檔案的xxx.ready改成xxx.done
om3@lmt0003 archive_status]$ cp 000000010000000200000071.ready 000000010000000200000071.ready_bak om3@lmt0003 archive_status]$ mv 000000010000000200000071.ready 000000010000000200000071.done
日誌不再報錯,除了丟失的xlog外,後續日誌可以正常進行歸檔。
3.刪除archive_status目錄下誤刪的xlog對應的xxx.ready狀態檔案
模擬誤刪未歸檔的xlog的現象
[om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.done [om3@lmt0003 archive_status]$ gsql -r gsql ((MogDB 3.0.5 build 76182eb6) compiled at 2023-07-20 16:53:13 commit 0 last mr 1801 ) Non-SSL connection (SSL connection is recommended when requiring high-security) Type "help" for help. MogDB=# select pg_switch_xlog(); pg_switch_xlog ---------------- 2/7600BAC0 (1 row) MogDB=# \q [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000071.done 000000010000000200000073.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000072.done 000000010000000200000074.done 000000010000000200000076.ready [om3@lmt0003 archive_status]$ rm ../000000010000000200000076 [om3@lmt0003 archive_status]$ cat /data/om3/log/pg_log/dn_6001/postgresql-2023-11-01_115121.log | grep 000000010000000200000076|tail -n 5cp: cannot create regular file '/data/om3/data/archivedir/1/000000010000000200000076': No such file or directory2023-11-01 15:25:37.642 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 15:25:37.642 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xlog/000000010000000200000076 /data/om3/data/archivedir/1/000000010000000200000076 " cp: cannot create regular file '/data/om3/data/archivedir/1/000000010000000200000076': No such file or directory2023-11-01 15:25:38.647 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] LOG: archive command failed with exit code 1 2023-11-01 15:25:38.647 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] DETAIL: The failed archive command was: "cp pg_xlog/000000010000000200000076 /data/om3/data/archivedir/1/000000010000000200000076 " 2023-11-01 15:25:38.647 [unknown] [unknown] localhost 70393223549024 0[0:0#0] 0 [BACKEND] WARNING: xlog file "000000010000000200000076" could not be archived: too many failures MogDB=# select pg_switch_xlog(); pg_switch_xlog ---------------- 2/77001AC8 (1 row) MogDB=# \q [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000072.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000073.done 000000010000000200000076.ready 000000010000000200000071.done 000000010000000200000074.done 000000010000000200000077.ready
在pg_xlog/archive_status下刪除缺失的xlog對應的xxx.ready的狀態檔案
00000001000000020000006F.done 000000010000000200000072.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000073.done 000000010000000200000076.ready 000000010000000200000071.done 000000010000000200000074.done 000000010000000200000077.ready [om3@lmt0003 archive_status]$ mv 000000010000000200000076.ready 000000010000000200000076.ready_bak [om3@lmt0003 archive_status]$ rm -rf 000000010000000200000076.ready [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000072.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000073.done 000000010000000200000076.ready_bak 000000010000000200000071.done 000000010000000200000074.done 000000010000000200000077.ready
發現日誌已經不再報缺失xlog以及歸檔失敗的error了,而且後續pg_xlog下的xlog日誌可以正常進行歸檔。
[om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000072.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000073.done 000000010000000200000076.ready_bak 000000010000000200000071.done 000000010000000200000074.done 000000010000000200000077.ready [om3@lmt0003 archive_status]$ tail -f /data/om3/log/pg_log/dn_6001/postgresql-2023-11-01_115121.log | grep 000000010000000200000076^C [om3@lmt0003 archive_status]$ ls 00000001000000020000006F.done 000000010000000200000072.done 000000010000000200000075.done 000000010000000200000070.done 000000010000000200000073.done 000000010000000200000076.ready_bak 000000010000000200000071.done 000000010000000200000074.done 000000010000000200000077.done [om3@lmt0003 archive_status]$ cd ../../archivedir/ [om3@lmt0003 archivedir]$ ls 000000010000000200000077 000000010000000200000077
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/69990629/viewspace-2993186/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- PostgreSQL的xlog/Wal歸檔及日誌清理SQL
- 手工rm刪除歸檔日誌對備份歸檔日誌的影響
- Oralce資料庫關閉歸檔日誌並且刪除歸檔日誌資料庫
- oracle 刪除過期的歸檔日誌Oracle
- RMAN刪除歸檔日誌出現RMAN-0813錯誤的處理
- 【Oracle】歸檔日誌管理-設定歸檔日誌路徑以及歸檔日誌冗餘Oracle
- [20221121]rman刪除歸檔日誌問題.txt
- 歸檔日誌挖掘
- PostgreSQL 歸檔日誌SQL
- oracle歸檔日誌Oracle
- 在Oracle中,如何定時刪除歸檔日誌檔案?Oracle
- openGauss/MogDB的uncommitted xmin問題解決MIT
- 歸檔oracle alert日誌Oracle
- 14. 日誌歸檔
- Oracle歸檔日誌清理Oracle
- 配置rman來自動刪除應用過的歸檔日誌
- 對歸檔模式下CLEAR 未歸檔日誌後恢復資料庫的一點看法模式資料庫
- 錯誤碼[-718]:收集到的歸檔日誌不連續
- 批量註冊歸檔日誌
- 通過RMAN設定standby接收日誌後主庫歸檔日誌才可刪除
- openGauss/MogDB的TPCH測試
- 達夢資料庫DM8之刪除歸檔日誌檔案資料庫
- 在ASM磁碟組中刪除歸檔日誌報ORA-15028ASM
- PostgreSQL如何刪除不使用的xlog檔案SQL
- DG歸檔日誌缺失恢復
- ArgoWorkflow教程(四)---Workflow & 日誌歸檔Go
- logminer異機挖掘歸檔日誌
- MogDB openGauss故障排查流程
- MogDB/openGauss 生態工具-MTK對glibc版本的解決
- 【REDO】Oracle 日誌挖掘,分析歸檔日誌線上日誌主要步驟Oracle
- DG_保證歸檔日誌不能隨意被刪除的四種方法
- git恢復誤刪未提交的檔案Git
- Oracle歸檔日誌所在目錄時間不對&&Oracle叢集日誌時間顯示錯誤Oracle
- rman 還原歸檔日誌(restore archivelogRESTHive
- oracle11G歸檔日誌管理Oracle
- oracle rman備份歸檔日誌需要先切換日誌嗎Oracle
- openGauss/MOGDB與PG等待事件事件
- 測試在丟失歸檔日誌的情況下,跳過部分歸檔日誌進行資料恢復資料恢復