postgresql物理備份工具pg_rman的使用詳解

lusklusklusk發表於2020-01-17

pg_rman是一個開源的PostgreSQL備份軟體,使用的是pg_start_backup(), copy, pg_stop_backup()的備份模式,因為是檔案複製模式,所以pg_rman必須在資料庫節點上執行

pg_rman的使用方法說明


pg_rman的下載地址


OS是CentOS Linux release 7.6,postgresql版本是PostgreSQL 11.3,則pg_rman的下載版本可以1.3.8或1.3.9
/download/V1.3.8/pg_rman-1.3.8-1.pg11.rhel7.x86_64.rpm
/download/V1.3.9/pg_rman-1.3.9-1.pg11.rhel7.x86_64.rpm

pg_rman的安裝
[root@FRSPGSQLDEV2 ~]# rpm -ivh pg_rman-1.3.9-1.pg11.rhel7.x86_64.rpm
安裝好後,pg_rman會生成到/usr/pgsql-11/bin/目錄下,我們直接配置/etc/profile就可以使用pg_rman了




pg_rman的使用方法
1、初始化,實際上就是需要一個目錄,這個目錄將用於存放備份的檔案
-bash-4.2$ pg_rman init -B /pgdata/backup
WARNING: ARCLOG_PATH is not set yet
DETAIL: The archive_command is not set in postgresql.conf.
HINT: Please set ARCLOG_PATH in pg_rman.ini or environmental variable.
INFO: SRVLOG_PATH is set to '/pgdata/log'

配置歸檔,主要是修改postgresql.conf中的以下三個引數,以下/XX就是歸檔目錄
wal_level=replica
archive_mode =on
archive_command ='cp %p /XX/%f'
配置ARCLOG_PATH,就是修改備份目的地目錄/pgdata/backup中的pg_rman.ini檔案的ARCLOG_PATH引數,這個值就是歸檔目錄/XX
ARCLOG_PATH=/XX

2、全量備份(會備份資料檔案和歸檔的日誌檔案)
-bash-4.2$ pg_rman backup -b full -B /pgdata/backup

3、驗證備份
-bash-4.2$ pg_rman validate -B /pgdata/backup
備註:如果好幾個備份都沒有執行驗證的話,執行一次以上命令就會都驗證這些沒有驗證的備份
      沒有驗證的話,pg_rman show也能看到這個沒有驗證的備份,但是狀態顯示DONE而非OK

4、檢視備份
-bash-4.2$ pg_rman show -B /pgdata/backup

5、增量備份(會備份資料檔案和歸檔的日誌檔案)
-bash-4.2$ pg_rman backup -b incremental  -B /pgdata/backup

6、備份歸檔日誌
-bash-4.2$ pg_rman backup -b archive  -B /pgdata/backup

7、刪除備份,後面必須接具體的時間,否則會報錯ERROR: delete range option not specified  HINT: Please run with 'pg_rman delete DATE'.
-bash-4.2$ pg_rman delete -B /pgdata/backup "2020-01-05 23:13:19"
--例如我只需要我的備份集能恢復到"2020-01-05 23:13:19",只會保留一份在"2020-01-05 23:13:19"之前的最新備份,其他的備份都會刪除。
--以上刪除後pg_rman show看不到這些備份,每次備份的物理目錄比如/pgdata/backup/20200115還是存在的,但是/pgdata/backup/20200115下面的子目錄arclog、database、srvlog被物理刪除了
--當然也可以採用備份策略來自動刪除,這個自動刪除的動作發生在每次備份的時候,以下配置可以寫在pg_rman.ini檔案中
KEEP_DATA_GENERATIONS = 3   -- 備份集冗餘度是3,比如做了3次全備份,做第4次全備份的時候會自動把第1次的備份從catalog裡面刪除,第四次備份的時候會出現這樣的資訊INFO: delete the backup with start time:"第1次備份的時間點"
KEEP_DATA_DAYS = 10         -- 備份集保留日期10天

8、刪除每次物理備份的目錄,基於上面7的基礎,即只有執行了delete的才會被purge掉
-bash-4.2$ pg_rman purge -B /pgdata/backup
--delete的時候保留了每次備份的物理目錄,執行purge後會把delete殘留下來的物理目錄比如/pgdata/backup/20200115刪除了
--沒有執行步驟7的delete,直接執行步驟8的purge不會刪除任何東西

9、恢復
-bash-4.2$ pg_rman restore -B /pgdata/backup --recovery-target-time "2020-01-15 22:11:39" --hard-copy
--如果不指定recovery-target-time,則恢復到最新時間
--如果不指定hard-copy,則歸檔日誌目錄裡的歸檔日誌是使用的硬連線指向備份目錄中的歸檔日誌,加了這個引數的話,則是直接把備份目錄中的歸檔日誌複製到歸檔日誌目錄

恢復的一些備註:
9.1、恢復好後,如果遇到資料庫啟動後變成了只讀模式ERROR:  cannot execute XX in a read-only transaction,請檢查資料目錄下的recovery.conf檔案裡的資訊,把它改名或刪除再重新啟動
9.2、正常情況資料目錄沒有recovery.conf檔案的,備份目錄裡也沒有recovery.conf檔案的,這個檔案是每次執行恢復後自動生成的,存放在資料目錄中,可以刪除或改名。
9.3、恢復時,可以選擇原地恢復(覆蓋式),也可以使用新的$PGDATA作為恢復目標。原地恢復(覆蓋式)時pg_rman會覆蓋原有的資料檔案,arch, pg_wal目錄中的檔案,所以,如果你要保留原資料,建議先將原資料目錄重新命名。
9.4、恢復除了recovery-target-time和hard-copy兩個引數外,還有recovery-target-timeline和recovery-target-xid兩個引數
--recovery-target-timeline TIMELINE  
如果不指定時間線,則使用$PGDATA/global/pg_control,如果沒有$PGDATA/global/pg_control,則使用最新的全量備份集的時間線。
Specifies recovering into a particular timeline. If not specified, the current timeline from ($PGDATA/global/pg_control) is used.  
-bash-4.2$ pg_rman restore -B /home/backup
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1

--recovery-target-xid XID
如果不指定,則恢復到最新xid
This parameter specifies the transaction ID up to which recovery will proceed. If not specified, continue recovery to the latest xid.  



實踐發現的問題
備份目錄一定不能是資料目錄的子目錄,否則會發現備份越來越大,incremental都比full備份大,原因就是因為資料目錄是/pgdata,而備份目錄是/pgdata/backup,如此一來,每次備份都會把/pgdata下面的目錄都備份一遍,沒錯,也就是說每次備份都會把上次的備份再備份一遍






備份恢復案例
資料目錄是/pgdata,備份目錄是/home/backup

1、全備份(時間2020-01-15 21:53:48,database有test1,schema有s1,表有public.table1)
postgres=# create database test1;
postgres=# \c test1
test1=# create schema s1;
test1=# create table table1 (hid int);
test1=# insert into table1 values(1);

-bash-4.2$ pg_rman backup -b full -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 21:53:48" backup and archive log files by CRC
INFO: backup "2020-01-15 21:53:48" is valid

2、增量備份(時間2020-01-15 22:11:39,database有test1,schema有s2,表有s2.t1)
postgres=# \c test1
test1=# create table table2(hid int);
test1=# create schema s2;
test1=# create table s2.t1(hid int);

-bash-4.2$ pg_rman backup -b incremental -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 22:11:39" backup and archive log files by CRC
INFO: backup "2020-01-15 22:11:39" is valid

3、全量備份(時間2020-01-15 22:21:48,database有test2,schema有s3,表有public.t3和s3.s3)
postgres=# create database test2;
postgres=# \c test2
test2=# create schema s3;
test2=# create table t3(hid int);
test2=# create table s3.s3(hid int);
test2=# insert into t3 values(1);
test2=# insert into s3.s3 values(1);

-bash-4.2$ pg_rman backup -b full -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 22:21:48" backup and archive log files by CRC
INFO: backup "2020-01-15 22:21:48" is valid


4、停止服務,刪除目錄/pgdata
[root@FRSPGSQLDEV2 ~]pkill -9 postgres

5、恢復,不加任何引數
-bash-4.2$ pg_rman restore -B /home/backup
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 22:21:48"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 22:21:48" backup and archive log files by SIZE
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring database files from the full mode backup "2020-01-15 22:21:48"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--經檢測,恢復到最近的資料

6、停止服務,刪除目錄/pgdata,恢復到2020-01-15 22:11:39
-bash-4.2$ pg_rman restore -B /home/backup --recovery-target-time "2020-01-15 22:11:39"
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 21:53:48"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 21:53:48" backup and archive log files by SIZE
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring database files from the full mode backup "2020-01-15 21:53:48"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring WAL files from backup "2020-01-15 21:53:48"
INFO: backup "2020-01-15 22:11:39" is valid
INFO: restoring WAL files from backup "2020-01-15 22:11:39"
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--經檢測,恢復到了2020-01-15 22:11:39,即增量備份的時刻點,有test1庫,沒有test2庫,test1庫有s1和s2兩個schema,s2.t1存在


7、停止服務,刪除目錄/pgdata,恢復到2020-01-15 22:11:39
-bash-4.2$ pg_rman restore -B /home/backup --recovery-target-time "2020-01-15 21:53:48"
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 02:04:57"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 02:04:57" backup and archive log files by SIZE
INFO: backup "2020-01-15 02:04:57" is valid
INFO: restoring database files from the full mode backup "2020-01-15 02:04:57"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 02:04:57" is valid
INFO: restoring WAL files from backup "2020-01-15 02:04:57"
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring WAL files from backup "2020-01-15 21:53:48"
INFO: backup "2020-01-15 22:11:39" is valid
INFO: restoring WAL files from backup "2020-01-15 22:11:39"
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--經檢測,恢復到了2020-01-15 22:11:39,即第一次備份時刻點,有test1庫,沒有test2庫,test1庫有s1但是沒有s2這個schema

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30126024/viewspace-2673866/,如需轉載,請註明出處,否則將追究法律責任。

相關文章