PostgreSQL 的熱備和恢復

roninwei發表於2006-10-07
作為最強大的開源資料庫,PostgreSQL擁有一切商業資料庫所擁有的功能,甚至比商業資料庫更好。
在以前的版本中,它在資料恢復,可靠性方面做的不太好,但經過最近幾年的發展,已經可以和Oracle
媲美了。


在PostgreSQL7的時候就引入了WAL(Write Ahead Logging)的概念,即預寫日誌,所有對資料庫的更改,
在更改之前必須寫到該LOG中,這樣,就算機器斷電,PostgreSQL也可以從該LOG中知道資料庫在斷電前做
了什麼操作,已經做到第幾步了,這樣保證了所有事務的完整性,但PostgreSQL7沒有提供很好的災難恢復
機制,一旦資料庫崩潰,除非你曾經對資料庫作過pg_dump或者file system level backup,否則,你的資料
將全部丟失,並且,就算你曾經對資料庫做過備份,也只能恢復到你備份的那一刻的資料,這對一個生產資料庫
(特別是24*7生產庫)來說,是無法容忍的。

PostgreSQL8的推出,使PostgreSQL的穩定性和可靠性又邁出了劃時代的一步。
除了提供對tablespace的支援外,PostgreSQL8提供了支援時間點的恢復---PITR.
其基本原理和Oracle的熱備份完全一樣:

首先,對資料庫在file system level做一個backup(PostgreSQL是首先用pg_start_backup('label')命令,
然後用tar直接tar整個data目錄,假設命名為base.tar,然後pg_stop_backup();結束熱備。
Oracle首先是用alter tablespace xxx begin backup,然後直接cp資料檔案);

然後,備份相關的配置檔案(PostgreSQL只需備份postgresql.conf,pg_hba.conf,pg_ident.conf就可以了,其實,
前面的tar已經將這些檔案備份了,Oracle需要alter database backup control file......);

最後,備份WAL(
可以設定postgresql.conf中的archive_command,
該命令可以讓PostgreSQL8自動將需要的歸檔的日誌檔案備份的其他地方中。
但是注意:如果你是讓PostgreSQL8呼叫archive_command來備份WAL的話,
可能根本就做不到PITR,我做過實驗,如果依靠base.tar和archive_command產生的WAL其實只能恢復到最後一個
archive_command儲存的WAL的資料,pg_xlog/下面可能還有資料,如果PostgreSQL8的資料目錄徹底損壞的話,還是會
丟失資料,所以,我建議,在寫資料備份指令碼的時候,最好將pg_xlog/下面的WAL也一起備份,見下面的cpArch.sh。
)。

如果資料庫崩潰,我們就可以使用熱備產生的base.tar和archive_command產生的WAL和我們自己備份的WAL(pg_xlog)來進行資料庫的
recovery.

下面舉例來說明:
我的PostgreSQL執行在:/home/pgsql/下面
資料目錄在:/home/pgsql/database/
將熱備資料檔案備份到/disk3/PostgreSQL/base/下面
將WAL備份到/disk3/PostgreSQL/archives/下面

postgresql.conf中定義瞭如下的archive_command:
archive_command = 'cp -f %p /disk3/PostgreSQL/archives/%f'
該命令會將PostgreSQL產生的WAL cp到/disk3/PostgreSQL/archives/中。

我的熱備指令碼如下:
(1)為了使丟失的資料在一分鐘之內,在crontab中每分鐘將pg_xlog/下面的WAL
backup到/disk3/PostgreSQL/archives/。

crontab:
*/1 * * * * /home/pgsql/bin/cpArch.sh

cpArch.sh:
#!/bin/sh

cp -f /home/pgsql/database/pg_xlog/[0-9]* /disk3/PostgreSQL/archives/


(2)編寫熱備指令碼hotBackup.pl(我用perl):
#!/usr/bin/perl

#############################################################
# hotBackup.pl
# Use to hot backup the PostgreSQL database.
# Author:Seamus Dean
# Date:2005-04-11
##############################################################

my($datadir) ="/home/pgsql/database";
my($bindir) ="/home/pgsql/bin";
my($backupdir) ="/disk3/PostgreSQL/base";
my($receiver) ="ljh13@sina.com.cn";

sub begin_backup()
{
open(PSQL,"|$bindir/psql") or mail_user("begin backup error.") && exit(100);
print PSQL "select pg_start_backup('backupnow');n";
close(PSQL);

}

sub end_backup()
{
open(PSQL,"|$bindir/psql") or mail_user("end backup error.") && exit(100);
print PSQL "select pg_end_backup();n";
close(PSQL);
}


sub do_backup()
{
system("/bin/tar cvf base.tar $datadir");
system("/bin/mv -f base.tar $backupdir/");
}


sub mail_user()
{
my($msg) =@_;
open(MAIL,"|/bin/mail -s backup-result $receiver") or die("can not talk to:mail command.n");
print MAIL $msg;
close(MAIL);
}

###################################
# tell psql begin our backup
###################################
&begin_backup();

###################################
# do tar
###################################
&do_backup();

####################################
# tell psql end backup
####################################
&end_backup();

####################################
# mail the user about the result
####################################
&mail_user("PostgreSQL backup successfully.");


到這裡,備份指令碼基本上就完了,你可以將hotBackup.pl放在crontab中週期性的執行。

就算/home/pgsql/database目錄徹底崩潰,我們可以像下面這樣迅速恢復到1分鐘內的資料:
#cp /disk3/PostgreSQL/base/base.tar ./
#tar xvf base.tar
#cd database/
#vi recovery.conf
輸入如下內容:
restore_command='cp /disk3/PostgreSQL/archives/%f "%p"'
然後將/home/pgsql/database/pg_xlog/下面的WAL清空。
啟動PostgreSQL,我們可以看到如下的LOG資訊:
LOG: could not create IPv6 socket: Address family not supported by protocol
LOG: database system was interrupted at 2005-04-11 23:13:28 PDT
LOG: starting archive recovery
LOG: restore_command = "cp /disk3/PostgreSQL/archives/%f "%p""
cp: cannot stat `/disk3/PostgreSQL/archives/00000001.history': No such file or directory
LOG: restored log file "00000001000000000000002E.008EFCAC.backup" from archive
LOG: restored log file "00000001000000000000002E" from archive
LOG: checkpoint record is at 0/2E8EFCAC
LOG: redo record is at 0/2E8EFCAC; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 5271; next OID: 6351357
LOG: automatic recovery in progress
LOG: redo starts at 0/2E8EFCE8
LOG: restored log file "00000001000000000000002F" from archive
LOG: restored log file "000000010000000000000030" from archive
LOG: restored log file "000000010000000000000031" from archive
LOG: restored log file "000000010000000000000032" from archive
LOG: restored log file "000000010000000000000033" from archive
LOG: restored log file "000000010000000000000034" from archive
LOG: restored log file "000000010000000000000035" from archive
LOG: restored log file "000000010000000000000036" from archive
LOG: restored log file "000000010000000000000037" from archive
LOG: restored log file "000000010000000000000038" from archive
LOG: restored log file "000000010000000000000039" from archive
LOG: restored log file "00000001000000000000003A" from archive
LOG: restored log file "00000001000000000000003B" from archive
LOG: restored log file "00000001000000000000003C" from archive
LOG: restored log file "00000001000000000000003D" from archive
LOG: restored log file "00000001000000000000003E" from archive
LOG: restored log file "00000001000000000000003F" from archive
LOG: restored log file "000000010000000000000040" from archive
LOG: restored log file "000000010000000000000041" from archive
LOG: restored log file "000000010000000000000042" from archive
LOG: restored log file "000000010000000000000043" from archive
LOG: restored log file "000000010000000000000044" from archive
LOG: restored log file "000000010000000000000045" from archive
LOG: restored log file "000000010000000000000046" from archive
LOG: restored log file "000000010000000000000047" from archive
LOG: restored log file "000000010000000000000048" from archive
LOG: restored log file "000000010000000000000049" from archive
LOG: restored log file "00000001000000000000004A" from archive
LOG: restored log file "00000001000000000000004B" from archive
LOG: restored log file "00000001000000000000004C" from archive
LOG: record with zero length at 0/4C2BABE4
LOG: redo done at 0/4C2BABA8
LOG: restored log file "00000001000000000000004C" from archive
LOG: archive recovery complete
LOG: database system is ready


顯示資料已經成功恢復。
/home/pgsql/database/下面的recovery.conf會變為:recovery.done.

結論:
PostgreSQL8的PITR已經做得非常的成功,完全有可能替代Oracle,Sqlserver
而成為企業的首選。所以,我們玩PostgreSQL的兄弟們,一定要對它有信心![@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7507082/viewspace-870723/,如需轉載,請註明出處,否則將追究法律責任。

相關文章