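Problem description:

On 2010-06-17 an abnormal data operation hit one of our systems: more than 2 million rows of a single table were updated, which was a logical error. The maintenance staff and the vendor asked to examine the archived logs from 2010-06-17 so that log analysis could confirm whether this change had actually been made. The request came in on 2010-06-22, nearly a week after the event, so the database's archived logs had to be restored from the tape library on the NetBackup server to local disk before they could be analyzed.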

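The restore failed, and the logs could not be restored to local disk. The operations and symptoms were as follows:

RMAN shows that the archived logs for 2010-06-17 cover log sequence numbers (SEQUENCE) 17018 through 17105: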

RMAN>LIST BACKUP OF ARCHIVELOG ALL;

1 17017 10123519578102 16-JUN-10 10123540590941 17-JUN-10

1 17018 10123540590941 17-JUN-10 10123541270660 17-JUN-10

…...

1 17104 10125218669876 17-JUN-10 10125219611361 17-JUN-10

1 17105 10125219611361 17-JUN-10 10125220525468 18-JUN-10

1 17106 10125220525468 18-JUN-10 10125222513280 18-JUN-10

…….
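The sequence range can also be read off the listing mechanically. The following is a minimal sketch, not the procedure used in this incident. It assumes the rows have exactly the six columns shown above (thread, sequence, low SCN, low time, next SCN, next time) and that the date format matches the session's NLS_DATE_FORMAT. The function name seq_range is hypothetical.

```shell
# seq_range DATE: read RMAN "LIST BACKUP OF ARCHIVELOG" rows on stdin and
# print the lowest and highest sequence whose low time (column 4) equals DATE.
# Rows that do not have exactly six fields (headers, separators) are skipped.
seq_range() {
    awk -v d="$1" '
        NF == 6 && $4 == d {
            if (min == "" || $2 + 0 < min) min = $2 + 0
            if ($2 + 0 > max) max = $2 + 0
        }
        END { if (min != "") print min, max }
    '
}
```

With the RMAN listing saved to a file, `seq_range 17-JUN-10 < listing.txt` prints the two bounds to use in RESTORE ARCHIVELOG SEQUENCE BETWEEN. The file name listing.txt is only an example.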

RMAN>run{

ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';

SET ARCHIVELOG DESTINATION TO '/archivelog';

RESTORE ARCHIVELOG SEQUENCE BETWEEN 17018 and 17105;

release channel ch00;}

Output:

channel ch00: starting archive log restore to user-specified destination

archive log destination=/archivelog

channel ch00: restoring archive log

archive log thread=1 sequence=17018

channel ch00: restoring archive log

......

channel ch00: restoring archive log

archive log thread=1 sequence=17035

channel ch00: reading from backup piece al_1029_1_721945418

ORA-19870: error reading backup piece al_1029_1_721945418

ORA-19507: failed to retrieve sequential file, handle="al_1029_1_721945418", parms=""

ORA-27029: skgfrtrv: sbtrestore returned error

ORA-19511: Error received from media manager layer, error text:

Failed to open backup file for restore.

channel ch00: starting archive log restore to user-specified destination

archive log destination=/archivelog

......

released channel: ch00

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of restore command at 06/23/2010 08:45:36

RMAN-06026: some targets not found - aborting restore

RMAN-06025: no backup of log thread 1 seq 17105 lowscn 10125219611361 found to restore

RMAN-06025: no backup of log thread 1 seq 17104 lowscn 10125218669876 found to restore

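......

At this point we suspected a problem with the SBT_TAPE channel, so we ran a backup of the current day's archived logs. The backup completed normally, which suggested the media channel itself was fine, yet an immediate restore of that same backup failed with the same errors described above.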
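When the error stack is long, it helps to pull out just the sequences that RMAN gave up on. The sketch below assumes the RMAN-06025 lines have the same wording as those above. The function name missing_backups is hypothetical. In this incident the backup pieces did exist on tape, so the list shows what RMAN could not read, not what was never backed up.

```shell
# missing_backups: read RMAN output on stdin and print, in ascending order,
# every "thread sequence" pair named on an RMAN-06025 line.
missing_backups() {
    awk '
        /^RMAN-06025:/ {
            t = ""; s = ""
            for (i = 1; i < NF; i++) {
                if ($i == "thread") t = $(i + 1)
                if ($i == "seq")    s = $(i + 1)
            }
            if (s != "") print t, s
        }
    ' | sort -k1,1n -k2,2n
}
```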

We then opened the NetBackup logs on the client, which read as follows:

$cd /usr/openv/netbackup/logs/user_ops/dbext/logs

$ls -lt

-rw-r--r-- 1 oracle oinstall 153 Jun 23 09:44 12444.0.1277257483

-rw-r--r-- 1 oracle oinstall 153 Jun 23 09:28 8231.0.1277256490

-rw-r--r-- 1 oracle oinstall 153 Jun 23 09:24 8231.0.1277256245

-rw-r--r-- 1 oracle oinstall 1348 Jun 23 09:22 8231.0.1277256110

$more 12444.0.1277257483

Restore started Wed Jun 23 09:44:43 2010

09:44:58 client pdgis-ora peername PDGIS-ORA is invalid for restore request

09:44:59 INF - Server status = 37
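Several of the progress files in that directory are the same small size, so it can be quicker to scan them all for the rejection message than to page through each one. This is a minimal sketch that assumes the message wording shown above. The function name rejected_restores is hypothetical.

```shell
# rejected_restores DIR: for every regular file in DIR, print
# "file client peername" for each line that reports a rejected restore request.
rejected_restores() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        awk -v f="$f" '
            /is invalid for restore request/ {
                c = ""; p = ""
                for (i = 1; i < NF; i++) {
                    if ($i == "client")   c = $(i + 1)
                    if ($i == "peername") p = $(i + 1)
                }
                print f, c, p
            }
        ' "$f"
    done
}
```

Run as `rejected_restores /usr/openv/netbackup/logs/user_ops/dbext/logs`, it shows at a glance which client name and which peername the master server saw for each failed request.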

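We also wondered whether driving the NetBackup device directly from RMAN had left some parameters unset. We copied NetBackup's restore script, modified it into a new restore script, and ran the edited script, but the problem persisted.

Problem analysis

1. We first looked for a fix for ORA-19511 and found the following article through Google: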

http://space.itpub.net/7199859/viewspace-631043

The article says:

Searches on Metalink and Google suggest checking the following:

On a Windows master server, run the command:

VERITAS\NetBackup\bin\admincmd\bpgetconfig

DISALLOW_CLIENT_LIST_RESTORE = YES

DISALLOW_CLIENT_RESTORE = YES

The suggestion is to change these to:

DISALLOW_CLIENT_LIST_RESTORE = NO

DISALLOW_CLIENT_RESTORE = NO

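Alternatively, in the NetBackup administration software, edit the master server properties and select Client Attributes,

then tick Allow client browse and Allow client restore.

After that, restart the NetBackup services; restore archivelog then runs normally.

Source: <http://space.itpub.net/7199859/viewspace-631043>

On our NetBackup server, however, Allow client browse and Allow client restore were already ticked for this client, so this did not solve the problem.

2. We then searched for ORA-27029: skgfrtrv: sbtrestore returned error, found an article on it,

and ran the sbttest command to test whether the channel worked. The result was as follows: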

[oracle@pdgisdb bin]$ sbttest sbt_tape

The sbt function pointers are loaded from libobk.so library.

-- sbtinit succeeded

-- sbtinit (2nd time) succeeded

sbtinit: Media manager supports SBT API version 2.0

sbtinit: Media manager is version 5.0.0.0

sbtinit: vendor description string=Veritas NetBackup for Oracle - Release 6.5 (2007072323)

sbtinit: allocated sbt context area of 8 bytes

sbtinit: proxy copy is supported

-- sbtinit2 succeeded

-- regular_backup_restore starts ................................

-- sbtbackup succeeded

write 100 blocks

-- sbtwrite2 succeeded

-- sbtclose2 succeeded

sbtinfo2: SBTBFINFO_NAME=sbt_tape

sbtinfo2: SBTBFINFO_SHARE=multiple users

sbtinfo2: SBTBFINFO_ORDER=sequential access

sbtinfo2: SBTBFINFO_LABEL=G:

sbtinfo2: SBTBFINFO_CRETIME=Wed Jun 23 10:55:41 2010

sbtinfo2: SBTBFINFO_EXPTIME=Sat Jul 24 10:55:41 2010

sbtinfo2: SBTBFINFO_COMMENT=Backup ID : pdgis-ora_1277261741

sbtinfo2: SBTBFINFO_METHOD=stream

-- sbtinfo2 succeeded

MMAPI error from sbtrestore: 7501, Failed to open backup file for restore.

-- sbtrestore failed

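This output shows that the failure of the RMAN archived log restore is not an Oracle problem. It points to the media management layer: either the NetBackup server or the NetBackup client is misconfigured.

Searching Oracle Metalink for "ORA-19511: Error received from media manager layer, error text: Failed to open backup file for restore." turned up the following note: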

ORA-27029, ORA-19511 and Veritas NetBackup status code 135. [ID 335850.1]

The note explains the cause of the problem in detail:

Symptoms

RMAN-03002: failure of restore command at 09/27/2005 12:41:54
ORA-: failed to retrieve sequential file, handle="cntrl_1984_1_569535276",parms=""
ORA-: skgfrtrv: sbtrestore returned error
ORA-: Error received from media manager layer, error text: Failed to open for restore.

NetBackup status code 135.

Cause

Media Manager unable to find the backup file.

NetBackup status code 135: client peername is invalid for restore request

The master server authenticates the host requesting an Oracle RMAN restore by
performing a reverse IP lookup, gethostbyaddr(). However, the packet transporting
the restore request was transmitted from an interface on the client which resolves
to a hostname which does not match the client name which performed the backup.
Hence, the NBU master server rejects the restore request.

Solution

Check that the Hostname or Ipaddress is set properly.

Refer

Article from Veritas:

The note also links to the related Veritas technical document, which describes the problem in detail as follows:

When using NetBackup Database extension for Oracle, a restore fails with RMAN error ORA-27029 and NetBackup status code 135.

Exact Error Message

ORA-27029: skgfrtrv: sbtrestore returned error

status code 135: client peername is invalid for restore request

Details:

Detailed Problem Description:

The master server authenticates the host requesting an Oracle RMAN restore by performing a reverse IP lookup, gethostbyaddr(). However, the packet transporting the restore request was transmitted from an interface on the client which resolves to a hostname which does not match the client name which performed the backup. Hence, the NBU master server rejects the restore request.

The host in the following example is named devo which resolves to NIC 172.31.46.28 and has a second NIC, named devo-b which resolves to 10.1.100.10, over which the backups and restores are to occur. The CLIENT_NAME in /usr/openv/netbackup/bp.conf is set to 'devo-b'.

The /usr/openv/netbackup/logs/bprd on the master server logged the failed validation request, peername is the result of gethostbyaddr():

16:08:26 [12763] <4> get_ccname: configured name is: devo-b

16:08:26 [12763] <2> process_request: restore request 66, bufr = 329199 66 oracle oinstall devo-b devo devo devo-b /usr/openv/netbackup/logs/user_ops/dbext/logs/11876.0.1019592506 NONE NONE 0 1019592506 1019586470 1019586470 1019592506 4 0 0 0 0 12 0 4 0 1 10004 0 0 0 C C C C C 0 1 0 1 0 0 0 0 9

16:08:26 [12763] <2> process_request: As rcvd from client:

16:08:26 [12763] <2> process_request: browse_clnt: devo-b

16:08:26 [12763] <2> process_request: requesting_clnt: devo

16:08:26 [12763] <2> process_request: destination_clnt: devo

16:08:26 [12763] <2> process_request: clnt_bp_conf_name: devo-b

16:08:26 [12763] <2> process_request: peername: devo-b

16:08:26 [12763] <2> process_request: ccname: devo-b

16:08:26 [12763] <2> process_request: keyword =

16:08:26 [12763] <2> process_request: restore_format: 0

16:08:26 [12763] <2> process_request: true_image: 0

16:08:26 [12763] <2> process_request: mpx_restore_possible: 1

16:08:26 [12763] <4> get_type_of_client_port: db_getCLIENT() failed: no entity was found (227)

16:08:26 [12763] <2> validate_hostname: Unknown hostname devo, switching to peername devo-b.

...lines deleted...

16:08:27 [12763] <4> get_type_of_client_free_browse: db_getCLIENT_by_hostname() failed: no entity was found (227)

...lines deleted...

16:08:27 [12763] <16> process_request: client devo-b peername devo-b is invalid for restore request

The error can also be verified by inspecting the progress file on the client if the /usr/openv/netbackup/logs/bprd log is not available, because the master server recorded the error in the progress file ( /usr/openv/netbackup/logs/user_ops/dbext/logs/ ):

16:08:27 client devo-b peername devo-b is invalid for restore request

16:08:28 INF - Server status = 135

Likewise, the /usr/openv/netbackup/logs/dbclient log on the Oracle database host also reflects progress file entry, rejecting the restore request:

System name: SunOS

Node name: devo

Release: 5.8

Version: Generic_108528-12

Machine: sun4u

User name: oracle

Group name: oinstall

Client Host: devo

...lines deleted...

15:29:02 [26493] <4> sendRequest: sending RESTORE request to bprd

15:29:02 [26493] <4> sendRequest: request:

15:29:02 [26493] <2> getsockconnected: host=nbu service=bprd address=10.1.100.29 protocol=tcp non-reserved port=13720

15:29:02 [26493] <2> bind_on_port_addr: bound to port 52000

15:29:02 [26493] <2> bprd_connect: no authentication required

15:29:03 [26493] <4> sendRequest: sending buf = 1019586470 1019586470 /cf_TRMAN2_t459959266_s112_p1

15:29:03 [26493] <4> sendRequest: Date range: ,

15:29:03 [26493] <4> serverResponse: entering serverResponse.

15:29:03 [26493] <4> serverResponse: initial client_read_timeout = <900>

15:29:08 [26493] <4> serverResponse: read comm client devo-b peername devo-b is invalid for restore request>

15:29:08 [26493] <4> serverResponse: read comm INF - Server status = 135>

15:29:08 [26493] <16> serverResponse: ERR - server exited with status 135: client is not validated to perform the requested operation

...lines deleted...

15:29:08 [26493] <4> closeApi: INF - EXIT STATUS 5: the restore failed to recover the requested files

Additional Environment Information:

Oracle 8.1.6, Solaris 2.6/Solaris 7/Solaris 8, HP-UX 11.00/HP-UX 11.11

Solution:

Add REQUIRED_INTERFACE = to the /usr/openv/netbackup/bp.conf file on the client host.

Example:

CLIENT_NAME = devo-b

REQUIRED_INTERFACE = devo-b
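Before editing bp.conf it is worth reading back what the client currently has. The sketch below assumes the "KEY = value" layout shown in the example above and the default path /usr/openv/netbackup/bp.conf quoted in the Veritas document. The function name bp_conf_get is hypothetical.

```shell
# bp_conf_get KEY [FILE]: print the value of the first "KEY = value" line
# in a bp.conf-style file. FILE defaults to /usr/openv/netbackup/bp.conf.
bp_conf_get() {
    awk -F'=' -v k="$1" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }   # trim the key
        key == k {
            val = $0
            sub(/^[^=]*=/, "", val)                      # drop "KEY ="
            gsub(/^[ \t]+|[ \t]+$/, "", val)             # trim the value
            print val
            exit
        }
    ' "${2:-/usr/openv/netbackup/bp.conf}"
}
```

For example, `bp_conf_get CLIENT_NAME` and `bp_conf_get REQUIRED_INTERFACE` print the two settings discussed above. An empty result means the key is not set in the file.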

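Resolution

The analysis and the documents above pinned down the problem: the client name configured in NetBackup did not match the hostname of the client machine, so at restore time the request could not be matched to the right client and the restore failed.

We changed the client name on the NetBackup server to the hostname of the NetBackup client and reran sbttest sbt_tape; the restore test passed.

The restore of the archived log files then succeeded as well.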

RMAN>run{

ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';

SET ARCHIVELOG DESTINATION TO '/archivelog';

RESTORE ARCHIVELOG SEQUENCE BETWEEN 17018 and 17105;

release channel ch00;}

allocated channel: ch00

channel ch00: sid=914 devtype=SBT_TAPE

channel ch00: Veritas NetBackup for Oracle - Release 6.5 (2007072323)

executing command: SET ARCHIVELOG DESTINATION

Starting restore at 23-JUN-10

channel ch00: starting archive log restore to user-specified destination

archive log destination=/archivelog

channel ch00: restoring archive log

archive log thread=1 sequence=17018

......

channel ch00: restoring archive log

archive log thread=1 sequence=17035

channel ch00: reading from backup piece al_1029_1_721945418

channel ch00: restored backup piece 1

piece handle=al_1029_1_721945418 tag=TAG20100617T202337

channel ch00: restore complete, elapsed time: 00:06:07

channel ch00: starting archive log restore to user-specified destination

......

archive log thread=1 sequence=17105

channel ch00: reading from backup piece al_1039_1_722031789

channel ch00: restored backup piece 1

piece handle=al_1039_1_722031789 tag=TAG20100618T202308

channel ch00: restore complete, elapsed time: 00:06:27

Finished restore at 23-JUN-10

released channel: ch00

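Analysis and summary

When configuring a client on the NetBackup server, it is best to give the client the same name as the hostname of the client machine. This avoids restore failures of this kind.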
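A quick way to check that recommendation is to compare the configured client name with what a reverse lookup of the client's IP address returns, preferably on the master server, since that is where the validation happens. The following is a rough sketch under several assumptions: getent is available (Linux, Solaris), the first name it returns is the one that matters, and the function names reverse_name and check_peer are hypothetical. The comparison is an exact string match on purpose. The progress log in this incident showed client pdgis-ora and peername PDGIS-ORA, which differ only in case, and the sources quoted here do not say how NetBackup treats case.

```shell
# reverse_name IP: print the first host name the local resolver returns for IP.
reverse_name() {
    getent hosts "$1" | awk 'NR == 1 { print $2 }'
}

# check_peer CLIENT_NAME IP: succeed only if the reverse lookup of IP yields
# exactly CLIENT_NAME; otherwise report both names and return 1.
check_peer() {
    actual=$(reverse_name "$2")
    if [ "$actual" = "$1" ]; then
        echo "match: $1"
    else
        echo "mismatch: client name '$1', reverse lookup '$actual'"
        return 1
    fi
}
```

With the host names from the Veritas example, `check_peer devo-b 10.1.100.10` would report a mismatch if the resolver returned devo for that address.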
