OCP課程50:管理II之診斷資料庫

stonebox1122發表於2016-05-17

課程目標:

  • 檢測和修復資料庫損壞
  • 處理塊損壞
  • 配置ADR(Automatic Diagnostic Repository)
  • 執行健康檢查

1、資料恢復顧問

clipboard

資料恢復顧問:

當遇到錯誤時,資料恢復顧問自動收集資料問題資訊,檢查分析問題。

資料問題有時會很嚴重,比如丟失日誌檔案導致不能啟動資料庫。有時不嚴重(比如資料檔案塊損壞),不需要關閉資料庫或者不能啟動例項。故資料恢復顧問可以用於以下2種情況:一是不能啟動資料庫(由於資料庫檔案丟失、不一致或者損壞),一是執行時檔案損壞。

使用者介面:

可以透過EM訪問資料恢復顧問,從資料庫例項主頁開始有如下方式訪問:

  • Availability tabbed page > Perform Recovery > Advise and Recover
  • Active Incidents link > on the Support Workbench “Problems” page: Checker Findings tabbed page > Launch Recovery Advisor
  • Database Instance Health > click the specific link, for example, ORA 1578 in the Incidents section > Support Workbench, Problems Detail page > Data Recovery Advisor
  • Database Instance Health > Related Links section: Support Workbench > Checker Findings tabbed page: Launch Recovery Advisor
  • Related Link: Advisor Central > Advisors tabbed page: Data Recovery Advisor
  • Related Link: Advisor Central > Checkers tabbed page: Details > Run Detail tabbed page: Launch Recovery Advisor

也可以透過RMAN訪問:

rman target / nocatalog

rman> list failure all;

支援的資料庫配置(11gR2):

只支援單例項,不支援RAC。

不能使用備庫的塊或者檔案去修復主庫的問題,也不能使用資料恢復顧問去診斷和修復備庫的問題,但資料恢復顧問會把切換到備庫作為一個修復選項。


2、資料恢復顧問

clipboard[1]

在11g中使用資料恢復顧問進行自動診斷的流程如下:

(1)健康監測自動檢查並記錄問題到ADR。

(2)列出級別為critical或者high的問題。

(3)進行修復建議,包括自動和手動修復選項,並檢查可行性。

(4)可以選擇手動執行修復或者使用資料恢復顧問執行修復。

(5)使用VALIDATE命令進行檢查。


3、資料問題

clipboard[2]

檢查資料問題,評估資料庫及其元件健康情況,診斷問題。

可以執行VALIDATE DATABASE命令主動檢查問題,當資料庫出現錯誤,會自動進行被動檢查。

在EM,使用可用性>執行恢復,如果資料庫關閉或者掛載狀態而不能開啟,點選“執行恢復”進行問題修復。


4、資料問題:示例

clipboard[3]

資料恢復顧問可以分析以上問題並提出修復建議。


5、資料恢復顧問:RMAN命令列介面

clipboard[4]

如果懷疑資料庫出現問題,使用LIST FAILURE命令列出問題資訊。

使用ADVISE FAILURE命令為問題列出推薦的修復建議,隱式關閉已經修復的問題。如果不加選項,預設動作是為記錄在ADR中所有CRITICAL和HIGH級別錯誤提出建議。

在同一個會話中,使用ADVISE FAILURE命令後,使用REPAIR FAILURE命令進行修復。修復完成後,關閉問題。

使用CHANGE FAILURE命令修改問題優先順序或者關閉問題。只能修改級別為HIGH或者LOW的問題優先順序。問題修復後會隱式關閉,也可以顯示關閉。


6、列出資料問題

clipboard[5]

LIST FAILURE命令列出問題。如果目標例項使用了恢復目錄,可以在STARTED(NOMOUNT)模式使用,否則需要在MOUNTED模式使用。

LIST FAILURE命令不會去檢查診斷新的問題,而是列出先前評估的結果。重複執行LIST FAILURE命令重新驗證所有存在的問題。如果使用者手動修復問題,或如果問題消失,則資料恢復顧問從LIST FAILURE輸出中刪除這些問題。語法描述如下:

  • failnum:要顯示修復選項的問題數量
  • ALL:列出所有優先順序的問題
  • CRITICAL:列出CRITICAL優先順序和OPEN狀態的問題。這些問題會導致整個資料庫不可用(例如丟失控制檔案)
  • HIGH:列出HIGH優先順序和OPEN狀態的問題,這些問題會導致資料庫部分不可用或者不可恢復(例如丟失歸檔日誌檔案)
  • LOW:列出LOW優先順序和OPEN狀態的問題
  • CLOSED:只列出關閉的問題
  • EXCLUDE FAILURE:排除指定的問題
  • DETAIL:列出問題詳細資訊,例如檔案中多個塊損壞,會列出每一個塊資訊

例子:列出問題

RMAN> list failure;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       HIGH     OPEN      27-JAN-16     One or more non-system datafiles are missing


7、修復建議

clipboard[6]

ADVISE FAILURE命令為問題列出了推薦的修復選項,隱式關閉已經修復了的所有問題,顯示輸入問題的概要資訊。

ADVISE FAILURE預設(不加選項)為記錄在ADR中所有CRITICAL和HIGH優先順序的問題提出建議。如果自上一次LAST FAILURE命令之後有新的問題記錄到ADR,則該命令的輸出會包含WARNING。

當資料恢復顧問生成一個自動修復選項,會生成一個指令碼,顯示RMAN將如何修復問題。

語法:

ADVISE FAILURE

[ ALL | CRITICAL | HIGH | LOW | failnum[,failnum,…] ]

[ EXCLUDE FAILURE failnum [,failnum,…] ]

例子:修復建議

RMAN> advise failure;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       HIGH     OPEN      27-JAN-16     One or more non-system datafiles are missing

analyzing automatic repair options; this may take some time

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=21 device type=DISK

analyzing automatic repair options complete

Mandatory Manual Actions

========================

no manual actions available

Optional Manual Actions

=======================

1. If file /u01/app/oracle/oradata/STONE/datafile/user01.dbf was unintentionally renamed or moved, restore it

Automated Repair Options

========================

Option Repair Description

------ ------------------

1      Restore and recover datafile 4 

  Strategy: The repair includes complete media recovery with no data loss

  Repair script: /u01/app/oracle/diag/rdbms/stone/stone/hm/reco_277717931.hm


8、執行修復

clipboard[7]

REPAIR FAILURE命令需要在同一個RMAN會話中,ADVISE FAILURE命令之後使用。預設(不加選項)使用一個推薦的修復選項。如果沒有推薦的修復選項,REPAIR FAILURE命令發出一個隱式的ADVISE FAILURE命令。

使用USING ADVISE OPTION interger,透過選項編號指定使用哪一個修復選項。

由於修復可能會需要很長時間進行大量修改,故需要對其進行確認。完成修復後,會關閉問題。

不能同時執行多個修復會話,但可以同時執行REPAIR ... PREVIEW。

PREVIEW:不執行修復而是顯示修復指令碼

NOPROMPT:不需要確認

例子:執行修復

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss

Repair script: /u01/app/oracle/diag/rdbms/stone/stone/hm/reco_3201719897.hm

contents of repair script:

   # restore and recover datafile

   restore datafile 4;

   recover datafile 4;

   sql 'alter database datafile 4 online';

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss

Repair script: /u01/app/oracle/diag/rdbms/stone/stone/hm/reco_3201719897.hm

contents of repair script:

   # restore and recover datafile

   restore datafile 4;

   recover datafile 4;

   sql 'alter database datafile 4 online';

Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script

Starting restore at 27-JAN-16

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/STONE/datafile/user01.dbf

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_24/o1_mf_nnndf_TAG20160124T211323_cb9mhmmc_.bkp

channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_24/o1_mf_nnndf_TAG20160124T211323_cb9mhmmc_.bkp tag=TAG20160124T211323

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 27-JAN-16

Starting recover at 27-JAN-16

using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 8 is already on disk as file /u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_26/o1_mf_1_8_cbfqhwdr_.arc

archived log for thread 1 with sequence 9 is already on disk as file /u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_27/o1_mf_1_9_cbj37c44_.arc

archived log for thread 1 with sequence 10 is already on disk as file /u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_27/o1_mf_1_10_cbj9dox3_.arc

channel ORA_DISK_1: starting archived log restore to default destination

channel ORA_DISK_1: restoring archived log

archived log thread=1 sequence=7

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_24/o1_mf_annnn_TAG20160124T211500_cb9mlnkc_.bkp

channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_24/o1_mf_annnn_TAG20160124T211500_cb9mlnkc_.bkp tag=TAG20160124T211500

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

archived log file name=/u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_27/o1_mf_1_7_cbjtpskc_.arc thread=1 sequence=7

channel default: deleting archived log(s)

archived log file name=/u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_27/o1_mf_1_7_cbjtpskc_.arc RECID=240 STAMP=902242777

archived log file name=/u01/app/oracle/fast_recovery_area/STONE/archivelog/2016_01_26/o1_mf_1_8_cbfqhwdr_.arc thread=1 sequence=8

media recovery complete, elapsed time: 00:00:02

Finished recover at 27-JAN-16

sql statement: alter database datafile 4 online

repair failure complete

Do you want to open the database (enter YES or NO)? yes

database opened


9、分類(和關閉)問題

clipboard[8]

CHANGE FAILURE命令用於修改問題優先順序以及關閉問題。

語法:

CHANGE FAILURE

{ ALL | CRITICAL | HIGH | LOW | failnum[,failnum,…] }

[ EXCLUDE FAILURE failnum[,failnum,…] ]

{ PRIORITY {CRITICAL | HIGH | LOW} |

CLOSE } – change status of the failure(s) to closed

[ NOPROMPT ] – do not ask user for a confirmation

問題優先順序只能從HIGH到LOW和從LOW到HIGH。不能修改為CRITICAL的優先順序。(修改優先順序從HIHG到LOW的情況之一是不想在LIST FAILURE命令的預設輸出中看到該問題。例如塊損壞為HIGH優先順序,如果塊位於一個很少使用的表空間,可以臨時將其優先順序修改為LOW。)

問題修復後會隱式關閉問題,也可以顯示關閉問題。

該命令預設需要使用者進行確認。

例子:更改問題優先順序

RMAN> list failure;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       HIGH     OPEN      27-JAN-16     One or more non-system datafiles are missing

RMAN> change failure 3282 priority low;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       HIGH     OPEN      27-JAN-16     One or more non-system datafiles are missing

Do you really want to change the above failures (enter YES or NO)? yes

changed 1 failures to LOW priority

RMAN> list failure;

no failures found that match specification

RMAN> list failure all;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       LOW      OPEN      27-JAN-16     One or more non-system datafiles are missing

RMAN> change failure 3282 priority high;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       LOW      OPEN      27-JAN-16     One or more non-system datafiles are missing

Do you really want to change the above failures (enter YES or NO)? yes

changed 1 failures to HIGH priority

RMAN> list failure;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

3282       HIGH     OPEN      27-JAN-16     One or more non-system datafiles are missing


10、資料恢復顧問檢視

clipboard[9]

V$IR_FAILURE:列出所有的問題,包括關閉的,RMAN命令LIST FAILURE的結果

V$IR_MANUAL_CHECKLIST:列出需要手動操作的建議,RMAN命令ADVISE FAILURE的結果

V$IR_REPAIR:列出自動修復建議,RMAN命令ADVISE FAILURE的結果

V$IR_FAILURE_SET:是否只能手動修復


11、最佳實踐:主動檢查

clipboard[10]

對於非常重要的資料庫,可能需要執行額外的前瞻性主動檢查(在每天的業務低峰期間)。可以透過Health Monitor或者使用RMAN的VALIDATE命令進行定期的健康檢查。在一般情況下,當被動檢查檢測到資料庫元件中的問題時,可以對受影響的元件執行一個更完整的檢查。

RMAN的VALIDATE DATABASE命令用於為資料庫及其元件呼叫健康檢查,擴充套件了現有的VALIDATE BACKUPSET命令,將顯示驗證過程中檢測到的所有問題。如果檢測到問題,則將其作為finding記錄到ADR。可以使用LIST FAILURE命令來檢視儲存庫中記錄的所有問題。

VALIDATE命令支援單個備份集和資料塊的驗證。預設情況下,VALIDATE命令只檢查物理損壞。可以指定CHECK LOGICAL檢查邏輯損壞。

塊損壞可分為塊間(interblock)和塊內(intrablock)損壞。塊內損壞發生在塊體本身,可以是物理的或邏輯的損壞。塊間損壞發生在塊之間,只能是邏輯損壞。VALIDATE命令只檢查塊內損壞。


12、什麼是塊損壞

clipboard[11]

一個已損壞的資料塊是指塊不是Oracle可以識別的格式或其內容內部不一致。通常情況下,損壞是由硬體故障或作業系統問題造成的。資料庫將塊損壞分為“邏輯損壞”或“介質損壞”。如果是邏輯損壞,則有一個Oracle內部錯誤。邏輯損壞的塊在資料庫檢測到不一致後會被標記為損壞。如果是介質損壞,則塊格式不正確;在被讀取到磁碟後,該塊中的資訊沒有任何意義。

透過恢復塊或刪除包含損壞塊的資料庫物件可以修復一個介質損壞塊。如果介質損壞是由於硬體故障導致,則需要修復硬體故障。


13、塊損壞:ORA-01578

clipboard[12]

通常ORA-01578錯誤是由硬體問題造成的。如果ORA-01578錯誤總是返回相同的引數,則最有可能是介質損壞塊。

如果引數每次都發生變化,則可能會是硬體問題,應該進行記憶體和頁空間檢查以及I/O子系統。

注意:ORA-01578返回相對檔案號,但隨之而來的ORA-01110錯誤顯示絕對檔案號。

例子:模擬塊損壞

SQL> create tablespace tstest datafile '/home/oracle/tstest.dbf' size 10m;

Tablespace created.

SQL> create table emptest tablespace tstest as select * from hr.employees where rownum<10;

Table created.

定位表對應的塊。

SQL> select rowid,dbms_rowid.rowid_relative_fno(rowid) rel_fno,dbms_rowid.rowid_block_number(rowid) block from emptest;

ROWID                 REL_FNO      BLOCK

------------------ ---------- ----------

AAAWO6AAIAAAACDAAA          8        131

AAAWO6AAIAAAACDAAB          8        131

AAAWO6AAIAAAACDAAC          8        131

AAAWO6AAIAAAACDAAD          8        131

AAAWO6AAIAAAACDAAE          8        131

AAAWO6AAIAAAACDAAF          8        131

AAAWO6AAIAAAACDAAG          8        131

AAAWO6AAIAAAACDAAH          8        131

AAAWO6AAIAAAACDAAI          8        131

驗證是否有壞塊,目前沒有。

[ ~]$ dbv userid=sys/123456 file=/home/oracle/tstest.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Wed Jan 27 16:30:20 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/tstest.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 1280

Total Pages Processed (Data) : 1

Total Pages Failing   (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing   (Index): 0

Total Pages Processed (Other): 127

Total Pages Processed (Seg)  : 0

Total Pages Failing   (Seg)  : 0

Total Pages Empty            : 1152

Total Pages Marked Corrupt   : 0

Total Pages Influx           : 0

Total Pages Encrypted        : 0

Highest block SCN            : 4154173 (0.4154173)

先進行資料庫全備。

RMAN> backup database plus archivelog delete input;

RMAN標記壞塊。

RMAN> recover datafile 8 block 131 clear;

Starting recover at 27-JAN-16

using channel ORA_DISK_1

Finished recover at 27-JAN-16

再次驗證,發現壞塊。

[ ~]$ dbv userid=sys/123456 file=/home/oracle/tstest.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Wed Jan 27 16:34:57 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/tstest.dbf

Page 131 is marked corrupt

Corrupt block relative dba: 0x02000083 (file 8, block 131)

Bad header found during dbv:

Data in bad block:

type: 6 format: 6 rdba: 0x02000094

last change scn: 0x0000.003f633d seq: 0x2 flg: 0x04

spare1: 0xed spare2: 0xf3 spare3: 0x0

consistency value in tail: 0x633d0602

check value in block header: 0xadcc

computed block checksum: 0xe7fa

DBVERIFY - Verification complete

Total Pages Examined         : 1280

Total Pages Processed (Data) : 0

Total Pages Failing   (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing   (Index): 0

Total Pages Processed (Other): 129

Total Pages Processed (Seg)  : 1

Total Pages Failing   (Seg)  : 0

Total Pages Empty            : 1149

Total Pages Marked Corrupt   : 1

Total Pages Influx           : 0

Total Pages Encrypted        : 0

Highest block SCN            : 4154177 (0.4154177)

查詢報錯。

SQL> select count(*) from emptest;

select count(*) from emptest

                     *

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 8, block # 131)

ORA-01110: data file 8: '/home/oracle/tstest.dbf'


14、如何處理塊損壞

clipboard[13]

總是試圖找出錯誤是否是永久性的。多次執行ANALYZE命令,如果可能的話,關機再啟動,然後再試一次執行失敗的操作。找出是否有更多的損壞。

硬體故障應立即解決。當遇到硬體問題,聯絡供應商進行檢查和修復。

硬體故障的型別:

  • 錯誤的硬體或韌體
  • 作業系統I/O或快取問題
  • 記憶體或分頁問題

例子:執行ANALYZE命令分析表

SQL> analyze table emptest validate structure cascade;

analyze table emptest validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 8, block # 131)

ORA-01110: data file 8: '/home/oracle/tstest.dbf'


15、設定引數檢測損壞

clipboard[14]

使用db_ultra_safe引數方便管理,它會影響下列引數的預設值:

  • db_block_checking,是否進行資料塊校驗。可以防止記憶體和資料損壞。(預設值:FALSE,建議值:FULL)
  • db_block_checksum,在資料塊寫入到磁碟的時候,對資料塊的快取頭部進行校驗和。校驗和協助檢測磁碟,儲存系統或I/O系統導致的損壞。(預設值:TYPICAL,建議值:TYPICAL)
  • db_lost_write_protect,失寫檢查。資料塊失寫發生在主庫已經完成寫入,但備庫還未完成寫入。(預設值:TYPICAL,建議值:TYPICAL)

可以顯式設定這些引數,db_ultra_safe引數(11g新加入)僅僅修改這些引數的預設值。


16、設定引數檢測損壞

clipboard[15]

根據要求,可以加強對塊損壞的檢查。啟用db_ultra_safe引數(預設關閉)會增加系統開銷。開銷的量與每秒塊改變數量相關,因此不容易量化。對於一個頻繁更新的應用程式,會對CPU有10%-20%的影響。

  • 當db_ultra_safe引數設定為data_only,db_block_checking引數被設定為MEDIUM。塊內資料的檢查是邏輯自包含的。記憶體中的塊內容發生改變後執行基本的塊頭檢查(例如,更新或插入命令後的磁碟讀取或者RAC環境中的內部例項塊傳輸)。這一級別的檢查包括所有非索引組織表塊的語義塊檢查。
  • 當db_ultra_safe引數設定為data_and_index,db_block_checking引數設定為FULL。除了前面的檢查,對索引塊執行語義檢查(即,該物件的子物件,可以在塊損壞時進行刪除和重建)。
  • 當db_ultra_safe引數設定為data_only或data_and_index,db_block_checksum引數設定為FULL以及db_lost_write_protect引數設定為TYPICAL。

例子:檢視引數

SQL> show parameter db_ultra

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_ultra_safe                        string      OFF

SQL> show parameter db_block_check

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_block_checking                    string      FALSE

db_block_checksum                    string      TYPICAL

SQL> show parameter db_lost

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_lost_write_protect                string      NONE

SQL> alter system set db_ultra_safe=data_only scope=spfile;

System altered.

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup

ORACLE instance started.

Total System Global Area  835104768 bytes

Fixed Size                  2257840 bytes

Variable Size             507513936 bytes

Database Buffers          322961408 bytes

Redo Buffers                2371584 bytes

Database mounted.

Database opened.

SQL> show parameter db_ultra

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_ultra_safe                        string      DATA_ONLY

SQL> show parameter db_block_check

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_block_checking                    string      MEDIUM

db_block_checksum                    string      FULL

SQL> show parameter db_lost

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_lost_write_protect                string      TYPICAL

SQL> alter system set db_block_checking=full;

System altered.

SQL> show parameter db_block_check

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

db_block_checking                    string      FULL

db_block_checksum                    string      FULL


17、塊介質恢復

clipboard[16]

大多數情況下,當首次遇到損壞的塊,資料庫標記塊為介質損壞並寫入到相關檢視中,後續對塊的讀取都會失敗。可以只對標記為損壞的塊進行塊恢復。使用RMAN命令RECOVER ... BLOCK進行塊介質恢復。預設情況下,RMAN搜尋閃回日誌查詢損壞塊的可用複製,然後搜尋全備或者0級增量備份查詢可用塊。找到後,還原並進行介質恢復。塊介質恢復只能使用重做日誌,不能使用增量備份。

V$DATABASE_BLOCK_CORRUPTION檢視顯示被RMAN命令、ANALYZE、dbv、SQL查詢等標記為損壞的塊。以下型別的損壞將會增加到這個檢視:

物理/介質損壞:資料庫不能識別塊:校驗和無效,塊內都是0或者塊頭損壞。預設啟用物理損壞檢查。

邏輯損壞:塊有有效的校驗和,塊頭和塊尾匹配,但內容不一致。塊介質恢復不能修復邏輯塊損壞。預設不啟用邏輯損壞檢查。可以透過BACKUP,RESTORE,RECOVER和VALIDATE命令指定CHECK LOGICAL選項啟用邏輯損壞檢查。


18、塊介質恢復的前提條件

clipboard[17]

執行RECOVER ... BLOCK命令的前提條件:

  • 目標資料庫必須是歸檔模式,並使用當前控制檔案開啟或者掛載。
  • 包含損壞塊的資料檔案的備份必須是全備或者0級增量備份,不能是代理複製。如果只有代理複製,需要還原到非預設位置以便讓RMAN認為是資料檔案的複製。
  • 只能使用歸檔日誌檔案進行恢復。不能使用1級增量備份。
  • 從閃回日誌中搜尋損壞塊的可用複製需要啟用閃回資料庫,如果閃回日誌有損壞塊的老版本的完好的塊,則RMAN會使用這些塊加快恢復速度。

19、RECOVER...BLOCK命令

clipboard[18]

塊恢復前,需要先定位到損壞的塊,一般使用如下方法定位:

  • LIST FAILURE,VALIDATE或者BACKUP ... VALIDATE命令
  • V$DATABASE_BLOCK_CORRUPTION檢視
  • 標準輸出的錯誤訊息
  • 告警日誌和使用者跟蹤檔案
  • ANALYZE TABLE和ANALYZE INDEX命令
  • DBVERIFY工具

例如在使用者跟蹤檔案發現如下資訊:

ORA-01578: ORACLE data block corrupted (file # 7, block # 3)

ORA-01110: data file 7: '/oracle/oradata/orcl/tools01.dbf'

ORA-01578: ORACLE data block corrupted (file # 2, block # 235)

ORA-01110: data file 2: '/oracle/oradata/orcl/undotbs01.dbf'

在RMAN提示符後執行RECOVER ... BLOCK命令,指定檔案和塊編號:

RECOVER

DATAFILE 7 BLOCK 3

DATAFILE 2 BLOCK 235;

例子:定位損壞的塊並恢復

RMAN> list failure;

using target database control file instead of recovery catalog

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

8323       HIGH     OPEN      27-JAN-16     Datafile 8: '/home/oracle/tstest.dbf' contains one or more corrupt blocks

RMAN> validate database;

Starting validate at 27-JAN-16

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=35 device type=DISK

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00001 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_system_cb1769f2_.dbf

input datafile file number=00002 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_sysaux_cb1769f6_.dbf

input datafile file number=00005 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_example_cb33c57f_.dbf

input datafile file number=00003 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_undotbs1_cb1769ff_.dbf

input datafile file number=00007 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_rcat_ts_cb178oqs_.dbf

input datafile file number=00006 name=/u01/app/oracle/oradata/STONE/datafile/o1_mf_test_ind_cb178or2_.dbf

input datafile file number=00008 name=/home/oracle/tstest.dbf

input datafile file number=00004 name=/u01/app/oracle/oradata/STONE/datafile/user01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:01:26

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

1    OK     0              13892        116552          4168226  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_system_cb1769f2_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              66551          

  Index      0              13992          

  Other      0              22045          

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

2    OK     0              19632        116550          4168231  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_sysaux_cb1769f6_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              33407          

  Index      0              28236          

  Other      0              35205          

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

3    OK     0              1            14721           4168230  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_undotbs1_cb1769ff_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              0              

  Index      0              0              

  Other      0              14719          

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    OK     0              78           989             3979319  

  File Name: /u01/app/oracle/oradata/STONE/datafile/user01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              268            

  Index      0              51             

  Other      0              563            

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

5    OK     0              31371        44323           3351224  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_example_cb33c57f_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              4547           

  Index      0              1149           

  Other      0              7253           

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

6    OK     0              1            1281            3759632  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_test_ind_cb178or2_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              0              

  Index      0              0              

  Other      0              1279           

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

7    OK     0              513          1921            3706513  

  File Name: /u01/app/oracle/oradata/STONE/datafile/o1_mf_rcat_ts_cb178oqs_.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              409            

  Index      0              234            

  Other      0              764            

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

8    FAILED 0              1149         1280            4154177  

  File Name: /home/oracle/tstest.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       1              1              

  Index      0              0              

  Other      0              130            

validate found one or more corrupt blocks

See trace file /u01/app/oracle/diag/rdbms/stone/stone/trace/stone_ora_11705.trc for details

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

including current control file for validation

including current SPFILE in backup set

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Control File and SPFILE

===============================

File Type    Status Blocks Failing Blocks Examined

------------ ------ -------------- ---------------

SPFILE       OK     0              2              

Control File OK     0              608            

Finished validate at 27-JAN-16

SQL> select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO

---------- ---------- ---------- ------------------ ---------

         8        131          1            4154173 CORRUPT

SQL> select count(*) from emptest;

select count(*) from emptest

                     *

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 8, block # 131)

ORA-01110: data file 8: '/home/oracle/tstest.dbf'

SQL> analyze table emptest validate structure cascade;

analyze table emptest validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 8, block # 131)

ORA-01110: data file 8: '/home/oracle/tstest.dbf'

[ ~]$ dbv userid=sys/123456 file=/home/oracle/tstest.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Wed Jan 27 19:11:02 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/tstest.dbf

Page 131 is marked corrupt

Corrupt block relative dba: 0x02000083 (file 8, block 131)

Bad header found during dbv:

Data in bad block:

type: 6 format: 6 rdba: 0x02000094

last change scn: 0x0000.003f633d seq: 0x2 flg: 0x04

spare1: 0xed spare2: 0xf3 spare3: 0x0

consistency value in tail: 0x633d0602

check value in block header: 0xadcc

computed block checksum: 0xe7fa

DBVERIFY - Verification complete

Total Pages Examined         : 1280

Total Pages Processed (Data) : 0

Total Pages Failing   (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing   (Index): 0

Total Pages Processed (Other): 129

Total Pages Processed (Seg)  : 1

Total Pages Failing   (Seg)  : 0

Total Pages Empty            : 1149

Total Pages Marked Corrupt   : 1

Total Pages Influx           : 0

Total Pages Encrypted        : 0

Highest block SCN            : 4154177 (0.4154177)

RMAN> advise failure;

List of Database Failures

=========================

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

8323       HIGH     OPEN      27-JAN-16     Datafile 8: '/home/oracle/tstest.dbf' contains one or more corrupt blocks

analyzing automatic repair options; this may take some time

using channel ORA_DISK_1

analyzing automatic repair options complete

Mandatory Manual Actions

========================

no manual actions available

Optional Manual Actions

=======================

no manual actions available

Automated Repair Options

========================

Option Repair Description

------ ------------------

1      Perform block media recovery of block 131 in file 8 

  Strategy: The repair includes complete media recovery with no data loss

  Repair script: /u01/app/oracle/diag/rdbms/stone/stone/hm/reco_2387623021.hm

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss

Repair script: /u01/app/oracle/diag/rdbms/stone/stone/hm/reco_2387623021.hm

contents of repair script:

   # block media recovery

   recover datafile 8 block 131;

Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script

Starting recover at 27-JAN-16

using channel ORA_DISK_1

channel ORA_DISK_1: restoring block(s)

channel ORA_DISK_1: specifying block(s) to restore from backup set

restoring blocks of datafile 00008

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_27/o1_mf_nnndf_TAG20160127T163123_cbk02w4k_.bkp

channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/STONE/backupset/2016_01_27/o1_mf_nnndf_TAG20160127T163123_cbk02w4k_.bkp tag=TAG20160127T163123

channel ORA_DISK_1: restored block(s) from backup piece 1

channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery

media recovery complete, elapsed time: 00:00:03

Finished recover at 27-JAN-16

repair failure complete


20、自動診斷流程

clipboard[19]

使用自動診斷知識庫ADR(Automatic Diagnostic Repository)存放關鍵錯誤事件的診斷資訊。可以使用這些診斷資訊建立事件包併傳送給Oracle支援服務。

診斷工作流如下:

(1)在EM中出現事件告警。

(2)在EM告警頁面檢視告警。

(3)向下鑽取到的事件和問題的細節。

(4)透過My Oracle Support將資訊打包併傳送到Oracle支援服務。


21、自動診斷知識庫ADR

clipboard[20]

ADR是一個基於檔案的知識庫,包括跟蹤檔案,事件轉儲和包,告警日誌,健康監測報告,核心轉儲等資料。多個例項以及多個產品共用一個統一的位於資料庫之外的目錄結構。因此可以在資料庫關閉時用於問題診斷。

從Oracle資料庫11gR1開始,資料庫,自動儲存管理(ASM),叢集就緒服務(CRS),和其他Oracle產品或部件儲存所有診斷資料到ADR。每個產品的每個例項都儲存診斷資料到自己的ADR主目錄。例如,在一個共享儲存和ASM的RAC環境中,每個資料庫例項和每個ASM例項都有一個在ADR中的主目錄。ADR的統一目錄結構,統一的診斷資料格式和一套統一的工具,可以使使用者和Oracle支援跨多個例項進行關聯和分析相關診斷資料。

ADR基目錄透過DIAGNOSTIC_DEST初始化引數設定,如果忽略該引數或者為空,那麼如果設定了ORACLE_BASE環境變數,則DIAGNOSTIC_DEST就為ORACLE_BASE,如果沒有設定ORACLE_BASE環境變數,則DIAGNOSTIC_DEST為$ORACLE_HOME/log。

SQL> show parameter diag

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

diagnostic_dest                      string      /u01/app/oracle

[ ~]$ tree -L 2 /u01/app/oracle/diag/

/u01/app/oracle/diag/

├── asm

├── clients

│   └── user_oracle

├── crs

├── diagtool

├── lsnrctl

├── netcman

├── ofm

├── rdbms

│   ├── dummy

│   └── stone

└── tnslsnr

    └── oracletest


22、ADR命令列工具ADRCI

clipboard[21]

ADRCI是一個命令列工具,是資料庫的故障可診斷性基礎設施的一部分。ADRCI可以:

  • 檢視自動診斷知識庫內的診斷資料
  • 打包事件和問題資訊到zip檔案用於發給Oracle Support。

ADRCI可以在互動模式下使用或在指令碼里使用。此外,ADRCI可以執行ADRCI命令指令碼,和在SQL*Plus中執行SQL指令碼和PL/SQL命令類似。由於ADR的資料沒有安全要求,故無需登入到ADRCI,透過作業系統許可權確保ADR資料安全。

打包和管理診斷資料最簡單的方式是使用EM的支援工作臺(Support Workbench)。

ADRCI提供了支援工作臺的大部分功能,並增加了諸如檢視跟蹤檔案的功能。上圖中例子列出所有開啟的事件。

例子:透過ADRCI檢視事件

[ ~]$ adrci

ADRCI: Release 11.2.0.4.0 - Production on Wed Jan 27 21:02:27 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle"

adrci> show incident

ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_3709602581_80:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/tnslsnr/oracletest/listener:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/rdbms/dummy/stone:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/rdbms/stone/stone:

*************************************************************************

INCIDENT_ID          PROBLEM_KEY                                                 CREATE_TIME                             

-------------------- ----------------------------------------------------------- ----------------------------------------

73353                ORA 1578                                                    2016-01-27 16:35:59.925000 +08:00      

73354                ORA 1578                                                    2016-01-27 16:36:04.770000 +08:00      

73355                ORA 1578                                                    2016-01-27 16:46:00.525000 +08:00      

73356                ORA 1578                                                    2016-01-27 16:50:18.178000 +08:00      

74781                ORA 1578                                                    2016-01-27 19:09:09.792000 +08:00      

74782                ORA 1578                                                    2016-01-27 19:09:26.033000 +08:00      

74783                ORA 1578                                                    2016-01-27 19:09:30.499000 +08:00      

7 rows fetched


23、V$DIAG_INFO檢視

clipboard[22]

V$DIAG_INFO檢視列出了所有重要的ADR的位置:

  • ADR Base:ADR基目錄
  • ADR Home:當前資料庫例項的ADR主目錄
  • Diag Trace:文字告警日誌和前後臺程式的跟蹤檔案的位置
  • Diag Alert:XML版本的告警日誌的位置
  • Diag Incident:事件日誌位置
  • Diag Cdump:診斷核心檔案位置
  • Health Monitor:健康監測日誌位置
  • Default Trace File:使用者會話跟蹤檔案路徑,SQL跟蹤檔案位置。

SQL> select * from v$diag_info;

   INST_ID NAME                      VALUE

---------- ------------------------- -----------------------------------------------------------------

         1 Diag Enabled              TRUE

         1 ADR Base                  /u01/app/oracle

         1 ADR Home                  /u01/app/oracle/diag/rdbms/stone/stone

         1 Diag Trace                /u01/app/oracle/diag/rdbms/stone/stone/trace

         1 Diag Alert                /u01/app/oracle/diag/rdbms/stone/stone/alert

         1 Diag Incident             /u01/app/oracle/diag/rdbms/stone/stone/incident

         1 Diag Cdump                /u01/app/oracle/diag/rdbms/stone/stone/cdump

         1 Health Monitor            /u01/app/oracle/diag/rdbms/stone/stone/hm

         1 Default Trace File        /u01/app/oracle/diag/rdbms/stone/stone/trace/stone_ora_14570.trc

         1 Active Problem Count      1

         1 Active Incident Count     7

11 rows selected.


24、診斷檔案位置

clipboard[23]

上圖是跟蹤資料和轉儲檔案在10g和11g的位置對比。

在11g中,前臺程式和後臺程式跟蹤檔案沒有區別,都位於ADR_HOME/trace目錄。也就是說,所有非事件的跟蹤檔案都儲存在trace子目錄。事件轉儲檔案與程式跟蹤檔案分開存放。

跟蹤檔案和轉儲檔案之間的主要區別是,跟蹤檔案是一個連續的輸出,如SQL跟蹤,轉儲檔案是一個事件的一次性輸出。此外,核心轉儲是一個二進位制記憶體轉儲。


25、健康監控:概覽

clipboard[24]

資料庫包括一個健康監測框架,用於對資料庫中的各種元件進行診斷檢查。健康監測檢查資料庫中的各種元件,包括檔案、記憶體、事務完整性、後設資料和程式使用。這些檢查產生報告以及解決問題的建議。故障診斷基礎設施可以自動執行健康檢查以響應關鍵錯誤,也可以透過DBMS_HM PL/SQL包或EM介面手動執行健康檢查。

透過V$HM_CHECK檢視健康監測可以執行的檢查的描述,這些檢查分為兩類:

  • DB-online:在資料庫開啟的時候執行(open模式)。
  • DB-offline:例項可用,資料庫關閉(nomount模式)。

在檢查執行後,會產生一個包含檢查結果的報告,包括優先順序(低,高,或關鍵),描述以及基本統計資料。健康監測產生XML格式的報告並儲存在ADR中。可以使用V$HM_RUN,DBMS_HR,ADRCI或者EM檢視。

例子:檢視健康監測可以執行的檢查的描述

SQL> select name,description from v$hm_check;

NAME                                     DESCRIPTION

---------------------------------------- ----------------------------------------------------------------------

HM Test Check                            Check for health monitor functionality

DB Structure Integrity Check             Checks integrity of all database files

CF Block Integrity Check                 Checks integrity of a control file block

Data Block Integrity Check               Checks integrity of a data file block

Redo Integrity Check                     Checks integrity of redo log content

Logical Block Check                      Checks logical content of a block

Transaction Integrity Check              Checks a transaction for corruptions

Undo Segment Integrity Check             Checks integrity of an undo segment

No Mount CF Check                        Checks control file in NOMOUNT mode

Mount CF Check                           Checks control file in mount mode

CF Member Check                          Checks a multiplexed copy of the control file

NAME                                     DESCRIPTION

---------------------------------------- ----------------------------------------------------------------------

All Datafiles Check                      Checks all datafiles in the database

Single Datafile Check                    Checks a data file

Tablespace Check Check                   Checks a tablespace

Log Group Check                          Checks all members of a log group

Log Group Member Check                   Checks a particular member of a log group

Archived Log Check                       Checks an archived log

Redo Revalidation Check                  Checks redo log content

IO Revalidation Check                    Checks file accessibility

Block IO Revalidation Check              Checks file accessibility

Txn Revalidation Check                   Revalidate corrupted transaction

Failure Simulation Check                 Creates dummy failures

NAME                                     DESCRIPTION

---------------------------------------- ----------------------------------------------------------------------

Dictionary Integrity Check               Checks dictionary integrity

ASM Mount Check                          Diagnose mount failure

ASM Allocation Check                     Diagnose allocation failure

ASM Disk Visibility Check                Diagnose add disk failure

ASM File Busy Check                      Diagnose file drop failure

ASM Toomanyoff Check                     Diagnose mount failed because there were too many offline disks

ASM Insufficient Disks Check             Diagnose mount failed because there were insufficient disks

ASM Insufficient Mem Check               Check to adjust memory on allocation failure

30 rows selected.


26、手工執行健康檢查:PL/SQL示例

clipboard[25]

使用DBMS_HM.RUN_CHECK程式執行健康檢查。呼叫RUN_CHECK,使用V$HM_CHECK中的名字,提供一個執行的名字(用來檢索報告)以及控制執行的輸入引數。可以使用V$HM_CHECK_PARAM檢視這些引數。

在上圖中,執行一個資料庫字典檢查,檢查表TAB$(一個重要的核心字典物件),命名為mycheck,不設定任何超時。

執行後,使用DBMS_HM.GET_RUN_REPORT從V$HM_RUN,V$HM_FINDING和V$HM_RECOMMENDATION檢視獲取報告。報告清楚地顯示了TAB$表的一個關鍵錯誤。

當你呼叫GET_RUN_REPORT函式,在ADR中的HM目錄產生XML報告檔案,在上面的例子中,檔名字為HMREPORT_mycheck.hm。

例子:手工執行監控檢查

SQL> begin

  2  dbms_hm.run_check(

  3  check_name => 'Data Block Integrity Check',

  4  run_name => 'stone',

  5  input_params => 'BLC_DF_NUM=8;BLC_BL_NUM=131');

  6  end;

  7  /

PL/SQL procedure successfully completed.

SQL> set long 100000

SQL> select dbms_hm.get_run_report('stone') from dual;

DBMS_HM.GET_RUN_REPORT('STONE')

--------------------------------------------------------------------------------

Basic Run Information

Run Name                     : stone

Run Id                       : 8585

Check Name                   : Data Block Integrity Check

Mode                         : MANUAL

Status                       : COMPLETED

Start Time                   : 2016-01-27 22:52:01.234049 +08:00

End Time                     : 2016-01-27 22:52:01.533168 +08:00

Error Encountered            : 0

Source Incident Id           : 0

Number of Incidents Created  : 0

DBMS_HM.GET_RUN_REPORT('STONE')

--------------------------------------------------------------------------------

Input Paramters for the Run

BLC_DF_NUM=8

BLC_BL_NUM=131

Run Findings And Recommendations

Finding

Finding Name  : Media Block Corruption

Finding ID    : 8589

Type          : FAILURE

Status        : OPEN

DBMS_HM.GET_RUN_REPORT('STONE')

--------------------------------------------------------------------------------

Priority      : HIGH

Message       : Block 131 in datafile 8: '/home/oracle/tstest.dbf' is media

               corrupt

Message       : Object EMPTEST owned by SYS might be unavailable


27、使用ADRCI工具檢視健康檢查報告

clipboard[26]

可以使用ADRCI建立和檢視健康監測檢查報告。需要先確保正確設定作業系統環境變數,如ORACLE_HOME,然後在作業系統命令提示符輸入命令:adrci。

命令顯示如上圖,可以使用show homes列出所有ADR主目錄,使用set homepath改變當前ADR主目錄。使用create report hm_run命令生成報告,使用show report hm_rum命令檢視報告。

例子:使用ADRCI工具檢視健康檢查報告

adrci> show homes

ADR Homes:

diag/clients/user_oracle/host_3709602581_80

diag/tnslsnr/oracletest/listener

diag/rdbms/dummy/stone

diag/rdbms/stone/stone

adrci> show hm_run -p "run_id=8585"

ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_3709602581_80:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/tnslsnr/oracletest/listener:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/rdbms/dummy/stone:

*************************************************************************

0 rows fetched

ADR Home = /u01/app/oracle/diag/rdbms/stone/stone:

*************************************************************************

**********************************************************

HM RUN RECORD 1

**********************************************************

   RUN_ID                        8585

   RUN_NAME                      stone

   CHECK_NAME                    Data Block Integrity Check

   NAME_ID                       3

   MODE                          0

   START_TIME                    2016-01-27 22:52:01.234049 +08:00

   RESUME_TIME                  

   END_TIME                      2016-01-27 22:52:01.533168 +08:00

   MODIFIED_TIME                 2016-01-27 22:53:09.975723 +08:00

   TIMEOUT                       0

   FLAGS                         0

   STATUS                        5

   SRC_INCIDENT_ID               0

   NUM_INCIDENTS                 0

   ERR_NUMBER                    0

   REPORT_FILE                   /u01/app/oracle/diag/rdbms/stone/stone/hm/HMREPORT_stone.hm

1 rows fetched

adrci> set homepath diag/rdbms/stone/stone

adrci> create report hm_run stone

adrci> show report hm_run stone

    HM Report: stone

   

        Data Block Integrity Check

        8585

        stone

        MANUAL

        COMPLETED

        0

        0

        0

        2016-01-27 22:52:01.234049 +08:00

        2016-01-27 22:52:01.533168 +08:00

   

   

        BLC_DF_NUM=8

        BLC_BL_NUM=131

   

   

       

            Media Block Corruption

            8589

            FAILURE

            OPEN

            HIGH

            0

            2016-01-27 22:52:01.526152 +08:00

            Block 131 in datafile 8: '/home/oracle/tstest.dbf' is media corrupt

            Object EMPTEST owned by SYS might be unavailable

       

   


相關習題:

(1)Examine the section of the Health Check report given below:

DBMS_HM.GET_RUN_REPORT ('HM_RUN_1061')

Run Name: HM_RUN_1061 Run Id: 1061        Check Name: Data Block Integrity

Check Mode: REACTIVE                      Status: COMPLETED              

Start Time: 2007-05-12 22:11:02.032292 -07:00     End Time: 2007-05-12 22:11:20.835135-07:00

Error Encountered: 0                        Source Incident Id: 7418                 

Number of Incidents Created: 0

Which two statements are true regarding the Health Check report? (Choose two.)

A.Health Check was performed manually.

B.Health Check was performed to check the disk image block corruptions.

C.Health Check was performed to check interblock and intersegment corruption.

D.Health Check was performed to verify the integrity of database files and report failures.

E.Health Check was performed by the Health Monitor automatically in response to a critical error.

答案:BE

(2)Which two statements are true regarding the Automatic Diagnostic Repository (ADR) in Oracle Database 11g? (Choose two.)

A.A single ADR can support multiple ADR homes for different database instances.

B.The alert files are stored in XML file format in the TRACE directory of each ADR home.

C.If the environmental variable ORACLE_BASE is set, then DIAGNOSTIC_DEST is set to $ORACLE_BASE.

D.The BACKGROUND_DUMP_DEST initialization parameter overrides the DIAGNOSTIC_DEST

initialization parameter for the location of the alert log file.

答案:AC

(3)Identify the three predefined server-generated alerts. (Choose three.)

A.Drop User

B.Tablespace Space Usage

C.Resumable Session Suspended

D.Recovery Area Low On Free Space

E.SYSTEM Tablespace Size Increment

答案:BCD

(4)102. Examine the following values of the initialization  parameters in the database having the SID ORCL:

BACKGROUND_DUMP_DEST=/u01/app/oracle/product/1 1. 1. 0/db_1/bdump

USER_DUMP_DEST=/ u01/app/oracle/product/1 1.1. 0/db_1/udump

CORE_DUMP_DEST=/u01/app/oracle/product/1 1.1. 0/db_1/cdump

DIAGNOSTIC_DEST=

The environment  variables have the following value:

ORACLE_BASE=/u01/ app/oracle

ORACLE_HOME=/u01/app/oracle/product/ 1 1.1.0/db_1

What  is the location of the  Automatic Diagnostic Repository (ADR)  home?

A.  /u01/app/oracle/product /11.1. 0/db_1

B.  /u01/app/oracle

C.  $ORACLE_HOME/bdump

D.  $ORACLE_HOME/log

答案:B

(5)You plan to collect the Automatic Workload Repository (AWR) data every Monday morning for a month. You want Oracle Database to automatically create a baseline every Monday and remove the old baseline. What is the correct action to achieve this?

A.Create and populate a SQL Tuning Set from the AWR on every Monday.

B.Change the RETENTION setting for the AWR snapshots to 7 days on Monday.

C.Create a repeating baseline template.

D.Insert a finding directive for future ADDM tasks.

答案:C

(6) Which tasks can be accomplished using the Enterprise Manager Support Workbench in Oracle Database 1 1g? (Choose all that apply .)
A. Generate reports on data failure such as data file failures.
B. You can track the Service Request (SR) and implement repairs.
C. You can package and upload diagnostic data to Oracle Support.
D. You can manually run health checks to gather diagnostic data for a problem.

答案:BCD

(7) Which statements are true regarding the concept of problems and incidents in the fault diagnosability infrastructure for Oracle Database 1 1g? (Choose all that apply .)
A. Only the incident metadata and dumps are stored in the  Automatic Diagnostic Repository (ADR).
B. The problem key is the same as the incident number .
C. The database sends an incident alert to the Oracle Enterprise Manager Database Home page.
D. Every problem has a problem key , which is a text string that describes the problem.
E. The database makes an entry into the alert log file when problems and incidents occur .

答案:CDE

(8)Which statement describes the significance of the CHANGE FAILURE command in RMAN?

(Choose all that apply.)

A.  It is used to change failure priority only for HIGH or LOW priorities.

B.  It is used to execute the advised repair script.

C.  It is used to change failure priority only for the CRITICAL priority.

D.  It is used to explicitly close the open failures.

E.  It is used to inform the database about the repair after the repair script executes.

答案:AD

(9) Which statements are true regarding the creation of an incident package file by using the EM Workbench Support? (Choose all that apply .)
A. You can add or remove the trace files to the package.
B. You can create the incremental incident package ZIP file for new or modified diagnostic information for the incident package already created.
C. You can add SQL  test cases to the incident package.
D. You cannot create an incremental incident package when the physical files are purged rom the  ADR.

答案:ABC

(10)Which two statements are true regarding Health Monitor checks in Oracle Database 11g?

(Choose two.)

A.  Health Monitor checks can be used to scan the contents of the redo log and archive logs for accessibility and corruption.

B.  Health Monitor checks can be used to verify the integrity of database files and report failures if these files are inaccessible, corrupt or inconsistent.

C.  Health Monitor checks can be used to verify the contents of dictionary entries for each dictionary object and fix it automatically.

D.  Health Monitor checks are always initiated manually when there is some critical error.

答案:AB

(11)You installed Oracle Database 1 1g afresh.  Which statements are true regarding the default audit settings in this database? (Choose all that apply .)
A. The audit trail is stored in an operating system file.
B.  Auditing is disabled for all privileges.
C. The audit trail is stored in the database.
D.  Auditing is enabled for all privileges.
E.  Auditing is enabled for certain privileges related to database security .

答案:CE

(12)The DB_BLOCK_CHECKING initialization parameter is set to OFF. Which block checking would be performed?

A.  The Oracle database will perform block checking for the index blocks only

B.  The Oracle database will not perform block checking for any of the data blocks

C.  The Oracle database will perform block checking for the default permanent tablespace only

D.  The Oracle database will perform block checking for the data blocks in all user tablespaces

E.  The Oracle database will perform block checking for the data blocks in the SYSTEM tablespace only

答案:E

(13)View  the Exhibit to examine the details  for  an incident.  Which statement is  true regarding the status  of the incident?

png此主題相關圖片如下164.png:
clipboard[27]

A. The incident has been newly created and is in the process of collecting diagnostic information.
B. The incident is now in the Done state and the  ADR can select the incident to be purged.
C. The DBA  is working on the incident and prefers that the incident be kept in the  ADR.
D. The data collection for the incident is complete and the incident can be packaged and sent to Oracle
Support.

答案:D

(14)Exhibit:

png[1]此主題相關圖片如下:
clipboard[28]
View the Exhibit to examine a portion of the output from the VALIDATE DATABASE command.

Which statement is true about the block corruption detected by the command?

A.  No action is taken except the output in the Exhibit.

B.  The ADVISE FAILURE command is automatically called to display the repair script.

C.  The failure is logged into the Automatic Diagnostic Repository (ADR).

D.  The corruption is repaired by the command implicitly.

答案:C

(15)View the Exhibit to examine the error during the database startup.

You open an RMAN session for the database instance. To repair the failure, you executed the following as the first command in the RMAN session:

RMAN> REPAIR FAILURE;

Which statement describes the consequence of the command?

此主題相關圖片如下:
clipboard[29]

A.  The command performs the recovery and closes the failures.

B.  The command only displays the advice and the RMAN script required for repair.

C.  The command produces an error because the ADVISE FAILURE command has not been executed before the REPAIR FAILURE command.

D.  The command executes the RMAN script to repair the failure and removes the entry from the Automatic Diagnostic Repository (ADR).

答案:C

(16)You executed the following PL/SQL block successfully:

VARIABLE tname VARCHAR2(20)

BEGIN

dbms_addm.insert_finding_directive (NULL, DIR_NAME=>'Detail CPU Usage',

FINDING_NAME=>'CPU Usage', MIN_ACTIVE_SESSIONS=>0, MIN_PERC_IMPACT=>90);

:tname := 'database ADDM task4';

dbms_addm.analyze_db(:tname, 150, 162);

END; /

Then you executed the following command:

SQL> SELECT dbms_addm.get_report(:tname) FROM DUAL;

The above command produces Automatic Database Diagnostic Monitor (ADDM) analysis ____.

A.  with the CPU Usage finding if it is less than 90

B.  without the CPU Usage finding if it is less than 90

C.  with the CPU Usage finding for snapshots below 90

D.  with the CPU Usage finding for snapshots not between 150 and 162

答案:B

(17)You plan to set up the Automatic Workload Repository (AWR) baseline metric thresholds for a moving window baseline. Which action would you take before performing this task?

A.  Compute the baseline statistics.

B.  Take an immediate AWR snapshot.

C.  Decrease the window size for the baseline.

D.  Decrease the expiration time for the baseline.

答案:A

(18)View  the  Exhibit to examine the  Automatic Database  Diagnostic  Monitor  (ADDM)  tasks.  Y ou execut ed
the following commands:
SQL> V AR tname V ARCHAR2(60);
SQL> BEGIN

:tname := 'my_instance_analysis_mode_task';
DBMS_ADDM.INSERT_SEGMENT_DIRECTIVE(:tname,'Sg_directive','SCOTT');
END;
Which statement describes the consequence?

此主題相關圖片如下:
clipboard[30]
A. The  ADDM task is filtered to suppress the Segment  Advisor suggestions for the SCOTT schema.
B. The  ADDM task is filtered to produce the Segment  Advisor suggestions for the SCOTT schema only .
C. The PL/SQL  block produces an error because the my_instance_analysis_mode_task task has not been reset to its initial state.
D.  All subsequent  ADDM tasks including my_instance_analysis_mode_task are filtered to suppress the Segment  Advisor suggestions for the SCOTT schema.

答案:C

(19)You executed the following commands in an RMAN session for your database instance that has failures:

RMAN> LIST FAILURE;

After some time, you executed the following command in the same session:

RMAN> ADVISE FAILURE;

But there are new failures recorded in the Automatic Diagnostic Repository (ADR) after the execution of the last LIST FAILURE command. Which statement is true for the above ADVISE FAILURE command in this scenario?

A.  It produces a warning for new failures before advising for CRITICAL and HIGH failures.

B.  It ignores new failures and considers the failures listed in the last LIST FAILURE command only.

C.  It produces an error with recommendation to run the LIST FAILURE command before the ADVISE FAILURE command.

D.  It produces advice only for new failures and the failures listed in the last LIST FAILURE command are ignored.

答案:A

(20)View the Exhibit to examine the output for the V$DIAG_INFO view.

Which statements are true regarding the location of diagnostic traces? (Choose all that apply.)

png[2]此主題相關圖片如下:
clipboard[31]

A.  The path to the location of the background as well as the foreground process trace files is /u01/oracle/diag/rdbms/orclbi/orclbi/trace.

B.  The location of the text alert log file is /u01/oracle/diag/rdbms/orclbi/orclbi/alert.

C.  The location of the trace file for the current session is /u01/oracle/diag/rdbms/orclbi/orclbi/trace.

D.  The location of the XML-formatted version of the alert log is /u01/oracle/diag/rdbms/orclbi/orclbi/alert.

答案:ACD

(21)Which two kinds of failures make the Data Recovery Advisor (DRA) generate a manual checklist?

(Choose two.)

A.  failures because a data file is renamed by error

B.  failures when no standby database is configured

C.  failures that require no archive logs to be applied for recovery

D.  failures due to loss of connectivity-for example, an unplugged disk cable

答案:AD

(22)Which two activities are NOT supported by the Data Recovery Advisor? (Choose two.)

A.  Diagnose and repair a data file corruption offline.

B.  Diagnose and repair a data file corruption online.

C.  Diagnose and repair failures on a standby database.

D.  Recover from failures in the Real Application Cluster (RAC) environment.

答案:CD

(23)Which statement  is true for  enabling Enterprise  Manager  Support  Workbench in Oracle  Database 1 1g to upload the physical files generated by Incident Packaging Service (IPS) to MetaLink?
A. The database must be running in  ARCHIVELOG mode.
B. No special setup is required, and the feature is enabled by default.
C. The path for the  Automatic Diagnostic Repository  (ADR) must be configured with the DIAGNOSTIC_DEST initialization parameter .
D. The Enterprise Manager Support  Workbench can be enabled only if the background process manageability monitor (MMON) is configured.
E. Select the Enable option in the Oracle Configuration Manager Registration window during the installation of the Oracle Database 1 1g software, provide valid MetaLink credentials and select license agreement.

答案:E

(24)View the Exhibit to examine the error while executing the REPAIR FAILURE command in an RMAN session.

png[3]此主題相關圖片如下:
clipboard[32]

What is the reason for this error?

A.  Another repair session is running concurrently.

B.  The failure ID has not been mentioned in the command for data file 5.

C.  There are new failures recorded in the Automatic Diagnostic Repository (ADR).

D.  The ADVISE FAILURE command has not been issued before the REPAIR FAILURE command.

答案:A

(25)Observe the following warning in an RMAN session of your database instance:

WARNING: new failures were found since last LIST FAILURE command

Which statement describes the scenario that must have produced this warning?

A.  The CHANGE FAILURE command has detected new failures recorded in the Automatic Diagnostic Repository(ADR).

B.  The VALIDATE DATABASE command has detected new failures recorded in the Automatic Diagnostic Repository (ADR).

C.  The ADVISE FAILURE command has detected new failures recorded in the Automatic Diagnostic Repository (ADR) since the last LIST FAILURE.

D.  The RECOVER command has detected new failures recorded in the Automatic Diagnostic Repository (ADR) since the last LIST FAILURE command was executed.

答案:C

(26)Which of the following initialization parameters have been deprecated in Oracle 11g because of

the introduction of the Automatic Diagnostic Repository? (Choose all that apply.)

A.  BACKGROUND_DUMP_DEST

B.  FOREGROUND_DUMP_DEST

C.  CORE_DUMP_DEST

D.  USER_DUMP_DEST

E.  DIAGNOSTIC_DEST

F.  All of the above

答案:ACD

(27)Which of the following statements is true regarding the initialization parameter DIAGNOSTIC_DEST?

A.The default value is the value of the environment variable $ORACLE_HOME; if $ORACLE_HOME isn't set,then the default is set to $ORACLE_BASE.

B.The default value is the value of the environment variable $ORACLE_BASE; if $ORACLE_BASE isn't set, then it is set to $ORACLE_HOME.

C. DIAGNOSTIC_DEST is always equal to $ORACLE_HOME.

D. DIAGNOSTIC_DEST is always equal to $ORACLE_BASE.

答案:B

(28)Which of these formats represents the correct hierarchy for the ADR?

A.  /rdbms/diag//

B.  /diag/rdbms//

C.  /diag/rdbms//

D.  None of the above

答案:C

(29)Which of the following are not fundamental tasks of the Support Workbench? (Choose all that

apply.)

A.  View long-running SQL workloads

B.  View problem details

C.  Gather additional diagnostic information

D.  Create a Service Request

E.  Clean up incident data after upload to Oracle Support

答案:AE

(30)Which of the following tasks does the tool Incident Packaging Service (IPS) perform?

A.  Cleans up the ADR by deleting files not associated with an incident uploaded

to Oracle Support.

B.  Identifies all files associated with a critical error and adds them to a zip file to 

be sent to Oracle Support.

C.  Automatically opens a Service Request with Oracle Support for each critical

error and sends all relevant files.

D.  Displays a high-level view of critical errors on the database home page.

答案:B

A.  Schedule, create new package, view manifest, view contents

B.  Create new package, view manifest, view contents, schedule

C.  Schedule, create new package, view contents, view manifest

D.  Create new package, view contents, view manifest, schedule

E.  None of the above.

答案:D

(32)Which statement is true regarding the VALIDATE DATABASE command?

A.  It checks the database for intrablock corruptions only.

B.  It checks for block corruption in the valid backups of the database.

C.  It checks the database for both intrablock and interblock corruptions.

D.  It checks for only those corrupted blocks that are associated with data files.

答案:A

(33)Which statement is true regarding the retention policy for the incidents accumulated in the Automatic Diagnostic Repository (ADR)?

A.  The incident metadata is purged when the problem is resolved and the DBA closes the SR.

B.  The incident files and dumps are not retained in the ADR for the manually created incidents.

C.  The incident files are retained but the incident metadata is purged when the problem is resolved and the DBA closes the SR

D.  The default setting is for one year after which the incident metadata is purged from the ADR and the files are retained for one month.

答案:D

(34)View the Exhibit to examine the error during the database startup. You open an RMAN session for the database instance. To repair the failure, you executed the following as the first command in the RMAN session:

RMAN> REPAIR FAILURE;

Which statement describes the consequence of the command? Exhibit:

此主題相關圖片如下:
clipboard[33]
A.The command performs the recovery and closes the failures.

B.The command executes the RMAN script to repair the failure and removes the entry from the Automatic

Diagnostic Repository (ADR).

C.The command only displays the advice and the RMAN script required for repair.

D.The command produces an error because the ADVISE FAILURE command has not been executed before the REPAIR FAILURE command.

答案:D

(35)View the Exhibit and note the contents of V$DIAG_INFO. Which statement is true about the ADR?

Exhibit:

png[4]此主題相關圖片如下:
clipboard[34]

A.  The text alert log file will be available in Diag Trace

B.  A copy alert log file will be kept in Diag Incident for every incident.

C.  The XML version of the alert log file will be available in Diag Trace.

D.  An Automatic Database Diagnostic Management (ADDM) report is generated and stored in the Health Monitor whenever an incident occurs.

答案:A

(36)Which two statements are true about the Automatic Diagnostic Repository (ADR)? (Choose two.)

A.  The ADR base keeps all diagnostic information in binary format

B.  SQL*Plus provides the ADRI script, which can be used to work with ADR

C.  The ADR can be used for problem diagnosis only when the database is open

D.  The ADR can be disabled by settting the DIAGNOSTIC_DEST parameter to null

E.  The ADR can be used for problem diagnosis even when the database instance is down

F.  The ADR base is shared across multiple instances

答案:EF

(37)Examine the following RMAN script:

RMAN> run {

debug on;

allocate channel c1 type disk;

backup datafile 5;

}

Which statement describes the purpose of the script?

A.  The data file is checked for physical corruption and backed up if found clean.

B.  The backup of data file 5 is performed and the interactive messages during the backup are suppressed.

C.  The existing backup for the data file is checked and the backup is performed if there are changes in the data file after the last backup.

D.  The backup of data file 5 is performed and all SQL statements that are executed during RMAN compilation and their results are displayed

答案:D

(38)The environmental variable ORACLE-BASE is set. You want to check the diagnostic files created as part of the Automatic Diagnostic Repository (ADR). View the Exhibit and note the various parameters set in your database.

What will be the location of the ADR base?

此主題相關圖片如下:
clipboard[35]

A.It is set to ORACLE-BASE.

B.It is set to ORACLE_HOME/log.

C.It is set to /u01/app/oracle/admin/orcl/adump.

D.It is set to /u01/app/oracle/flash_recovery_area.

答案:A

(39)View the Exhibit and examine the parameter settings in your server-side parameter file (SPFILE).

png[5]此主題相關圖片如下:
clipboard[36]

When you tried to start the database instance, you received the following error:

png[6]此主題相關圖片如下:
clipboard[37]

Why did the instance fail to start?

A.Because the PGA_AGGREGATE_TARGET parameter is not set

B.Because the STATISTICS_LEVEL parameter is set to BASIC

C.Because MEMORY_TARGET and MEMORY_MAX_TARGET cannot be equal D.Because both the SGA_TARGET and MEMORY_TARGET parameters are set.

答案:B

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/28536251/viewspace-2101319/,如需轉載,請註明出處,否則將追究法律責任。

相關文章