11g_Health_Monitor

zhyuh發表於2009-09-27
健康監控檢查(Health Monitor Check)也是Oracle11g中新出現的功能,用於對資料庫作診斷性檢查,主要包括以下檢查項。[@more@]

SQL> select name,INTERNAL_CHECK, OFFLINE_CAPABLE, description from v$hm_check order by 2,3;

NAME INTERNAL_CHECK OFFLINE_CAPABLE DESCRIPTION
--------------------------------- -------------- --------------- ---------------------------------------------
Undo Segment Integrity Check N N Checks integrity of an undo segment
Transaction Integrity Check N N Checks a transaction for corruptions
Dictionary Integrity Check N N Checks dictionary integrity
Data Block Integrity Check N Y Checks integrity of a datafile block
Redo Integrity Check N Y Checks integrity of redo log content
DB Structure Integrity Check N Y Checks integrity of all database files
Y N
Logical Block Check Y N Checks logical content of a block
Txn Revalidation Check Y N Revalidate corrupted txn
Block IO Revalidation Check Y Y Checks file accessability
Failure Simulation Check Y Y Creates dummy failures
IO Revalidation Check Y Y Checks file accessability
Redo Revalidation Check Y Y Checks redo log content
Archived Log Check Y Y Checks an archived log
Log Group Member Check Y Y Checks a particular member of a log group
Log Group Check Y Y Checks all members of a log group
Single Datafile Check Y Y Checks a datafile
All Datafiles Check Y Y Check for all datafiles in the database
CF Member Check Y Y Checks a multiplexed copy of the control file
All Control Files Check Y Y Checks all control files in the database
HM Test Check Y Y Check for HM Functionality

21 rows selected

其中INTERNAL_CHECK為'N'的檢查項表示使用者能手動執行,而INTERNAL_CHECK='Y'表示系統內部檢查,試圖手動執行時會報ORA-51001錯誤--在HM目錄中未找到檢查[Single Datafile Check]。
手動執行檢查,可以用OEM或者DBMS_HM.RUN_CHECK包。
DBMS_HM.RUN_CHECK的主要引數:
CHECK_NAME -> v$hm_check.name列的值。
RUN_NAME -> 使用者指定或系統自動生成的名字,生成報告用到。
TIME_OUT -> 執行檢查的時間限制。
INPUT_PARAMS -> 輸入引數。

下面的SQL列出各個檢查所需要的輸入引數。
SQL> select c.name check_name, p.name parameter_name,
2 p.default_value, p.description
3 from v$hm_check_param p, v$hm_check c
4 where p.check_id = c.id and c.internal_check = 'N'
5 order by c.name;

CHECK_NAME PARAMETER_NAME DEFAULT_VALUE DESCRIPTION
------------------------------ ---------------- ---------------- ---------------------------------------
Data Block Integrity Check BLC_DF_NUM Block Data File number
Data Block Integrity Check BLC_BL_NUM Datablock number
Dictionary Integrity Check CHECK_MASK ALL Check Mask
Dictionary Integrity Check TABLE_NAME ALL_CORE_TABLES Table Name
Redo Integrity Check SCN_TEXT 0 SCN of the latest good redo (if known)
Transaction Integrity Check TXN_ID Transaction ID
Undo Segment Integrity Check USN NUMBER Undo Segment Number

7 rows selected


下面的例子,我們要做一個Transaction Integrity Check。
1. 新開一個視窗,製造一個transaction,注意不執行隨後的commit或者rollback。
SQL> delete scott.temp1 where object_id=25;
64 rows deleted

2. 在原先視窗中找到這個transaction相關資訊。
SQL> select * from v$transaction;
ADDR XIDUSN XIDSLOT XIDSQN
-------- ---------- ---------- ----------
2EBD5250 3 29 777

3. 執行檢查(注意transaction_id/TXN_ID由XIDUSN.XIDSLOT.XIDSQN拼成。測試中如果TXN_ID給一個錯誤的值,則必發生ORA-00600或者ORA-03113錯誤,這個比較奇怪,估計11g R1版本這方面還是做的有些瑕疵)
SQL> exec dbms_hm.run_check (check_name => 'Transaction Integrity Check',run_name => 'test_tx_ck1',input_params => 'TXN_ID=3.29.777');
PL/SQL procedure successfully completed


4. 生成檢查報告
SQL> set long 10000
SQL> select dbms_hm.get_run_report('test_tx_ck1') from dual;

DBMS_HM.GET_RUN_REPORT('TEST_T
--------------------------------------------------------------------------------
Basic Run Information
Run Name : test_tx_ck1
Run Id : 41
Check Name : Transaction Integrity Check
Mode : MANUAL
Status : COMPLETED
Start Time : 2009-02-17 14:07:44.058000 +08:00
End Time : 2009-02-17 14:07:44.599000 +08:00
Error Encountered : 0
Source Incident Id : 0
Number of Incidents Created : 0

Input Paramters for the Run
TXN_ID=3.29.777

Run Findings And Recommendations
Finding
Finding Name : TXN not corrupt
Finding ID : 42
Type : INFORMATIONAL
Status : OPEN
Priority : HIGH
Message : Transaction 3.29.777 is not corrupted

5. 在ADRCI工具中也可以檢視Health Monitor執行的相關資訊。
adrci> set home diagrdbmsorclorcl
adrci> show hm_run

ADR Home = d:apphz00540diagrdbmsorclorcl:
*************************************************************************

......
**********************************************************
HM RUN RECORD 2
**********************************************************
RUN_ID 21
RUN_NAME TestCheck1
CHECK_NAME Dictionary Integrity Check
NAME_ID 24
MODE 0
START_TIME 2009-02-16 13:37:37.099000 +08:00
RESUME_TIME
END_TIME 2009-02-16 13:37:44.329000 +08:00
MODIFIED_TIME 2009-02-17 14:08:49.045000 +08:00
TIMEOUT 0
FLAGS 0
STATUS 5
SRC_INCIDENT_ID 0
NUM_INCIDENTS 0
ERR_NUMBER 0
REPORT_FILE d:apphz00540diagrdbmsorclorclhmHMREPORT_TestCheck1.hm
......

Health Monitor相關的資料字典 V$HM_CHECK,V$HM_CHECK_PARAM,V$HM_FINDING,V$HM_INFO,V$HM_RECOMMENDATION,V$HM_RUN。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/207/viewspace-1027398/,如需轉載,請註明出處,否則將追究法律責任。