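ASM Heartbeat Timeout Detection: Delayed ASM PST heart beats
Recently I have received a series of reports of ASM disk groups being dismounted with the error "Waited 15 secs for write IO to PST". The message comes from a heartbeat timeout check that is specific to ASM: the ASM instance periodically checks whether each ASM disk responds normally. I decided to write a short summary of the issue.
The note ASM diskgroup dismount with "Waited 15 secs for write IO to PST" (Doc ID 1581684.1) contains the following description: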
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Generally this kind messages comes in ASM alertlog file on below situations,
Delayed ASM PST heart beats on ASM disks in normal or high redundancy diskgroup,
thus the ASM instance dismount the diskgroup.By default, it is 15 seconds.
By the way the heart beat delays are sort of ignored for external redundancy diskgroup.
ASM instance stop issuing more PST heart beat until it succeeds PST revalidation,
but the heart beat delays do not dismount external redundancy diskgroup directly.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
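The description above can be read as the following points:
1. The ASM instance periodically checks the disks of every disk group to confirm that they respond normally;
2. A delayed heartbeat leads to a dismount only for normal and high redundancy; for external redundancy the delays are largely ignored and do not dismount the disk group directly;
3. The default timeout is 15 seconds: if a disk has still not responded to the ASM instance after 15 seconds, the ASM instance dismounts the disk group.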
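The three points can be condensed into a small decision rule. The Python sketch below is only a toy model of that reading of the note, not Oracle's implementation: the names Redundancy and heartbeat_delay_dismounts are hypothetical, the disk group is treated as a single unit, and counting a delay of exactly 15 seconds as a timeout is an assumption.

```python
from enum import Enum


class Redundancy(Enum):
    EXTERNAL = "external"
    NORMAL = "normal"
    HIGH = "high"


# Default taken from the quoted note ("By default, it is 15 seconds").
DEFAULT_HBEAT_IO_WAIT_SECS = 15


def heartbeat_delay_dismounts(redundancy, delay_secs,
                              timeout_secs=DEFAULT_HBEAT_IO_WAIT_SECS):
    """Toy rule: does a delayed PST heartbeat directly dismount the disk group?"""
    if redundancy is Redundancy.EXTERNAL:
        # Per the note, delays do not dismount external redundancy directly.
        return False
    # Normal and high redundancy: a delay reaching the timeout leads to a
    # dismount (the >= boundary is an assumption of this sketch).
    return delay_secs >= timeout_secs


if __name__ == "__main__":
    for redundancy in Redundancy:
        result = heartbeat_delay_dismounts(redundancy, delay_secs=20)
        print(redundancy.value, result)
```

Passing a larger timeout_secs to the function mimics raising the wait time, which is the tuning discussed later in this post.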
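The customers who hit this problem were all using storage attached over a Fibre Channel network, and the error appeared when the storage network had problems. In other words, when ASM issues its periodic check and a disk does not respond within 15 seconds, ASM considers that disk inaccessible.
I tried to reproduce the error in a test environment. Because the test environment was a VMware virtual machine, removing a disk at the hardware level did not trigger it. The reason is that when a disk on the same host is removed abnormally, ASM's read operation immediately returns an OS-level I/O error, so there is no wait for the "Waited 15 secs for write IO to PST" timeout.
My conclusion is that this error only occurs when the shared ASM disks are not local to the physical host but sit behind a storage network, and the check that ASM sends out is not answered in time. The cause may then be the storage host, the storage network, or even the storage disks themselves. Either way, ASM has not received the acknowledgement it needs and treats the disk as faulty; if enough disks are affected to threaten data integrity, ASM dismounts the disk group.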
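The difference between a local disk that fails at once and a disk behind a storage network that simply stops answering can be pictured with a small model. The Python sketch below is hypothetical and does not reflect ASM internals: HeartbeatWrite and classify are illustrative names, and the 15-second default is the value quoted from the note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatWrite:
    """Outcome of one hypothetical PST heartbeat write."""
    completed_after_secs: Optional[float]  # None means the write never returned
    io_error: bool = False                 # True if the OS reported an I/O error


def classify(write, timeout_secs=15):
    """Return 'ok', 'io_error', or 'heartbeat_timeout' for one write."""
    answered = write.completed_after_secs is not None
    if answered and write.completed_after_secs < timeout_secs:
        # The OS answered in time, either with success or with an error.
        return "io_error" if write.io_error else "ok"
    # No answer inside the wait window: the situation behind
    # "Waited 15 secs for write IO to PST".
    return "heartbeat_timeout"


if __name__ == "__main__":
    local_disk_removed = HeartbeatWrite(completed_after_secs=0.01, io_error=True)
    storage_network_hang = HeartbeatWrite(completed_after_secs=None)
    print(classify(local_disk_removed))
    print(classify(storage_network_hang))
```

In this model the VMware test falls into the io_error branch, which is why the timeout message never appeared there.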
According to Doc ID 1581684.1, the "Waited 15 secs for write IO to PST" message was introduced in 11.2.0.3.0. The note also describes how to change this timeout manually through the parameter _asm_hbeatiowait:
alter system set "_asm_hbeatiowait"=
To confirm that this parameter first appears in 11.2.0.3, I ran the query against each database version. The results are shown below:
======================10.2=====================
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production
CORE 10.2.0.5.0 Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm%' order by ksppinm;
hidden parameter value
-------------------------------------------------------------------------------- ----------
_asm_acd_chunks 1
_asm_allow_only_raw_disks TRUE
_asm_allow_resilver_corruption FALSE
_asm_ausize 1048576
_asm_blksize 4096
_asm_direct_con_expire_time 120
_asm_disk_repair_time 14400
_asm_droptimeout 60
_asm_emulmax 10000
_asm_emultimeout 0
_asm_fob_tac_frequency 3
hidden parameter value
-------------------------------------------------------------------------------- ----------
_asm_instlock_quota 0
_asm_kfdpevent 0
_asm_libraries ufs
_asm_maxio 1048576
_asm_skip_resize_check FALSE
_asm_stripesize 131072
_asm_stripewidth 8
_asm_wait_time 18
_asmlib_test 0
_asmsid asm
21 rows selected.
======================11.2.0.1=====================
sqlplus / as sysdba
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
--------------------------------------------------------------------------------
_asm_hbeatwaitquantum 2
======================11.2.0.2=====================
$ sqlplus / as sysdba
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining
and Real Application Testing options
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
--------------------------------------------------------------------------------
_asm_hbeatwaitquantum 2
The parameter only appears starting with 11.2.0.3.0, which means the ASM instance's disk heartbeat timeout check was introduced in 11.2.0.3.
======================11.2.0.3=====================
sys@R11203> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
-------------------------------------------------- --------------------
_asm_hbeatiowait 15
_asm_hbeatwaitquantum 2
======================11.2.0.4=====================
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
-------------------------------------------------------------------------------- ---------
_asm_hbeatiowait 15 <<<<<<<<<<<<<<<<<<<
_asm_hbeatwaitquantum 2
======================12.1.0.1=====================
$ sqlplus / as sysdba
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
--------------------------------------------------------------------------------
_asm_hbeatiowait 15
_asm_hbeatwaitquantum 2
Starting with 12.1.0.2, the default value of this parameter was changed to 120 seconds.
======================12.1.0.2=====================
$ sqlplus / as sysdba
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> select ksppinm as "hidden parameter", ksppstvl as "value" from x$ksppi join x$ksppcv using (indx) where ksppinm like '\_%' escape '\' and ksppinm like '%asm_hb%' order by ksppinm;
hidden parameter value
--------------------------------------------------------------------------------
_asm_hbeatiowait 120
_asm_hbeatwaitquantum 2
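The query results above can be summarized as a lookup from version to default value. The Python sketch below only encodes what was observed in this post (10.2.0.5, 11.2.0.1, 11.2.0.2, 11.2.0.3, 11.2.0.4, 12.1.0.1 and 12.1.0.2); the function names are hypothetical, and the answer for any version that was not tested here is an extrapolation, not a confirmed fact.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '11.2.0.3.0' into (11, 2, 0, 3, 0) so versions compare correctly."""
    return tuple(int(part) for part in text.split("."))


def default_hbeatiowait(version: str) -> Optional[int]:
    """Default _asm_hbeatiowait in seconds, or None if the parameter is absent."""
    parsed = parse_version(version)
    if parsed < (11, 2, 0, 3):
        # Not listed in the 10.2.0.5, 11.2.0.1 and 11.2.0.2 outputs.
        return None
    if parsed < (12, 1, 0, 2):
        # Observed in 11.2.0.3, 11.2.0.4 and 12.1.0.1.
        return 15
    # Observed in 12.1.0.2; later versions are assumed, not tested here.
    return 120


if __name__ == "__main__":
    for tested in ("10.2.0.5.0", "11.2.0.2.0", "11.2.0.3.0",
                   "12.1.0.1.0", "12.1.0.2.0"):
        print(tested, default_hbeatiowait(tested))
```

Comparing tuples of integers rather than raw strings avoids the trap where "10.2" sorts after "11.2" in some string comparisons or where "9" sorts after "12".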
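I hope this summary is helpful. In day-to-day work I often find that a problem looks simple yet I am not sure of the answer, so I test it and write down the result for future reference.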
Source: ITPUB blog, link: http://blog.itpub.net/22990797/viewspace-1655015/. If you repost this article, please credit the source; otherwise legal action may be taken.