Oracle 11.2 RAC: Applications Disconnect Intermittently

Posted by 聽海★藍心夢 on 2013-09-30

Environment:

Operating system: HP-UX 11.31

Database: Oracle 11.2.0.3.6 RAC

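Fault:

Applications connected to node 1 keep getting disconnected on their own. They report a listener error and cannot connect to the database.

Check the logs on node 1: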
/oracle/app/grid11.2.0/log/racdb1/alertracdb1.log

2013-09-30 10:27:56.609
[crsd(6294)]CRS-2765:Resource 'ora.net1.network' has failed on server 'racdb1'.
2013-09-30 10:33:17.086
[crsd(6294)]CRS-2765:Resource 'ora.net1.network' has failed on server 'racdb1'.
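
To see how often the network resource fails, the CRS-2765 entries can be pulled out of the alert log and the gaps between them measured. The following is a minimal Python sketch, not an Oracle tool: it assumes the layout shown above (a timestamp line followed by the message line), and the function name `find_resource_failures` is hypothetical.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
# Matches the CRS-2765 message exactly as it appears in the excerpt above.
FAILURE_RE = re.compile(r"CRS-2765:Resource '([^']+)' has failed on server '([^']+)'")


def find_resource_failures(log_text):
    """Return (timestamp, resource, server) for each CRS-2765 entry.

    Assumes every message line is preceded by a line that holds only
    its timestamp, as in the alert log excerpt.
    """
    failures = []
    last_ts = None
    for line in log_text.splitlines():
        line = line.strip()
        try:
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        match = FAILURE_RE.search(line)
        if match and last_ts is not None:
            failures.append((last_ts, match.group(1), match.group(2)))
    return failures


# The two entries quoted from the alert log above.
SAMPLE = """\
2013-09-30 10:27:56.609
[crsd(6294)]CRS-2765:Resource 'ora.net1.network' has failed on server 'racdb1'.
2013-09-30 10:33:17.086
[crsd(6294)]CRS-2765:Resource 'ora.net1.network' has failed on server 'racdb1'.
"""

failures = find_resource_failures(SAMPLE)
# Seconds between consecutive failures.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(failures, failures[1:])]

for ts, resource, server in failures:
    print(ts, resource, server)
print("gaps in seconds:", gaps)
```

For the two entries above, the gap is a little over five minutes. Short, irregular gaps like this fit an intermittent fault rather than a permanent network outage.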

/oracle/app/grid11.2.0/log/racdb1/agent/crsd/orarootagent_root/orarootagent_root.log
2013-09-30 10:33:16.034: [ default][10854]ICMP Ping from 192.168.66.129 to 192.168.66.1
2013-09-30 10:33:17.069: [ora.net1.network][10854] {0:2:15033} [check] NetworkAgent::checkLink returned false
2013-09-30 10:33:17.070: [    AGFW][10] {0:2:15033} ora.net1.network racdb1 1 state changed from: ONLINE to: OFFLINE
2013-09-30 10:33:17.071: [    AGFW][10] {0:2:15033} Switching online monitor to offline one
2013-09-30 10:33:17.071: [    AGFW][10] {0:2:15033} Started implicit monitor for [ora.net1.network racdb1 1] interval=60000 delay=60000
2013-09-30 10:33:17.071: [    AGFW][10] {0:2:15037} Generating new Tint for unplanned state change. Original Tint: {0:2:15033}
2013-09-30 10:33:17.071: [    AGFW][10] {0:2:15037} Agent sending message to PE: RESOURCE_STATUS[Proxy] ID 20481:1367803
2013-09-30 10:33:17.134: [    AGFW][10] {0:2:15037} Agent received the message: RESOURCE_START[ora.net1.network racdb1 1] ID 4098:116509
2013-09-30 10:33:17.134: [    AGFW][10] {0:2:15037} Preparing START command for: ora.net1.network racdb1 1
2013-09-30 10:33:17.134: [    AGFW][10] {0:2:15037} ora.net1.network racdb1 1 state changed from: OFFLINE to: STARTING
2013-09-30 10:33:17.140: [ora.net1.network][10855] {0:2:15037} [start] (:CLSN00107:) clsn_agent::start {
2013-09-30 10:33:17.140: [ora.net1.network][10855] {0:2:15037} [start] NetworkAgent::init enter {
2013-09-30 10:33:17.141: [ora.net1.network][10855] {0:2:15037} [start] Checking if lan900 Interface is fine
2013-09-30 10:33:17.211: [    AGFW][10] {0:2:15037} Agent received the message: RESOURCE_PROBE[ora.racdb1.vip 1 1] ID 4097:116510
2013-09-30 10:33:17.212: [    AGFW][10] {0:2:15037} Preparing CHECK command for: ora.racdb1.vip 1 1
2013-09-30 10:33:17.222: [    AGFW][10] {0:2:15037} Agent sending last reply for: RESOURCE_PROBE[ora.racdb1.vip 1 1] ID 4097:116510
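
The agent log is noisy, so it helps to reduce it to the two things that matter here: the failed link checks and the resource state changes. The Python sketch below does that under a few assumptions: each log entry sits on a single line, every entry starts with a 23-character timestamp as in the excerpt, and the helper name `extract_events` is hypothetical.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
# "<resource> <server> <n> state changed from: X to: Y", as in the AGFW lines above.
STATE_RE = re.compile(r"(\S+) (\S+) (\d+) state changed from: (\w+) to: (\w+)")
CHECKLINK_FAILED = "NetworkAgent::checkLink returned false"


def extract_events(log_text):
    """Return (timestamp, description) for link-check failures and state changes."""
    events = []
    for line in log_text.splitlines():
        try:
            # The first 23 characters hold "YYYY-MM-DD HH:MM:SS.mmm".
            ts = datetime.strptime(line[:23], TS_FORMAT)
        except ValueError:
            continue  # continuation line or anything without a timestamp
        if CHECKLINK_FAILED in line:
            events.append((ts, "checkLink returned false"))
            continue
        match = STATE_RE.search(line)
        if match:
            resource, server, _, old, new = match.groups()
            events.append((ts, f"{resource} on {server}: {old} -> {new}"))
    return events


# Three lines copied from the orarootagent log excerpt above.
SAMPLE = """\
2013-09-30 10:33:17.069: [ora.net1.network][10854] {0:2:15033} [check] NetworkAgent::checkLink returned false
2013-09-30 10:33:17.070: [    AGFW][10] {0:2:15033} ora.net1.network racdb1 1 state changed from: ONLINE to: OFFLINE
2013-09-30 10:33:17.134: [    AGFW][10] {0:2:15037} ora.net1.network racdb1 1 state changed from: OFFLINE to: STARTING
"""

events = extract_events(SAMPLE)
for ts, description in events:
    print(ts.time(), description)
```

Read in order, the events show the sequence behind the outage: the link check fails, the agent marks ora.net1.network OFFLINE, and Clusterware immediately tries to start it again.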

BUG:
HP-UX: GI ora.net1.network Goes Offline/Online Intermittently with "NetworkAgent::checkLink returned false" (Doc ID 1534994.1)

Cause
The issue was investigated in Bug 16039587. The cause is an HP-UX bug: contention on the address memory range lock in kernel memory causes poll(2) to time out, which affects the orarootagent process.

Solution
Apply OS kernel patch PHKL_42850.
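
Before and after patching, it is worth confirming whether PHKL_42850 is already on the host. The Python sketch below is one way to do that. It assumes the HP-UX `swlist -l patch` command is available on the host and that an installed patch shows up as a token beginning with the patch ID. The helper names are hypothetical, and the exact listing format should be checked against the host's own output.

```python
import subprocess

PATCH_ID = "PHKL_42850"  # the kernel patch named in the Solution above


def patch_listed(listing, patch_id=PATCH_ID):
    """Return True if any whitespace-separated token starts with patch_id.

    Assumption: installed patches appear in the listing as tokens that
    begin with the patch ID (for example a patch or fileset name).
    """
    for line in listing.splitlines():
        if any(token.startswith(patch_id) for token in line.split()):
            return True
    return False


def installed_patch_listing():
    """Run swlist at the patch level and return its standard output."""
    result = subprocess.run(
        ["swlist", "-l", "patch"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        listing = installed_patch_listing()
    except FileNotFoundError:
        # swlist only exists on HP-UX; elsewhere there is nothing to check.
        print("swlist not found; run this on the HP-UX host")
    else:
        print(PATCH_ID, "listed:", patch_listed(listing))
```

Splitting the check into a pure function and a thin command wrapper keeps the matching logic usable against saved `swlist` output from another node.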

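After the patch was applied, the system returned to normal. The application no longer disconnects, and the cluster resources no longer go offline.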


Source: ITPUB blog, link: http://blog.itpub.net/751371/viewspace-773639/. If you repost this article, please credit the source; otherwise legal action may be pursued.
