Oracle 9i, 10g, and 11gR1 RAC on Linux All Require the Hangcheck-Timer Module
Hangcheck-Timer Module Requirements for Oracle 9i, 10g, and 11g RAC on Linux [ID 726833.1]
Modified: 29-JUL-2010 | Type: REFERENCE | Status: PUBLISHED
Applies to:
Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.7 [Release: 10.1 to 11.1]
Oracle Server - Enterprise Edition - Version: 9.2.0.8 to 11.1.0.7 [Release: 9.2 to 11.1]
Linux x86
Linux x86-64
Purpose
The hangcheck-timer module is required to run a supported configuration in Oracle Real Application Clusters (RAC) environments on Linux with Oracle releases 9i, 10g, or 11g. This note outlines the requirements for configuring hangcheck-timer in an Oracle Enterprise Linux, Red Hat Linux, or SUSE Linux environment.
Scope
This article is intended for product management, system architects, and system administrators involved in deploying and configuring Oracle RAC 9i, 10g, or 11g in a Linux environment. It is also useful to field engineers and consulting organizations handling installation and configuration requirements for Oracle in a Linux RAC environment.
Hangcheck-Timer Module Requirements for Oracle 9i, 10g, and 11g RAC on Linux
Starting with release 9.2.0.2, Oracle RAC environments require a new I/O fencing mechanism, the hangcheck-timer module. This module was implemented to replace the Watchdog module, which provided similar fencing functionality. Hangcheck-timer was subsequently delivered as part of the standard kernel distribution for Linux kernel releases 2.4 and above.
Hangcheck-timer should be loaded at boot time. It monitors the Linux kernel for long operating system hangs that could affect the reliability of a RAC node. It runs in kernel mode and uses the Time Stamp Counter (TSC) to catch scheduling delays or node hangs: it sets a timer and, when the timer fires, checks whether it was delayed by more than the allowed margin. If the elapsed time exceeds (hangcheck_tick + hangcheck_margin) seconds, the machine is restarted. Hangcheck-timer will not cause reboots to occur due to CPU starvation.
Hangcheck-timer requires three configuration parameters:
- hangcheck_tick - defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds.
- hangcheck_margin - defines how much margin is allowed, in seconds, between expected scheduling and real scheduling time. The default value is 180 seconds.
- hangcheck_reboot - determines if the hangcheck-timer restarts the node if the kernel fails to respond within the sum of the hangcheck_tick and hangcheck_margin parameter values. If the value of hangcheck_reboot is equal to or greater than 1, then the hangcheck-timer module restarts the system. If the hangcheck_reboot parameter is set to zero, then the hangcheck-timer module will not reboot the node, even if a hang is detected. The default value varies by kernel version. In the 2.4 kernel, the default is 1. In 2.6 kernels, the default is 0.
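The way the three parameters interact can be sketched as a small decision function. This is only an illustrative model of the behaviour described above, not the kernel module's source; the function name `hangcheck_action` and its return labels are made up for this example.

```python
def hangcheck_action(elapsed_seconds, hangcheck_tick, hangcheck_margin, hangcheck_reboot):
    """Classify one timer firing (illustrative model, not kernel code).

    elapsed_seconds: time that actually passed since the timer was armed.
    Returns one of the hypothetical labels "ok", "reboot", or "log-only".
    """
    allowed = hangcheck_tick + hangcheck_margin
    if elapsed_seconds <= allowed:
        # The timer fired within tick + margin: no hang detected.
        return "ok"
    if hangcheck_reboot >= 1:
        # Hang detected and rebooting is enabled.
        return "reboot"
    # Hang detected, but hangcheck_reboot is 0: the node is not restarted.
    return "log-only"


if __name__ == "__main__":
    # 10g/11g style settings: tick=1, margin=10
    print(hangcheck_action(2, 1, 10, 1))
    print(hangcheck_action(15, 1, 10, 1))
    print(hangcheck_action(15, 1, 10, 0))
```

The third call illustrates why the 2.6 kernel default of hangcheck_reboot=0 matters: the same delay is detected, but no restart follows.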
Recommended settings by release:
- 9i: Assuming the default "oracm misscount" setting of 220 seconds:
hangcheck_tick=30 hangcheck_margin=180 hangcheck_reboot=1
- 10g/11g: Assuming the default "CSS misscount" setting of either 30 or 60 seconds:
hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
You must always ensure that the cluster misscount setting is greater than the sum of hangcheck_tick and hangcheck_margin.
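The misscount rule is simple arithmetic, and a quick check against the settings listed in this note might look like the sketch below. The helper name `misscount_is_safe` is hypothetical; the numbers are the ones given above.

```python
def misscount_is_safe(misscount, hangcheck_tick, hangcheck_margin):
    """Return True when misscount is strictly greater than tick + margin."""
    return misscount > hangcheck_tick + hangcheck_margin


if __name__ == "__main__":
    # (release, misscount, hangcheck_tick, hangcheck_margin) taken from this note
    settings = [
        ("9i", 220, 30, 180),
        ("10g/11g", 30, 1, 10),
        ("10g/11g", 60, 1, 10),
    ]
    for release, misscount, tick, margin in settings:
        verdict = "ok" if misscount_is_safe(misscount, tick, margin) else "TOO LOW"
        print(f"{release}: misscount={misscount}, tick+margin={tick + margin}: {verdict}")
```

For 9i the margin is narrow (220 against 210), so lowering misscount without also lowering the hangcheck values would break the rule.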
When running Oracle Clusterware on Linux, hangcheck-timer should always be configured on each RAC cluster node. The module provides the I/O fencing that ensures no stray writes occur from an evicted node in a RAC cluster. To verify that the hangcheck-timer module is loaded on a node, run the following as the root or oracle user:
lsmod | grep -i hang
hangcheck-timer 2672 0
If the hangcheck-timer module is loaded (running), you will see output similar to the above. When hangcheck-timer is not loaded, no output is generated and the command prompt is returned to the user.
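The same check can be scripted. The sketch below scans a module listing for the hangcheck module; it assumes the listing has the module name in the first column, as in the sample output above and in /proc/modules. The function names are hypothetical, and matching both the hyphen and underscore spellings is an assumption made because kernels commonly report the name as hangcheck_timer.

```python
def hangcheck_loaded(module_listing):
    """Return True if a hangcheck-timer entry appears in lsmod-style text."""
    for line in module_listing.splitlines():
        fields = line.split()
        # First column is the module name; normalise '-' to '_' before comparing.
        if fields and fields[0].replace("-", "_") == "hangcheck_timer":
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's module list; return '' if the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("sample output loaded:", hangcheck_loaded("hangcheck-timer 2672 0"))
    print("this machine loaded:", hangcheck_loaded(read_proc_modules()))
```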
In an Oracle Enterprise Linux, Red Hat 4/5, or SUSE 9/10 environment, the hangcheck-timer module is loaded using the modprobe command, for example with the 10g/11g settings shown earlier:
modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
To ensure the module is loaded at boot time, also place the same command in the appropriate local startup script (e.g. /etc/rc.d/rc.local or /etc/init.d/boot.local). In earlier releases, hangcheck-timer was loaded using insmod in place of modprobe. Consult your release-specific documentation to determine which initialization method is required.
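If the load command is generated by a provisioning script rather than typed by hand, it could be assembled along these lines. This is a minimal sketch: `modprobe_command` is a hypothetical helper, it only builds the argument list and does not run anything, and it relies on modprobe accepting `name=value` module parameters after the module name.

```python
def modprobe_command(hangcheck_tick, hangcheck_margin, hangcheck_reboot=1):
    """Build the modprobe argument list for hangcheck-timer (does not execute it)."""
    params = {
        "hangcheck_tick": hangcheck_tick,
        "hangcheck_margin": hangcheck_margin,
        "hangcheck_reboot": hangcheck_reboot,
    }
    for name, value in params.items():
        # All three parameters are whole, non-negative numbers.
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return ["modprobe", "hangcheck-timer"] + [f"{k}={v}" for k, v in params.items()]


if __name__ == "__main__":
    # 10g/11g settings from this note; the printed line is what would go in rc.local.
    print(" ".join(modprobe_command(1, 10, 1)))
```

Returning a list rather than a string keeps the result usable with `subprocess.run` without shell quoting, should the caller decide to execute it as root.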
Hangcheck-timer writes to the system messages log when a failure is detected and a node restart is initiated by the module:
- When hangcheck-timer reboots a node, it may leave the message "Hangcheck: hangcheck is restarting the machine" in /var/log/messages.
- If you see the message "Hangcheck: hangcheck value past margin!" in /var/log/messages, a reboot was required but was not performed because hangcheck_reboot was not set to 1. In that case, reload the hangcheck-timer module as described earlier in this note, with hangcheck_reboot set to 1.
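A log scan for these two messages can be sketched as follows. The message strings are the ones quoted above; the function name, the result labels, and the sample log lines (timestamps and host name) are invented for illustration.

```python
RESTART_MSG = "Hangcheck: hangcheck is restarting the machine"
PAST_MARGIN_MSG = "Hangcheck: hangcheck value past margin!"


def classify_hangcheck_lines(lines):
    """Return (label, line) pairs for hangcheck messages found in log lines."""
    findings = []
    for line in lines:
        if RESTART_MSG in line:
            # The module initiated a node restart.
            findings.append(("reboot-initiated", line.rstrip()))
        elif PAST_MARGIN_MSG in line:
            # A hang was detected but hangcheck_reboot was not set to 1.
            findings.append(("reboot-skipped", line.rstrip()))
    return findings


if __name__ == "__main__":
    # Hypothetical sample lines; real entries come from /var/log/messages.
    sample = [
        "Jul 29 10:00:01 node1 kernel: Hangcheck: hangcheck value past margin!\n",
        "Jul 29 10:05:00 node1 sshd[123]: session opened\n",
        "Jul 29 11:00:01 node1 kernel: Hangcheck: hangcheck is restarting the machine\n",
    ]
    for label, line in classify_hangcheck_lines(sample):
        print(label, "|", line)
```

To scan a real log, pass an open file object for /var/log/messages (typically readable only by root) in place of the sample list.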
Known Issues
- Bug:6125546 can prevent hangcheck-timer from rebooting the node on RHEL4 (fixed in 2.6.9.56 or RHEL4.6)
References
- Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions
- Changes in Oracle Clusterware on Linux with the 10.2.0.4 Patchset
http://dbdev.us.oracle.com/twiki/bin/view/Cluster/IOFencingHangcheckOprocd
Source: ITPUB blog, link: http://blog.itpub.net/23135684/viewspace-706767/. If you repost this article, please credit the source; otherwise legal action may be pursued.