Configure the hangcheck-timer Kernel Module
Perform the following configuration procedures on all nodes in the cluster!
Oracle9i Release 1 (9.0.1) and Oracle9i Release 2 (9.2.0.1) used a userspace watchdog daemon called watchdogd to monitor the health of the cluster and to restart a RAC node in case of a failure. Starting with Oracle9i Release 2 (9.2.0.2) (and still available in Oracle 10g Release 2), the watchdog daemon has been replaced by a Linux kernel module named hangcheck-timer, which addresses availability and reliability problems much better. The hangcheck-timer module is loaded into the Linux kernel and checks whether the system hangs: it sets a timer and checks that timer after a certain amount of time. If a configurable threshold is exceeded, the module reboots the machine. Although the hangcheck-timer module is not required for Oracle Clusterware (Cluster Manager) operation, it is highly recommended by Oracle.
The hangcheck-timer.ko Module
The hangcheck-timer module uses a kernel-based timer that periodically checks the system task scheduler to catch delays in order to determine the health of the system. If the system hangs or pauses, the timer resets the node. The hangcheck-timer module uses the Time Stamp Counter (TSC) CPU register, which is incremented at each clock signal. The TSC offers much more accurate time measurements because this register is updated by the hardware automatically.
Much more information about the hangcheck-timer project can be found .
Installing the hangcheck-timer.ko Module
The hangcheck-timer module was originally shipped only by Oracle; however, it is now included with Red Hat Linux, starting with kernel version 2.4.9-e.12. If you followed the steps in the section "Obtain & Install New Linux Kernel / FireWire Modules", then the hangcheck-timer module is already included for you. Use the following to confirm:
# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.9-11.0.0.10.3.EL/kernel/drivers/char/hangcheck-timer.ko
/lib/modules/2.6.9-22.EL/kernel/drivers/char/hangcheck-timer.ko
In the above output, we care about the hangcheck-timer object (hangcheck-timer.ko) in the /lib/modules/2.6.9-11.0.0.10.3.EL/kernel/drivers/char directory.
Configuring and Loading the hangcheck-timer Module
There are two key parameters to the hangcheck-timer module:
- hangcheck_tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
- hangcheck_margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds; Oracle recommends keeping it at 180 seconds.
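The reset condition stated just below this list depends on the sum of the two parameters. As a minimal sketch of that arithmetic, the function below (reset_threshold is a hypothetical name, not part of the module or of any Oracle tooling) prints the number of seconds a node would have to hang before a reset is expected, using the values quoted above:

```shell
#!/bin/sh
# Hypothetical helper: prints hangcheck_tick + hangcheck_margin, i.e. the
# hang duration in seconds beyond which a node reset is expected.
reset_threshold() {
    tick=$1
    margin=$2
    echo $((tick + margin))
}

reset_threshold 30 180   # Oracle-recommended values: prints 210
reset_threshold 60 180   # default values quoted above: prints 240
```

With the recommended settings, a hang therefore has to last longer than 210 seconds before the node is reset.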
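Together, the two parameters determine how long a RAC node must hang before it is reset. A node reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
Configuring Hangcheck Kernel Module Parameters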
Each time the hangcheck-timer kernel module is loaded (manually or by Oracle), it needs to know what value to use for each of the two parameters we just discussed (hangcheck_tick and hangcheck_margin). These values need to be available after each reboot of the Linux server. To do that, add an entry with the correct values to the /etc/modprobe.conf file as follows:
# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf
Each time the hangcheck-timer kernel module gets loaded, it will use the values defined by the entry I made in the /etc/modprobe.conf file.
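Because >> appends unconditionally, re-running the echo command on a node that is already configured leaves a duplicate options line in the file. One possible way to guard against that is sketched below; add_hangcheck_options is a hypothetical helper name, and the file path is taken as an argument so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper: append the hangcheck-timer options line to a modprobe
# configuration file only if no such entry is present yet.
add_hangcheck_options() {
    conf=${1:-/etc/modprobe.conf}
    line="options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180"
    if grep -q '^options hangcheck-timer ' "$conf" 2>/dev/null; then
        echo "entry already present in $conf"
    else
        echo "$line" >> "$conf"
        echo "entry added to $conf"
    fi
}

# Example (run as root on each node):
#   add_hangcheck_options /etc/modprobe.conf
```

The check only looks for an existing options line for the module; it does not compare the values, so an entry with different settings would still need to be edited by hand.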
Manually Loading the Hangcheck Kernel Module for Testing
Oracle is responsible for loading the hangcheck-timer kernel module when required. For that reason, it is not necessary to perform a modprobe or insmod of the hangcheck-timer kernel module in any of the startup files (e.g. /etc/rc.local).
It is only out of pure habit that I continue to include a modprobe of the hangcheck-timer kernel module in the /etc/rc.local file. Someday I will get over it, but realize that it does not hurt to include a modprobe of the hangcheck-timer kernel module during startup.
So to keep myself sane and able to sleep at night, I always configure the loading of the hangcheck-timer kernel module on each startup as follows:
# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local
(Note: You don't have to manually load the hangcheck-timer kernel module using modprobe or insmod after each reboot. The hangcheck-timer module will be loaded by Oracle automatically when needed.)
Now, to test the hangcheck-timer kernel module to verify it is picking up the correct parameters we defined in the /etc/modprobe.conf file, use the modprobe command. Although you could load the hangcheck-timer kernel module by passing it the appropriate parameters (e.g. insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180), we want to verify that it is picking up the options we set in the /etc/modprobe.conf file.
To manually load the hangcheck-timer kernel module and verify it is using the correct values defined in the /etc/modprobe.conf file, run the following command:
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Sep 27 23:11:51 linux2 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds)
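Reading the log line by eye works for one node; when repeating the check across all nodes it may help to script it. The sketch below assumes the start-up message has the same wording as the sample line above. check_hangcheck_log is a hypothetical helper name, and the log path and expected values are arguments with defaults taken from this article.

```shell
#!/bin/sh
# Hypothetical helper: find the most recent hangcheck-timer start-up message
# in a log file and compare the reported tick/margin with the expected values.
check_hangcheck_log() {
    log=${1:-/var/log/messages}
    want_tick=${2:-30}
    want_margin=${3:-180}
    last=$(grep 'Hangcheck: starting hangcheck timer' "$log" | tail -1)
    if [ -z "$last" ]; then
        echo "no Hangcheck start message found in $log"
        return 2
    fi
    # Pull the numbers out of "(tick is N seconds, margin is M seconds)".
    tick=$(echo "$last" | sed -n 's/.*tick is \([0-9][0-9]*\) seconds.*/\1/p')
    margin=$(echo "$last" | sed -n 's/.*margin is \([0-9][0-9]*\) seconds.*/\1/p')
    if [ "$tick" = "$want_tick" ] && [ "$margin" = "$want_margin" ]; then
        echo "OK: tick=$tick margin=$margin"
    else
        echo "MISMATCH: tick=$tick margin=$margin (expected $want_tick/$want_margin)"
        return 1
    fi
}

# Example (after modprobe hangcheck-timer):
#   check_hangcheck_log /var/log/messages 30 180
```

A non-zero return value signals either a missing message or values that differ from what was set in /etc/modprobe.conf.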
Source: "ITPUB Blog", link: http://blog.itpub.net/10130206/viewspace-1036186/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.