Reference:
1. Installing Oracle Database 10g with Real Application Cluster (RAC) on Red Hat Enterprise Linux Advanced Server 3
2. Installing Oracle 9i Real Application Cluster (RAC) on Red Hat Linux Advanced Server 2.1
3. Oracle Official Document B10766-04
4. Metalink Notes:
   Note 240575.1  RAC on Linux Best Practices
   Note 184821.1  Step-By-Step Installation of 9.2.0.5 RAC on Linux

5. Hardware and Network Requirements:
6. Software Requirements:

Installation Steps

1. Configuring the Network

1.1 Setting up /etc/hosts on each node:

vi /etc/hosts
---------------------------
# Public hostnames, e.g. eth0 interfaces (public network)
123.45.67.86    test1    # RAC node 1
123.45.67.87    test2    # RAC node 2

# Private hostnames, private network, e.g. eth1 interfaces (Interconnect)
192.168.0.1     rac1     # RAC node 1
192.168.0.2     rac2     # RAC node 2
-------------------------------

1.2 Configuring the Network Interfaces (NICs) (Important)
To configure the network interfaces (in this example eth0 and eth1), run the following command on each node:

su - root
redhat-config-network

NOTE: You do not have to configure the public VIP addresses on the NICs here. That will be done later by Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA).

NOTE: When the network configuration is done, make sure that the public hostname of the RAC node is returned when you execute the following command:

$ hostname
test1

You can verify the newly configured NICs by running:

/sbin/ifconfig
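Before moving on, it is worth confirming that each node can reach every other node over both the public and the private network. A minimal check, assuming the hostnames from the /etc/hosts example above:

# Run on each node; every host should answer.
for host in test1 test2 rac1 rac2; do
    ping -c 2 $host
done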
2. Creating Oracle User Accounts and Setting Oracle Environments

On each node:

su - root
groupadd -g 700 dba       # group of users to be granted the SYSDBA system privilege
groupadd -g 701 oinstall  # group owner of Oracle files
useradd -c "Oracle software owner" -u 700 -g oinstall -G dba oracle
passwd oracle

To verify the oracle account, enter the following command:

# id oracle
uid=700(oracle) gid=701(oinstall) groups=701(oinstall),700(dba)

Set the Oracle environment for the oracle user. Note the quoted "EOF", which prevents the shell from expanding the variables while the lines are written; they should only be expanded later, when .bash_profile is sourced:

su - oracle
cat >> ~oracle/.bash_profile << "EOF"
export ORACLE_BASE=/orabase
export ORACLE_SID=orcl1   # Each RAC node must have a unique Oracle SID (e.g. orcl1 on test1, orcl2 on test2)!
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export PATH=$PATH:$ORACLE_HOME/bin
EOF

3. Configuring Shared NFS Devices (Important)

On each node:

su - root
mkdir /oradata
chmod 777 /oradata

On rac1:

vi /etc/exports
/oradata rac2(rw,no_root_squash)

On rac2:

vi /etc/fstab
rac1:/oradata /oradata nfs hard,intr,vers=3,proto=udp,suid,noac 0 0

On rac1:

dd if=/dev/zero of=/oradata/voting.disk bs=1M count=20
dd if=/dev/zero of=/oradata/ors.disk bs=1M count=100

On each node:

chown root:dba /oradata/voting.disk
chmod 664 /oradata/voting.disk
chown oracle:dba /oradata/ors.disk
chmod 664 /oradata/ors.disk

4. Configuring the "hangcheck-timer" Kernel Module

Oracle uses the Linux kernel module hangcheck-timer to monitor the health of the cluster and to reset a RAC node in case of failures. The hangcheck-timer module uses a kernel-based timer to periodically check the system task scheduler; this timer resets the node when the system hangs or pauses. The module uses the Time Stamp Counter (TSC) CPU register, a counter that is incremented at each clock signal. The TSC offers very accurate time measurements since this register is updated by the hardware automatically.

The hangcheck-timer module now comes with the kernel:

find /lib/modules -name "hangcheck-timer.o"

The hangcheck-timer module has the following two parameters:

hangcheck_tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

hangcheck_margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds; Oracle recommends keeping it at 180 seconds.

These two parameters indicate how long a RAC node must hang before the hangcheck-timer module will reset the system. A node reset will occur when the following is true:

system hang time > (hangcheck_tick + hangcheck_margin)

To load the module with the right parameter settings, add the following line to the /etc/modules.conf file:

# su - root
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modules.conf

Now you can run modprobe to load the module with the parameters configured in /etc/modules.conf:

# su - root
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Jul  5 00:46:09 test1 kernel: Hangcheck: starting hangcheck timer 0.8.0 (tick is 30 seconds, margin is 180 seconds).
Jul  5 00:46:09 test1 kernel: Hangcheck: Using TSC.

NOTE: You do not have to run modprobe after each reboot. The hangcheck-timer module will be loaded automatically by Oracle when needed.
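Before continuing, it can help to confirm on each node that the shared files are visible with the expected ownership and that the hangcheck-timer module is loaded. A minimal check, using the paths and names from the steps above:

# Run as root on each node.
ls -l /oradata/voting.disk /oradata/ors.disk   # both files visible, correct owner and mode
/sbin/lsmod | grep hangcheck                   # hangcheck_timer should be listed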
5. Setting up RAC Nodes for Remote Access (Important)

When you run the Oracle Installer on a RAC node, it will use ssh to copy Oracle software and data to the other RAC nodes. Therefore, the oracle user on the RAC node where the Oracle Installer is launched must be able to log in to the other RAC nodes without having to provide a password or passphrase. The following procedure shows how ssh can be configured so that no password is requested for oracle ssh logins.

To create an authentication key for oracle, enter the following command on all RAC nodes (the ~/.ssh directory will be created automatically if it doesn't exist yet):

su - oracle
$ ssh-keygen -t dsa -b 1024
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):   <Press ENTER>
Created directory '/home/oracle/.ssh'.
Enter passphrase (empty for no passphrase):   <Enter a passphrase>
Enter same passphrase again:                  <Enter the same passphrase>
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
e0:71:b1:5b:31:b8:46:d3:a9:ae:df:6a:70:98:26:82

Copy the public key for oracle from each RAC node to all other RAC nodes. For example, run the following commands on all RAC nodes:

su - oracle
ssh test1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh test2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Now verify that oracle on each RAC node can log in to all other RAC nodes without a password. Make sure that ssh only asks for the passphrase. Note, however, that the first time you ssh to another server you will get a message stating that the authenticity of the host cannot be established; enter "yes" at the prompt to continue the connection. For example, run the following commands on all RAC nodes to verify that no password is asked for:

su - oracle
ssh test1 hostname
ssh rac1 hostname
ssh test2 hostname
ssh rac2 hostname

Later, before runInstaller is launched, I will show how ssh can be configured so that no passphrase has to be entered for oracle ssh logins.

6. Checking Packages (RPMs)

Some packages will be missing if you selected the Installation Type "Advanced Server" during the Red Hat Advanced Server installation. The following additional RPMs are required:

rpm -q gcc glibc-devel glibc-headers glibc-kernheaders cpp compat-libstdc++

To install these RPMs, run:

su - root
rpm -ivh gcc-3.2.3-24.i386.rpm glibc-devel-2.3.2-95.6.i386.rpm glibc-headers-2.3.2-95.6.i386.rpm \
         glibc-kernheaders-2.4-8.34.i386.rpm cpp-3.2.3-24.i386.rpm compat-libstdc++-7.3-2.96.123.i386.rpm

The openmotif RPM is also required, otherwise you won't pass Oracle's recommended operating system packages check. If it's not installed on your system, run:

su - root
rpm -ivh openmotif-2.2.2-16.i386.rpm

I recommend using the latest RPM versions.
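A quick way to confirm on each node that none of the required packages are missing is a small loop over rpm -q, using the package names from this section (openmotif included):

# Run on each node; any package reported as "not installed" must be added.
for pkg in gcc glibc-devel glibc-headers glibc-kernheaders cpp compat-libstdc++ openmotif; do
    rpm -q $pkg
done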
7. Adjusting Network Settings

Oracle now uses UDP as the default protocol on Linux for interprocess communication, such as Cache Fusion buffer transfers between the instances. It is strongly suggested to adjust the default and maximum send buffer size (SO_SNDBUF socket option) and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256 KB. The receive buffers are used by TCP and UDP to hold received data until it is read by the application. For TCP the receive buffer cannot overflow, because the peer is not allowed to send data beyond the advertised window. UDP has no such flow control: datagrams that do not fit in the socket receive buffer are simply discarded, so a sender can overwhelm the receiver.

The default and maximum buffer sizes can be changed in the proc file system without a reboot:

su - root
sysctl -w net.core.rmem_default=262144  # Default setting in bytes of the socket receive buffer
sysctl -w net.core.wmem_default=262144  # Default setting in bytes of the socket send buffer
sysctl -w net.core.rmem_max=262144      # Maximum socket receive buffer size which may be set with the SO_RCVBUF socket option
sysctl -w net.core.wmem_max=262144      # Maximum socket send buffer size which may be set with the SO_SNDBUF socket option

To make the change permanent, add the following lines to the /etc/sysctl.conf file, which is read during the boot process:

net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144

8. Check other OS-related configuration, such as:

tmp space
Swap space
Shared memory
Semaphores
File handles

Please read the Oracle official document (B10766-04) if you need more detailed information.

9. Installing Oracle CRS (Cluster Ready Services)

In order to install Cluster Ready Services (CRS) R1 (10.1.0.2) on all RAC nodes, OUI has to be launched on only one RAC node. In my example I will always run OUI on rac1.

To install CRS, insert the "Cluster Ready Services (CRS) R1 (10.1.0.2)" CD (downloaded image name: "ship.crs.cpio.gz") and mount it:

su - root
mount /mnt/cdrom
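The "Automating Authentication for oracle ssh Logins" preparation referenced below is not included in this excerpt. As a rough sketch of what it usually involves (standard OpenSSH commands, but treat the exact flow as an assumption), start an ssh-agent in the terminal that will launch the installer and add the oracle key once, so that the passphrase is not requested again:

# As oracle, in the terminal that will run runInstaller:
exec /usr/bin/ssh-agent $SHELL
/usr/bin/ssh-add              # enter the DSA key passphrase once

# Verify: none of these should prompt for a password or passphrase.
ssh test1 hostname
ssh test2 hostname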
Use the oracle terminal that you prepared for ssh at "Automating Authentication for oracle ssh Logins" and execute runInstaller:

oracle$ /mnt/cdrom/runInstaller

- Welcome Screen: Click Next
- Inventory directory and credentials: Click Next
- Unix Group Name: Use "oinstall".
- Root Script Window: Open another window, log in as root, and run /tmp/orainstRoot.sh on the node where you launched runInstaller. After you have run the script, click Continue.
- File Locations: I used the recommended default values:
      Destination Name: OraCr10g_home1
      Destination Path: /orabase/product/10.1.0/crs_1
  Click Next
- Language Selection: Click Next
- Cluster Configuration:
      Cluster Name: crs
      Cluster Nodes:
          Public Node Name: test1    Private Node Name: rac1
          Public Node Name: test2    Private Node Name: rac2
  Click Next
- Private Interconnect Enforcement:
      Interface Name: eth0    Subnet: 123.45.67.86    Interface Type: Public
      Interface Name: eth1    Subnet: 192.168.0.1     Interface Type: Private
  Click Next
- Oracle Cluster Registry:
      OCR Location: /oradata/orcl/OCRFile
  Click Next
- Voting Disk:
      Voting disk file name: /oradata/orcl/CSSFile
  Click Next
- Root Script Window: Open another window, log in as root, and execute /orabase/product/oraInventory/orainstRoot.sh on ALL RAC nodes!
  NOTE: For some reason Oracle does not create the log directory "/orabase/product/10.1.0/crs_1/log". If there are problems with CRS, it will write log files to this directory, but only if the directory exists. Therefore make sure to create this directory as oracle:
      oracle$ mkdir /orabase/product/10.1.0/crs_1/log
  After you have run the script, click Continue.
- Setup Privileges Script Window: Open another window, log in as root, and execute /orabase/product/10.1.0/crs_1/root.sh on ALL RAC nodes, one by one! Note that this can take a while. On the last RAC node, the output of the script was as follows:
      ...
      CSS is active on these nodes.
        test1
        test2
      CSS is active on all nodes.
      Oracle CRS stack installed and running under init(1M)
  Click OK
- Summary: Click Install
- When the installation is completed, click Exit.
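In addition to the olsnodes check below, it can be reassuring to confirm that the CRS daemons are actually running on each node. A minimal sketch, assuming the CRS home used above:

# Run on each node:
ps -ef | egrep 'crsd|cssd|evmd' | grep -v grep   # the CRS daemons should be running
/orabase/product/10.1.0/crs_1/bin/crs_stat       # lists registered resources (may report none until the database and node applications are created)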
One way to verify the CRS installation is to display all the nodes where CRS was installed:

oracle$ /orabase/product/10.1.0/crs_1/bin/olsnodes -n

The result should be:

test1   1
test2   2

10. Installing Oracle Database 10g with CRS (don't create a database at this time)

Reference:

In order to install the Oracle Database 10g R1 (10.1.0.2) software with Real Application Clusters (RAC) on all RAC nodes, OUI has to be launched on only one RAC node. In my example I will run OUI on test1.

To install the RAC database software, insert the Oracle Database 10g R1 (10.1.0.2) CD (downloaded image name: "ship.db.cpio.gz") and mount it:

su - root
mount /mnt/cdrom

Use the oracle terminal that you prepared for ssh at "Automating Authentication for oracle ssh Logins", and execute runInstaller:

oracle$ /mnt/cdrom/runInstaller
- Welcome Screen: Click Next
- File Locations: I used the default values:
      Destination Name: OraDb10g_home1
      Destination Path: /orabase/product/10.1.0/db_1
  Click Next.
- Hardware Cluster Installation Mode:
      Select "Cluster Installation"
      Click "Select All" to select all servers: test1, test2
  Click Next
  NOTE: If it stops here and the status of a RAC node is "Node not reachable", perform the following checks:
      - Check whether the node where you launched OUI can ssh without a passphrase to the RAC node whose status is "Node not reachable".
      - Check whether CRS is running on that RAC node.
- Installation Type: I selected "Enterprise Edition". Click Next.
- Product-specific Prerequisite Checks: Make sure that the status of each check is "Succeeded". Click Next
- Database Configuration: I selected "Do not create a starter database", since we will create the database later with dbca. Note that the Oracle Database 10g R1 (10.1) OUI will not be able to discover disks that are marked as Linux ASMLib disks.
  Click Next
- Summary: Click Install
- Setup Privileges Window: Open another window, log in as root, and execute /orabase/product/10.1.0/db_1/root.sh on ALL RAC nodes, one by one!
  NOTE: Also make sure that the X display is redirected to your local desktop, since this script launches the "VIP Configuration Assistant" (VIPCA), a GUI-based utility.

  VIP Configuration Assistant Tool:
  Notes:
      1. According to Metalink, the virtual IP addresses should be static addresses.
      2. We encountered some new bugs at this point; set the parameter LD_ASSUME_KERNEL=2.4.19 first.
      3. You may encounter an issue that prevents vipca from starting. The workaround is to modify "vipca.sh" or to launch it as follows:
             ./vipca.sh silent=false
  (This assistant comes up only once, when root.sh is executed for the first time in your RAC cluster.)

  - Welcome: Click Next
  - Network Interfaces: I selected both interfaces, eth0 and eth1. Click Next
  - Virtual IPs for cluster nodes: (for the alias names and IP addresses, see "Setting Up the /etc/hosts File")
        Node Name: test1
        IP Alias Name: test1-vip
        IP address: 123.45.67.86
        Subnet Mask: xxx.xxx.xxx.xxx (modify it to match your environment)
        Node Name: test2
        IP Alias Name: test2-vip
        IP address: 123.45.67.87
        Subnet Mask: xxx.xxx.xxx.xxx (modify it to match your environment)
    NOTE: Each VIP must be an unused address on the public subnet, distinct from the node's public IP; treat the addresses shown here as placeholders for your environment.
    Click Next
  - Summary: Click Finish
  - Configuration Assistant Progress Dialog: Click OK after the configuration is complete.
  - Configuration Results: Click Exit
  Click OK to close the Setup Privileges Window.
- End of Installation: Click Exit

If OUI terminates abnormally (this happened to me several times), or if anything else goes wrong, remove the following directory and start over again:

su - oracle
rm -rf /orabase/product/10.1.0/db_1
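Once root.sh and VIPCA have completed on all nodes, the node applications (VIP, GSD, ONS) should be registered with CRS. A quick way to confirm this is srvctl; a sketch assuming the Oracle home and node names used above:

# As oracle, with the environment from ~/.bash_profile loaded:
srvctl status nodeapps -n test1
srvctl status nodeapps -n test2

# The VIP should also show up as an alias (e.g. eth0:1) on the public interface:
/sbin/ifconfig -a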
11. Creating RAC Databases with DBCA

Reference:

To create the RAC database and the instances on all RAC nodes, dbca has to be launched on only one RAC node. In my example I will run dbca on test1.

Use the oracle terminal that you prepared for ssh at "Automating Authentication for oracle ssh Logins", and execute dbca. Before you execute dbca, make sure that $ORACLE_HOME and $PATH are set:

oracle$ . ~oracle/.bash_profile
oracle$ dbca

- Welcome Screen: Select "Oracle Real Application Clusters database". Click Next
- Operations: Select "Create Database". Click Next
- Node Selection: Click "Select All". Make sure all your RAC nodes show up and are selected!
  If dbca hangs here, you probably did not follow the steps outlined at "Automating Authentication for oracle ssh Logins".
  Click Next
- Database Templates: I selected "General Purpose". Click Next
- Database Identification:
      Global Database Name: orcl
      SID Prefix: orcl
  Click Next
- Management Option: I selected "Use Database Control for Database Management". Click Next
- Database Credentials: I selected "Use the Same Password for All Accounts". Enter a password, and make sure the password does not start with a digit. Click Next
- Storage Options: I selected "File System". Click Next
- Database File Locations: Select "Use Oracle BASE". Click Next
- Recovery Configuration: Using recovery options such as a Flash Recovery Area is out of scope for this article, so I did not select any recovery options. Click Next
- Database Content: I did not select Sample Schemas or Custom Scripts. Click Next
- Database Services: Click "Add" and enter a Service Name: I entered "orcltest" and selected the TAF Policy "Basic". Click Next
- Initialization Parameters: Change settings as needed. Click Next
- Database Storage: Change settings as needed. Click Next
- Creation Options: Check "Create Database". Click Finish
- Summary: Click OK

Now the database is being created. The following error message came up: Unable to copy the file "test2:/etc/oratab" to "/tmp/oratab.test2". I clicked "Ignore"; I have to investigate this.

Your RAC cluster should now be up and running. To verify, try to connect to each instance from one of the RAC nodes:

$ sqlplus
$ sqlplus
After you have connected to an instance, enter the following SQL command to verify your connection:

SQL> select instance_name from v$instance;
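The sqlplus connect strings above were lost from this copy of the article. As a hedged example of what such a check can look like (the SYSTEM password and the net service names orcl1/orcl2 created by dbca are placeholders for your own values), querying gv$instance from either instance should list both instances:

$ sqlplus system/<password>@orcl1
SQL> select instance_name, host_name, status from gv$instance;
SQL> exit

$ sqlplus system/<password>@orcl2
SQL> select instance_name from v$instance;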
12. Configure parameter files (init.ora, listener.ora, tnsnames.ora):    # Added by Leo

Step 1: Create a new initialization parameter file "init.ora" on host test1; this file will be shared by each instance.

1. Invoke SQL*Plus and issue the following command:

   SQL> create pfile='/oradata/orcl/init.ora' from spfile;

2. Open "init.ora" and add the following lines:

   orcl1.local_listener=listener_orcl1
   orcl2.local_listener=listener_orcl2
   orcl1.remote_listener=listeners_orcl
   orcl2.remote_listener=listeners_orcl

   Remove this line (the * prefix means the parameter is used by every instance):

   *.remote_listener=listeners_orcl

   Save the modification.

3. Invoke SQL*Plus again (to generate a new spfile based on the edited init.ora):

   SQL> shutdown immediate;
   SQL> exit

   SQL> startup pfile='/oradata/orcl/init.ora'
   SQL> create spfile from pfile='/oradata/orcl/init.ora'
   SQL> shutdown immediate;
   SQL> exit

   SQL> startup      (the new spfile will be used this time)
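After the final startup it may be worth confirming that each instance picked up the listener parameters from the new spfile. A small sketch of such a check, run on each node:

$ sqlplus "/ as sysdba"
SQL> show parameter local_listener
SQL> show parameter remote_listener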
Step 2: Edit "listener.ora" on each node (location: $ORACLE_HOME/network/admin)

Example:

listener =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(host = 123.45.67.86)(port = 1522))
    (ADDRESS = (PROTOCOL = TCP)(host = your-vip-addr)(port = 1521))
  )

sid_list_listener =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /orabase/product/10.1.0/db_1)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (SID_NAME = orcl1)
      (ORACLE_HOME = /orabase/product/10.1.0/db_1)
    )
  )

Notes:
. If you don't change the listener's default port (1521), it is unnecessary to add SID entries to this file; the Oracle background process PMON registers all available services with the listener automatically (about every 60 seconds).
. If you use the listener's default port (1521), it is also unnecessary to configure the parameter 'instance_name.local_listener=xxx' in "init.ora", and you should not add this entry to "tnsnames.ora":

  local_listener_name = ...

Step 3: Edit "tnsnames.ora" on each node (location: $ORACLE_HOME/network/admin)
Example:

listener_orcl1 = (address = (protocol = tcp)(host = 123.45.67.86)(port = 1522))
listener_orcl2 = (address = (protocol = tcp)(host = 123.45.67.87)(port = 1522))

listeners_orcl =
  (address_list =
    (address = (protocol = tcp)(host = test1-vip)(port = 1521))
    (address = (protocol = tcp)(host = test2-vip)(port = 1521))
  )

# database connection: orcl
orcl =
  (description =
    (load_balance = on)
    (address = (protocol = tcp)(host = test1-vip)(port = 1521))
    (address = (protocol = tcp)(host = test2-vip)(port = 1521))
    (connect_data =
      (service_name = orcl)
    )
  )
# instance connection: orcl1
orcl1 =
  (description =
    (address = (protocol = tcp)(host = test1-vip)(port = 1521))
    (address = (protocol = tcp)(host = test2-vip)(port = 1521))
    (load_balance = yes)
    (connect_data =
      (server = dedicated)
      (service_name = orcl)
      (instance_name = orcl1)
    )
  )
# instance connection: orcl2
orcl2 =
  (description =
    (address = (protocol = tcp)(host = test1-vip)(port = 1521))
    (address = (protocol = tcp)(host = test2-vip)(port = 1521))
    (load_balance = yes)
    (connect_data =
      (server = dedicated)
      (service_name = orcl)
      (instance_name = orcl2)
      (failover_mode =
        (type = select)
        (method = basic)
        (retries = 180)
        (delay = 5)
      )
    )
  )

# TAF policy of BASIC
ORCLTEST_RAC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = test1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = test2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )
# external procedure entry: extproc
extproc_connection_orcl =
  (description =
    (address_list =
      (address = (protocol = ipc)(key = extproc0))
    )
    (connect_data =
      (sid = PLSExtProc)
      (presentation = RO)
    )
  )
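With listener.ora and tnsnames.ora in place on both nodes, a short sanity check of the Net configuration can be run from either node. A sketch, assuming the aliases defined above and a placeholder password:

$ lsnrctl status listener        # both ADDRESS entries and the orcl service should appear
$ tnsping orcl                   # the load-balanced alias should resolve
$ sqlplus system/<password>@orcl
SQL> select instance_name from v$instance;

Repeating the connection a few times should show the sessions being spread across orcl1 and orcl2 when load_balance is on.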
13. Configure the EM environment for RAC:    # Added by Leo

Steps:

1. Usually we only need to run DBCA again and check that the following things exist:
   . Check whether the schema "SYSMAN" has been created (select username from all_users;).
   . Check whether the tablespace "SYSAUX" exists (select tablespace_name from dba_tablespaces;).
   . Check whether a hostname_sid directory has been created under $ORACLE_HOME.
   . Issue this command to check dbconsole's status:
        emctl start|stop|status dbconsole

   Troubleshooting:
   Q: The target's information is wrong or lost.
   A: . Go to the directory $ORACLE_HOME/hostname_sid/emd and open "targets.xml" to check whether all available targets have been discovered by EM.
      . If the file doesn't exist, you can try the following:
           emctl stop dbconsole
        remove all files under these directories:
           $ORACLE_HOME/hostname_sid/emd/upload
           $ORACLE_HOME/hostname_sid/emd/emd
        check the targets.xml file and make sure that at least the agent and host entries can be found in it, then:
           emctl start dbconsole

2. If you have modified some EM settings and the repository has already been created, you can execute "emca -r ..." to configure the EM repository.
3. Launch the Database Control console in a browser on all nodes to check it.
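To find the console URL and confirm that dbconsole is up, something along these lines can be used on each node (the HTTP port is environment-specific; $ORACLE_HOME/install/portlist.ini usually records it, but treat that location as an assumption):

$ emctl status dbconsole
$ cat $ORACLE_HOME/install/portlist.ini    # look for the Enterprise Manager Console HTTP port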
Oracle 10g RAC Issues, Problems and Errors

This section describes other issues, problems, and errors pertaining to installing Oracle 10g with RAC that have not been covered so far.
Gtk-WARNING **: libgdk_pixbuf.so.2: cannot open shared object file: No such file or directory
This error can come up when you run ocfstool. To fix this error, install the gdk-pixbuf RPM:

rpm -ivh gdk-pixbuf-0.18.0-8.1.i386.rpm

/orabase/product/10.1.0/crs_1/bin/crs_stat.bin: error while loading shared libraries: libstdc++-libc6.2-2.so.3: cannot open shared object file: No such file or directory
PRKR-1061 : Failed to run remote command to get node configuration for node test1pup
These errors can come up when you run root.sh. To fix them, install the compat-libstdc++ RPM and rerun root.sh:

rpm -ivh compat-libstdc++-7.3-2.96.122.i386.rpm

mount: fs type ocfs not supported by kernel
The OCFS kernel module was not loaded. See "Configuring and Loading OCFS" for more information.

ORA-00603: ORACLE server session terminated by fatal error
or
SQL> startup nomount
ORA-29702: error occurred in Cluster Group Service operation

If the trace file looks like this:

/orabase/product/10.1.0/db_1/rdbms/log/orcl1_ora_7424.trc
...
kgefec: fatal error 0
*** 2004-03-13 20:50:28.201
ksedmp: internal or fatal error
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:gethostbyname failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: sskgxpmyip4
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+493         call     ksedst()+0           0 ? 0 ? 0 ? 1 ? 0 ? 0 ?
ksfdmp()+14          call     ksedmp()+0           3 ? BFFF783C ? A483593 ?
                                                   BF305C0 ? 3 ? BFFF8310 ?

then make sure that the name of the RAC node is not listed for the loopback address in the /etc/hosts file, similar to this example:

127.0.0.1    test1 localhost.localdomain localhost

The entry should rather look like this:

127.0.0.1    localhost.localdomain localhost