RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris)_811280.1

Posted by rongshiyuan on 2014-11-09

RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris) (Doc ID 811280.1)


In this Document

Purpose
Scope
Details
  RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)
  RAC Platform Specific Best Practices and Starter Kits
  RAC on Solaris Step by Step Installation Instructions
  RAC on Solaris Best Practices
  OS Configuration Considerations
  Virtualization Considerations
  Storage Considerations
  Network Considerations
  Hardware/Vendor Specific Considerations
  Oracle Software Considerations
  Community Discussions
References

Applies to:

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 12.1.0.1 [Release 10.2 to 12.1]
Oracle Solaris on SPARC (64-bit)

Purpose


The goal of the Oracle Real Application Clusters (RAC) series of Best Practice and Starter Kit notes is to provide customers with quick knowledge transfer of generic and platform-specific best practices for implementing, upgrading, and maintaining an Oracle RAC system. This document is compiled and maintained based on Oracle's experience with its global RAC customer base.

This Starter Kit is not meant to replace or supplant the Oracle Documentation set, but rather, it is meant as a supplement to the same. It is imperative that the Oracle Documentation be read, understood, and referenced to provide answers to any questions that may not be clearly addressed by this Starter Kit.

All recommendations should be carefully reviewed by your own operations group and should only be implemented if the potential gain as measured against the associated risk warrants implementation. Risk assessments can only be made with a detailed knowledge of the system, application, and business environment.

As every customer environment is unique, the success of any Oracle Database implementation, including implementations of Oracle RAC, is predicated on a successful test environment. It is thus imperative that any recommendations from this Starter Kit are thoroughly tested and validated using a testing environment that is a replica of the target production environment before being implemented in the production environment to ensure that there is no negative impact associated with the recommendations that are made.

Scope

This article applies to all new and existing RAC implementations as well as RAC upgrades.

Details

RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)

The following document covers RAC and Oracle Clusterware Best Practices that apply to all platforms. It includes a white paper on available RAC System Load Testing Tools and RAC System Test Plan outlines for 10gR2, 11gR1, and 11gR2:

Document 810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)

 

RAC Platform Specific Best Practices and Starter Kits

The following notes contain detailed platform specific best practices including Step-By-Step installation cookbooks (downloadable in PDF format):

Document 811306.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Linux)
Document 811280.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris)
Document 811271.1 RAC and Oracle Clusterware Best Practices and Starter Kit (Windows)
Document 811293.1 RAC and Oracle Clusterware Best Practices and Starter Kit (AIX)
Document 811303.1 RAC and Oracle Clusterware Best Practices and Starter Kit (HP-UX)

 

RAC on Solaris Step by Step Installation Instructions

Step By Step guide for installing Oracle RAC 10gR2 on Solaris (linked from the original note).
Step By Step guide for installing Oracle RAC 11gR1 on Solaris (linked from the original note).
Step By Step guide for installing Oracle RAC 11gR2 on Solaris (linked from the original note).

 

RAC on Solaris Best Practices

The Best Practices in this section are specific to the Solaris Platform. That said, it is essential that the Platform Independent Best Practices found in Document 810394.1 be reviewed in addition to the content provided in this Document.

OS Configuration Considerations

  • Validate your hardware/software configuration against the RAC Technologies Matrix for Unix.

  • To avoid the issues associated with DISM (Dynamic Intimate Shared Memory), it is highly recommended to disable DISM on ALL instances that have an SGA larger than 4GB.  See Document 1606318.1 for details.

  • 11gR2 requires Solaris 10 update 6 or greater. Reference Document 971464.1.

  • Installations of 11.2.0.2 Grid Infrastructure on Solaris 10 Update 10 may fail when loading the ADVM driver. Proactive action can be taken by installing the fix for Solaris CR 7075118 prior to installation. This issue ONLY impacts 11.2.0.2 installations on Solaris 10 Update 10; GI 11.2.0.2.4 and above include a fix (Bug# 12614853) that addresses the issue within GI. Additional information can be found in Document 1346207.1.

  • Ensure all required OS packages are installed and system prerequisites have been properly implemented for your particular release of Oracle.  This information is documented in Document 169706.1 as well as the install guides for your particular release.

  • In pre-11gR2 clusters, system times must be synchronized across cluster nodes using NTPD, and NTPD should be configured to slew time to prevent false reboots.  Configure the NTP client as per Document 759143.1 to take corrective action on this issue.

  • With 11gR2, the Cluster Time Synchronization Daemon (CTSSD) can be used in place of NTPD. CTSSD synchronizes time with a reference node in the cluster when NTPD is not found to be configured. If you require synchronization from an external time source, you must use NTPD, which causes CTSSD to run in "observer" mode. However, if NTP is running, it must be configured with the slewing option as documented in the Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Oracle Solaris - Section 2.13 Network Time Protocol Setting.

  • Ensure SunOS 5.10 kernel patch 127111-03 or higher is installed in order to address a condition with a deadlock in the Solaris 10 paging algorithm. Reference Document 460424.1.

  • There is an interoperability issue between Solaris 10's memory management and Oracle's Automatic PGA Memory Management in a Solaris 10 environment with a large number of CPUs. The issue causes CPU spikes at random intervals and increases system CPU time.  Corrective action can be found in Document 460424.1.

  • If the OS panics with "segspt_free_pages: bad large page" when stopping Oracle Clusterware, refer to Document 1392254.1 for details.

  • If explorer output is requested by SUN Support, DO NOT execute with the "-w default,all" arguments on a node where Oracle Clusterware is running.  This will cause a node eviction.  Instead execute with the following arguments: "-w !proc".  See Document 1487321.1 (RAC on Solaris: Node crashes after running explorer -w all) for additional information.

  • Ensure that the Solaris sun4v Deadman Panic issue is proactively addressed by installing the appropriate patches for the given Solaris SPARC release, as documented in Document 1411516.1 (Solaris 10 and Solaris 11 Systems on sun4v Platforms May Hang or Encounter a Deadman Panic).
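
Two of the checks above (Solaris 10 Update 6 or greater for 11gR2, and kernel patch 127111-03 or higher) can be sketched as a small script. This is only an illustrative sketch, not an Oracle-supplied tool. It assumes that the /etc/release build tag looks like "s10s_u6wos_07b" and that patch listings use "Patch: NNNNNN-NN" lines as printed by showrev -p; the function names are hypothetical. It also ignores later patches that obsolete 127111, so a negative result still needs manual review.

```python
import re

# Thresholds taken from the bullets above.
MIN_SOLARIS10_UPDATE_FOR_11GR2 = 6
REQUIRED_KERNEL_PATCH = "127111-03"


def solaris10_update(release_text):
    """Return the Solaris 10 update number found in /etc/release text, or None.

    Assumes a build tag such as 's10s_u6wos_07b' (SPARC) or 's10x_u6wos_07b'
    (x86). Other layouts are not handled by this sketch.
    """
    match = re.search(r"s10[sx]_u(\d+)", release_text)
    return int(match.group(1)) if match else None


def parse_patch(patch_id):
    """Split a patch ID such as '127111-03' into (base, revision) integers."""
    base, revision = patch_id.strip().split("-")
    return int(base), int(revision)


def installed_patches(showrev_output):
    """Extract patch IDs from 'Patch: NNNNNN-NN ...' lines (assumed layout)."""
    return re.findall(r"^Patch:\s+(\d+-\d+)", showrev_output, flags=re.MULTILINE)


def patch_satisfied(installed_ids, required_id):
    """True if an installed patch has the same base and an equal or higher revision.

    Patches that obsolete the required one are NOT recognized here.
    """
    req_base, req_rev = parse_patch(required_id)
    for patch_id in installed_ids:
        base, rev = parse_patch(patch_id)
        if base == req_base and rev >= req_rev:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sample inputs; on a real node these would come from
    # /etc/release and from the output of 'showrev -p'.
    sample_release = "Solaris 10 10/08 s10s_u6wos_07b SPARC"
    sample_showrev = "Patch: 127111-11 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu\n"

    update = solaris10_update(sample_release)
    update_ok = update is not None and update >= MIN_SOLARIS10_UPDATE_FOR_11GR2
    patch_ok = patch_satisfied(installed_patches(sample_showrev), REQUIRED_KERNEL_PATCH)
    print("Solaris 10 update level sufficient for 11gR2:", update_ok)
    print("Kernel patch", REQUIRED_KERNEL_PATCH, "or higher present:", patch_ok)
```

The same patch_satisfied helper could be reused for other patch minimums named in this note, such as 118833-30.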

Virtualization Considerations

Storage Considerations

  • Skip the first 1 MB when creating a raw device to avoid overwriting the disk VTOC.  See Document 367715.1 for details.

  • When using NFS, ensure that the correct mount options are used. The proper mount options for NFS are defined in Document 359515.1.

  • Use Solaris MPxIO or other third-party I/O multipathing software to protect against HBA or SAN switch failures.

  • Make sure the Solaris SAN Foundation Kit (SFK) patches are installed for Solaris 8 and Solaris 9. Reference Document 392639.1.
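
The 1 MB skip mentioned above can be reasoned about in terms of disk geometry. The following is a minimal sketch of that arithmetic, assuming 512-byte sectors and slices that start on cylinder boundaries. The geometry values and function names are hypothetical, and the actual procedure is the one in Document 367715.1.

```python
SKIP_BYTES = 1024 * 1024  # the 1 MB to leave untouched at the start of the disk
SECTOR_BYTES = 512        # assumed sector size


def cylinder_bytes(heads, sectors_per_track):
    """Size of one cylinder in bytes for the given (assumed) geometry."""
    return heads * sectors_per_track * SECTOR_BYTES


def first_safe_cylinder(heads, sectors_per_track, skip_bytes=SKIP_BYTES):
    """Return the lowest cylinder whose starting byte offset is >= skip_bytes."""
    size = cylinder_bytes(heads, sectors_per_track)
    # Integer ceiling division avoids floating-point rounding.
    return (skip_bytes + size - 1) // size


if __name__ == "__main__":
    # Hypothetical geometries, for illustration only.
    for heads, sectors in ((255, 63), (16, 32)):
        cylinder = first_safe_cylinder(heads, sectors)
        offset = cylinder * cylinder_bytes(heads, sectors)
        print(f"{heads} heads x {sectors} sectors: start the slice at "
              f"cylinder {cylinder} (byte offset {offset})")
```

With large cylinders the first safe cylinder is simply 1; with small cylinders several cylinders have to be skipped to clear the first 1 MB.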

Network Considerations

  • For 11.2.0.2 (GI and RDBMS) and above, it is highly recommended that Oracle Redundant Interconnect/HAIP be used for interconnect redundancy.  See Document 1210883.1 for details.

  • When using Sun Cluster, Oracle Clusterware 11.2.0.2 and above will detect the presence of clprivnet0 and not enable any HAIPs. The Redundant Interconnect Usage feature is thereby disabled. clprivnet0 will provide the required availability for redundant network interfaces used for the interconnect.

  • For pre-11.2.0.2 environments use IPMP NIC redundancy for the private interconnect.  Configuration details can be found in the following notes:
    •  For 11.2.0.1 see Document 1069584.1 - Solaris IPMP and Trunking for the cluster interconnect in Oracle Grid Infrastructure 11g Rel. 2
    •  For pre-11gR2 see Document 368464.1 - How to Setup IPMP as Cluster Interconnect

  • If public network failover is required, IPMP may be used. Configuration details can be found in Document 730732.1.  Oracle recommends enabling "local VIP failover" by adding all viable public networks (which must reside on the same subnet as the VIPs) to the network resource.

  • The Solaris OS default settings for udp_recv_hiwat and udp_xmit_hiwat are too low. For heavy cluster interconnect traffic, Oracle recommends raising both kernel parameters to at least 65536 to improve UDP throughput.  See the Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Solaris Operating System (the same applies to pre-11gR2 installations).

  • For Solaris versions earlier than Solaris 10, tune the sq_max_size to protect against dropped packets on the interconnect.  See Document 1004755.1 for details.

  • Add "set ce:ce_taskq_disable=1" to /etc/system if the ce network interface is used for the cluster interconnect, to prevent frequent node reboots. Reference Document 437420.1.

  • The E1000 NIC can cause a kernel panic or CSS packet corruption on Solaris 10; patch 118833-30 or later is required on the SPARC platform.

  • Jumbo frames with MTU 9000 may cause problems with HP NC7170 drivers.
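
The UDP buffer recommendation above (at least 65536 for both udp_xmit_hiwat and udp_recv_hiwat) lends itself to a simple check. The sketch below compares values that have already been read from the system (for example with ndd) against that minimum and builds the matching "ndd -set" commands. The helper names and the sample readings are hypothetical, and a parameter missing from the input is treated as undersized.

```python
RECOMMENDED_MINIMUM = 65536
UDP_PARAMETERS = ("udp_xmit_hiwat", "udp_recv_hiwat")


def undersized(current):
    """Return the UDP parameters whose current value is below the minimum.

    'current' maps parameter name to its integer value; a missing parameter
    is reported as undersized so that it gets reviewed.
    """
    return [name for name in UDP_PARAMETERS
            if current.get(name, 0) < RECOMMENDED_MINIMUM]


def ndd_commands(current):
    """Build 'ndd -set' commands for every undersized parameter."""
    return [f"ndd -set /dev/udp {name} {RECOMMENDED_MINIMUM}"
            for name in undersized(current)]


if __name__ == "__main__":
    # Hypothetical readings, e.g. gathered with 'ndd /dev/udp udp_xmit_hiwat'.
    readings = {"udp_xmit_hiwat": 57344, "udp_recv_hiwat": 57344}
    for command in ndd_commands(readings):
        print(command)
```

Values set with ndd do not survive a reboot, so the generated commands would also need to be placed in a startup script to make the change permanent.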

Hardware/Vendor Specific Considerations

  • Ensure minimum BIOS version 2.35.3.3 is used for SUN V40Z DUAL CORE machines, for ECC memory checking.

  • Ensure that the SUN V40Z 2.6V memory management voltage regulator issues are addressed. A SUN CE can identify whether the voltage regulator is beginning to fail. The VRM (Voltage Regulator Module) board was revised from rev 1.0 to rev 2.0.

  • SUN T-Series machines use a 'threading model' that makes it look as if there are more CPUs than the machine actually has. On these CoolThreads machines (the example here is a machine with 32 cores), it appears to Oracle that there are 8 times as many CPUs (8 threads per core), so CPU_COUNT is automatically set to 256 and the DB will not open. The workaround is to manually set CPU_COUNT to a reasonable value (e.g. 16 or 32, depending on the number of actual cores).

  • Ensure the network cards don't use the shared PCI-X bus slots 2 and 3 on a SUN V40Z.
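
The T-Series CPU_COUNT arithmetic described above can be written out as a short sketch. It reproduces the example from the text (32 cores with 8 threads per core appear as 256 CPUs) and bases the suggested CPU_COUNT on the number of actual cores. The function names are hypothetical, and the right value for a given system remains a judgment call.

```python
def visible_cpus(cores, threads_per_core):
    """Number of CPUs the operating system presents to Oracle."""
    return cores * threads_per_core


def suggested_cpu_count(cores):
    """Workaround from the text: size CPU_COUNT from the actual core count."""
    return cores


def parameter_line(cpu_count):
    """Initialization-parameter line that sets CPU_COUNT explicitly."""
    return f"cpu_count={cpu_count}"


if __name__ == "__main__":
    cores, threads = 32, 8  # the example machine from the text
    print("CPUs seen by Oracle:", visible_cpus(cores, threads))
    print("Suggested setting:", parameter_line(suggested_cpu_count(cores)))
```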

Oracle Software Considerations

The Software Considerations in this section are specific to the Solaris Platform. That said, it is highly recommended that the Platform Independent Best Practices found in Document 810394.1 be reviewed.

  • For 11.2.0.1 where IPMP is used for public and/or cluster interconnect, critical merge Patch 10094017 should be applied to both Grid Infrastructure and RDBMS Oracle homes.  See Document 1069254.1 for details.

  • If using IPMP for the cluster interconnect in an 11.2.0.2 environment, be sure to take corrective action on Bug 10357258 (Many HAIP created after active NIC fails in IPMP) by applying Patch 12666373 (includes the 11.2.0.2.2 GI PSU).  This issue is fixed in 11.2.0.2.3 (GI PSU3).

  • For 11.2.0.2, when using Jumbo Frames for the interconnect, HAIP may not start with the proper MTU size; see Document 1290585.1.  It is recommended to apply Patch 12666373 (includes the 11.2.0.2.2 GI PSU), which corrects the following issues (in addition to the 11.2.0.2.2 fixes):
    • Bug 10357258 - [IPMP] HUNDREDS OF DUP IP AFTER INTRA-NODE FAILOVER 
    • Bug 9795321 - MTU SIZE FOR VIP UNDER 11GR2 GRID INFRASTRUCTURE
      Note: Bug 10357258 is corrected in GI PSU3 (11.2.0.2.3). Bug 9795321 is not corrected in 11.2.0.2.3 at the time of this writing (the workaround is to disable Jumbo Frames).  Both bugs are fixed in the upcoming 11.2.0.2.4 GI PSU.
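
The fix levels stated above for the two 11.2.0.2 bugs can be captured in a small lookup. The sketch below takes a GI version string and reports which of the two bugs are still open according to this note: Bug 10357258 is fixed in 11.2.0.2.3, Bug 9795321 in 11.2.0.2.4, and Patch 12666373 covers both. The function names are hypothetical, and the table reflects only what this note says, not a full patch inventory.

```python
# PSU level at which each bug is fixed, as stated in this note.
FIXED_IN = {
    "10357258": (11, 2, 0, 2, 3),
    "9795321": (11, 2, 0, 2, 4),
}


def parse_version(text):
    """Turn a version string such as '11.2.0.2.3' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def unfixed_bugs(gi_version, has_patch_12666373=False):
    """Return the bug numbers from FIXED_IN that are still open.

    Only 11.2.0.2.x versions are in scope; anything else raises ValueError
    because this note makes no statement about those releases.
    """
    version = parse_version(gi_version)
    if version[:4] != (11, 2, 0, 2):
        raise ValueError("this lookup only covers 11.2.0.2.x")
    if has_patch_12666373:
        return []  # the merge patch corrects both bugs
    return sorted(bug for bug, fixed in FIXED_IN.items() if version < fixed)


if __name__ == "__main__":
    for level in ("11.2.0.2.2", "11.2.0.2.3", "11.2.0.2.4"):
        print(level, "open bugs:", unfixed_bugs(level))
```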

 

Community Discussions



References

NOTE:810394.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)
NOTE:811271.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (Windows)
NOTE:759143.1 - NTP leap second event causing Oracle Clusterware node reboot
NOTE:811280.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (Solaris)
NOTE:811293.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (AIX)
NOTE:811306.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (Linux)
NOTE:1060645.1 - Solaris: 11.2.0.1 root.sh Fails to Create OCR
NOTE:971464.1 - FAQ - 11gR2 requires Solaris 10 update 6 or greater
NOTE:551704.1 - Linux OS Service 'ntpd'
NOTE:460424.1 - Solaris 10 memory management conflicts with Automatic PGA Memory Management

BUG:9795321 - MTU SIZE FOR VIP UNDER 11GR2 GRID INFRASTRUCTURE
NOTE:169706.1 - Oracle Database (RDBMS) on Unix AIX,HP-UX,Linux,Mac OS X,Solaris,Tru64 Unix Operating Systems Installation and Configuration Requirements Quick Reference (8.0.5 to 11.2)
NOTE:1004755.1 - Solaris[TM] Operating System: Tuning the sq_max_size Parameter
NOTE:1069254.1 - Solaris: 11gR2 VIP / SCAN VIP and Dependent Resources Offline after Active Public NIC in IPMP Group Fails
NOTE:1069584.1 - Solaris IPMP and Trunking for the cluster interconnect in Oracle Grid Infrastructure 11g Rel. 2
NOTE:359515.1 - Mount Options for Oracle files when used with NFS on NAS devices
NOTE:367715.1 - Failed To Upgrade Oracle Cluster Registry Configuration When Running Root.Sh
NOTE:368464.1 - How to Setup IPMP as Cluster Interconnect
NOTE:392639.1 - Where To Find The Required Patches For 10gR2 On Solaris 9 When Using A SAN And MPxIO
NOTE:437420.1 - Frequently Node Reboots, There Is Nothing In The Logs
NOTE:811303.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (HP-UX)
NOTE:1210883.1 - Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip
NOTE:1290585.1 - Solaris: Wrong MTU Size for VIP or HAIP
NOTE:1346207.1 - 11.2.0.2 Grid Infrastructure root.sh or rootupgrade.sh Fails on Solaris 10 Update 10
NOTE:1392254.1 - Solaris Panics with "segspt_free_pages: bad large page"
NOTE:730732.1 - How to Configure Solaris Link-based IPMP for Oracle VIP
NOTE:1478482.1 - ASM DISKGROUP WITH ORA-27063 and SVR4 ERROR: 5: I/O ERROR ON SUN SOLARIS
 

Document Details

Type: BULLETIN
Status: PUBLISHED
Last Major Update: 2014-6-16
Last Update: 2014-6-16
Language: English
     
 


 

Source: "ITPUB Blog", link: http://blog.itpub.net/17252115/viewspace-1326290/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
