[Repost] How to Recover from a Failed root.sh on 11.2 Grid Infrastructure
From 10g Clusterware to the 11g Release 2 Grid Infrastructure, Oracle has packed a great deal into the RAC stack. Even when you follow the Step by Step Installation guide carefully while installing 11.2.0.1 GI, errors of one kind or another still crop up when you actually run the root.sh script. Here is one example:
[root@rh3 grid]# ./root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
    ORACLE_OWNER= maclean
    ORACLE_HOME=  /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2011-03-28 20:43:13: Parsing the host name
2011-03-28 20:43:13: Checking for super user privileges
2011-03-28 20:43:13: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-5-5.0.2
On one node, root.sh claims that ADVM/ACFS is not supported on OEL 5.5, yet Red Hat 5 and OEL 5 are in fact among the few platforms that do support ACFS ("The ACFS install would be on a supported Linux release – either Oracle Enterprise Linux 5 or Red Hat 5").
Searching Metalink shows this is a known issue on the Linux platform.
Because this "not supported" message did not appear when root.sh was run on the other node (also Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)), identifying the differences between the two nodes should generally be enough to solve the problem:
Release-related RPM packages on the node that did NOT fail:

[maclean@rh6 tmp]$ cat /etc/issue
Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
Kernel \r on an \m
[maclean@rh6 tmp]$ rpm -qa|grep release
enterprise-release-notes-5Server-17
enterprise-release-5-0.0.22

Release-related RPM packages on the failing node:

[root@rh3 tmp]# rpm -qa | grep release
oraclelinux-release-5-5.0.2
enterprise-release-5-0.0.22
enterprise-release-notes-5Server-17
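The comparison above can be scripted. A minimal sketch, using sample lists that mirror the rpm output shown here (in practice you would capture `rpm -qa | grep release` from each node into the two files; the /tmp file names are illustrative):

```shell
# Recreate the two nodes' release-RPM lists; contents mirror the output above.
printf '%s\n' \
    enterprise-release-5-0.0.22 \
    enterprise-release-notes-5Server-17 \
    oraclelinux-release-5-5.0.2 | sort > /tmp/rh3_release.txt    # failing node
printf '%s\n' \
    enterprise-release-5-0.0.22 \
    enterprise-release-notes-5Server-17 | sort > /tmp/rh6_release.txt   # healthy node

# comm -23 prints lines present only in the first file, i.e. packages
# installed on the failing node but not on the healthy one.
comm -23 /tmp/rh3_release.txt /tmp/rh6_release.txt
# → oraclelinux-release-5-5.0.2
```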
As shown above, compared with the healthy node, the failing node has one extra RPM installed, named oraclelinux-release-5-5.0.2. Let's try uninstalling that RPM to see whether it solves the problem. (As a side note, the issue can also be fixed by changing the content of the /tmp/.linux_release file to enterprise-release-5-0.0.17, without uninstalling the oraclelinux-release-5* package as we do here.)
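The alternative workaround just mentioned can be sketched as follows (run as root on the failing node; the file path and release string are the ones given in the text):

```shell
# Instead of removing the oraclelinux-release RPM, overwrite the release
# string cached in /tmp/.linux_release, which the ADVM/ACFS support check
# reads, with the value from the text.
echo "enterprise-release-5-0.0.17" > /tmp/.linux_release
cat /tmp/.linux_release   # → enterprise-release-5-0.0.17
```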
[root@rh3 install]# rpm -e oraclelinux-release-5-5.0.2
[root@rh3 grid]# ./root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
    ORACLE_OWNER= maclean
    ORACLE_HOME=  /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2011-03-28 20:57:21: Parsing the host name
2011-03-28 20:57:21: Checking for super user privileges
2011-03-28 20:57:21: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
CRS is already configured on this node for crshome=0
Cannot configure two CRS instances on the same cluster.
Please deconfigure before proceeding with the configuration of new home.
Running root.sh again on the failed node, we are told that we must deconfigure first before configuring again. Following the official documentation:
/* These steps also administer Grid Infrastructure, so they must still be run as root */
[root@rh3 grid]# pwd
/u01/app/11.2.0/grid
/* currently in the GRID_HOME directory */
[root@rh3 grid]# cd crs/install
/* run the script named rootcrs.pl with the -deconfig option */
[root@rh3 install]# ./rootcrs.pl -deconfig
2011-03-28 21:03:05: Parsing the host name
2011-03-28 21:03:05: Checking for super user privileges
2011-03-28 21:03:05: User has super user privileges
Using configuration parameter file: ./crsconfig_params
VIP exists.:rh3
VIP exists.: //192.168.1.105/255.255.255.0/eth0
VIP exists.:rh6
VIP exists.: //192.168.1.103/255.255.255.0/eth0
GSD exists.
ONS daemon exists. Local port 6100, remote port 6200
eONS daemon exists. Multicast port 20796, multicast IP address 234.227.83.81, listening port 2016
Please confirm that you intend to remove the VIPs rh3 (y/[n]) y
ACFS-9200: Supported
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rh3'
CRS-2673: Attempting to stop 'ora.crsd' on 'rh3'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rh3'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rh3'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rh3' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rh3' has completed
CRS-2677: Stop of 'ora.crsd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rh3'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rh3'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rh3'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rh3'
CRS-2673: Attempting to stop 'ora.evmd' on 'rh3'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rh3' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rh3'
CRS-2677: Stop of 'ora.cssd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rh3'
CRS-2673: Attempting to stop 'ora.gipcd' on 'rh3'
CRS-2677: Stop of 'ora.gipcd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'rh3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rh3' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node
/* if the deconfig above fails to deconfigure successfully, run the rootcrs.pl script with the -force option */
[root@rh3 install]# ./rootcrs.pl -deconfig -force
2011-03-28 21:41:00: Parsing the host name
2011-03-28 21:41:00: Checking for super user privileges
2011-03-28 21:41:00: User has super user privileges
Using configuration parameter file: ./crsconfig_params
VIP exists.:rh3
VIP exists.: //192.168.1.105/255.255.255.0/eth0
VIP exists.:rh6
VIP exists.: //192.168.1.103/255.255.255.0/eth0
GSD exists.
ONS daemon exists. Local port 6100, remote port 6200
eONS daemon exists. Multicast port 20796, multicast IP address 234.227.83.81, listening port 2016
ACFS-9200: Supported
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rh3'
CRS-2673: Attempting to stop 'ora.crsd' on 'rh3'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rh3'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rh3'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rh3' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rh3' has completed
CRS-2677: Stop of 'ora.crsd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rh3'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rh3'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rh3'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rh3'
CRS-2673: Attempting to stop 'ora.evmd' on 'rh3'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rh3' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rh3'
CRS-2677: Stop of 'ora.cssd' on 'rh3' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rh3'
CRS-2673: Attempting to stop 'ora.gipcd' on 'rh3'
CRS-2677: Stop of 'ora.gipcd' on 'rh3' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'rh3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rh3' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node
/* Fortunately this trick always works; otherwise we would have to completely uninstall and reinstall GI every time */
With the CRS deconfiguration successfully completed, we can try running the much-troubled root.sh once more:
[root@rh3 grid]# pwd
/u01/app/11.2.0/grid
[root@rh3 grid]# ./root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
    ORACLE_OWNER= maclean
    ORACLE_HOME=  /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n) [n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2011-03-28 21:07:29: Parsing the host name
2011-03-28 21:07:29: Checking for super user privileges
2011-03-28 21:07:29: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
FATAL: Module oracleoks not found.
FATAL: Module oracleadvm not found.
FATAL: Module oracleacfs not found.
acfsroot: ACFS-9121: Failed to detect /dev/asm/.asm_ctl_spec.
acfsroot: ACFS-9310: ADVM/ACFS installation failed.
acfsroot: ACFS-9311: not all components were detected after the installation.
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rh6, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rh3'
CRS-2676: Start of 'ora.mdnsd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rh3'
CRS-2676: Start of 'ora.gipcd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rh3'
CRS-2676: Start of 'ora.gpnpd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rh3'
CRS-2676: Start of 'ora.cssdmonitor' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rh3'
CRS-2672: Attempting to start 'ora.diskmon' on 'rh3'
CRS-2676: Start of 'ora.diskmon' on 'rh3' succeeded
CRS-2676: Start of 'ora.cssd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rh3'
CRS-2676: Start of 'ora.ctssd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rh3'
CRS-2676: Start of 'ora.crsd' on 'rh3' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rh3'
CRS-2676: Start of 'ora.evmd' on 'rh3' succeeded
/u01/app/11.2.0/grid/bin/srvctl start vip -i rh3 ... failed
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... failed
Updating inventory properties for clusterware
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 5023 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /s01/oraInventory
'UpdateNodeList' was successful.
Although the "ADVM/ACFS is not supported" problem is now bypassed, we hit "FATAL: Module oracleoks/oracleadvm/oracleacfs not found" instead: the ACFS/ADVM kernel modules cannot be found on Linux. A Metalink search shows these correspond to two confirmed issues in GI 11.2.0.2, yet what I actually installed is the 11.2.0.1 GI... Fortunately, my current environment uses NFS storage, so problems with storage options like ADVM/ACFS can be safely ignored here; I plan to test this again on 11.2.0.2.
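Before deciding whether the ADVM/ACFS failures matter in your environment, it can help to check whether the three kernel modules named in the FATAL messages are visible to the kernel at all. A minimal sketch (the helper function is my own, not part of any Oracle tool):

```shell
# Report whether each ACFS/ADVM kernel module named in the FATAL messages
# above is known to the running kernel; reports "NOT found" when modinfo
# cannot locate it (or modinfo itself is unavailable).
check_mod() {
    if modinfo "$1" >/dev/null 2>&1; then
        echo "$1: available"
    else
        echo "$1: NOT found"
    fi
}

for m in oracleoks oracleadvm oracleacfs; do
    check_mod "$m"
done
```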
It has to be said that the 11.2.0.1 GI installation has so many problems that Oracle Support ended up writing quite a few troubleshooting documents for it. I have not yet tried the 11.2.0.2 GI; I hope it is not as rough as its predecessor!
From the "ITPUB Blog", link: http://blog.itpub.net/7191998/viewspace-772226/. Please credit the source when reposting; legal liability will otherwise be pursued.