Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 8)

Posted by dbhelper on 2015-01-22

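The host environment is essentially identical to the one described in the earlier articles on installing Oracle 11.1.0.6 RAC on Solaris 10. The main difference is that there is no Volume Cluster Manager this time, so I plan to use Oracle ASM instead. Since the installation steps themselves are no different, I chose silent mode for this RAC installation.

This part describes the problems encountered during the installation.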

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 1): http://yangtingkun.itpub.net/post/468/477442

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 2): http://yangtingkun.itpub.net/post/468/477443

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 3): http://yangtingkun.itpub.net/post/468/477444

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 4): http://yangtingkun.itpub.net/post/468/477446

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 5): http://yangtingkun.itpub.net/post/468/477447

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 6): http://yangtingkun.itpub.net/post/468/477448

Installing Oracle 11.1.0.6 RAC on Solaris 10 in Silent Mode (Part 7): http://yangtingkun.itpub.net/post/468/477600

 

 

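Although I have done both silent installations and RAC 11g installations more than once, this was my first RAC installation in silent mode, and also my first silent installation of ASM, so running into some problems was inevitable.

The first error was really caused by carelessness. The silent-mode Cluster installation failed with the following error message: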

$ ./runInstaller -silent -noconfig -responseFile /data/cluster/response/my_crs.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 180 MB.   Actual 57590 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 57802 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2008-08-29_01-52-22PM. Please wait ...$ Oracle Universal Installer, Version 11.1.0.6.0 Production
Copyright (C) 1999, 2007, Oracle. All rights reserved.

OUI-10203:The specified response file '/data/cluster/response/my_crs.rsp' is not found. Make sure that the response file specified exists and you have read privileges to this file.

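A search on Metalink showed that the problem was caused by an incorrect setting of the response file parameter FROM_LOCATION. The parameter used a relative path, and I had assumed the default value would be correct, so I did not verify it.

After setting it to the correct absolute path and rerunning the installation, the problem was solved.

The second error was also caused by carelessness, again during the Cluster installation. The error was: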

OUI-10155:Error while setting variable sl_tableList: The following node names are invalid, as they do not resolve to a valid IP address:

ser2-vip.

Checking the response file parameter SL_TABLELIST revealed nothing abnormal. The mistake turned out to be in the /etc/hosts file:

127.0.0.1       localhost
172.0.2.62      ser1    ser1.   loghost
172.0.2.63      ser2
172.0.2.68      ser1-vip
172.0.2.69      serv2-vip
10.0.2.1        ser1-priv
10.0.2.2        ser2-priv

Here ser2-vip was mistyped as serv2-vip, which caused the error. The file was corrected to:

127.0.0.1       localhost
172.0.2.62      ser1    ser1.   loghost
172.0.2.63      ser2
172.0.2.68      ser1-vip
172.0.2.69      ser2-vip
10.0.2.1        ser1-priv
10.0.2.2        ser2-priv

After checking all network settings on both nodes and confirming they were correct, I reran runInstaller and the problem was solved.
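A check along the following lines could catch this kind of typo before the installer is started. It is only a sketch: the function name check_hosts is made up for illustration, it looks only at a hosts-format file (not DNS or other name services), and it assumes a POSIX shell and an awk that supports -v (on Solaris 10 that typically means /usr/xpg4/bin/sh and nawk rather than the defaults).

```shell
# check_hosts FILE NAME...   (hypothetical helper, not part of any Oracle tool)
# Prints every NAME that does not appear as a host name or alias in FILE.
# Returns non-zero if at least one name is missing.
check_hosts() {
    hosts_file=$1
    shift
    missing=0
    for name in "$@"; do
        # Strip comments, then compare each name field (2..NF) exactly.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit(found ? 0 : 1) }' "$hosts_file"; then
            echo "unresolved: $name"
            missing=1
        fi
    done
    return $missing
}
```

For the cluster above it could be called as: check_hosts /etc/hosts ser1 ser2 ser1-vip ser2-vip ser1-priv ser2-priv. Against the mistyped file it would report ser2-vip as unresolved.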

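The third problem is a follow-on from the second. Because the uninstall had not been set to delete the installation directory, the directory was not empty, and the following error appeared: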

SEVERE:OUI-10029:You have specified a non-empty directory to install this product. It is recommended to specify either an empty or a non-existent directory. You may, however, choose to ignore this message if the directory contains Operating System generated files or subdirectories like lost+found.

This is easy to fix: just clear the directory manually.

The fourth problem appeared when running the /data/oracle/product/11.1/crs/root.sh script. The error was:

root@ser1 # . /data/oracle/product/11.1/crs/root.sh
WARNING: directory '/data/oracle/product/11.1' is not owned by root
WARNING: directory '/data/oracle/product' is not owned by root
WARNING: directory '/data/oracle' is not owned by root
WARNING: directory '/data' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up Network socket directories
Failed to upgrade Oracle Cluster Registry configuration

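After a long investigation, it turned out to be an error I had run into before: Oracle does not accept the "s0" partition. For details see: http://yangtingkun.itpub.net/post/468/272473

In short: by default Oracle will not use the s0 partition. If an s0 partition is specified as the OCR or voting disk, running root.sh produces the same error message: Failed to upgrade Oracle Cluster Registry configuration

The original configuration had set the OCR to /dev/rdsk/emcpower0a, which corresponds to exactly the s0 partition, and that caused this error.

So the only option was to uninstall the whole Cluster completely and reassign the raw devices for the OCR and voting disk. The reinstallation then went through without any problems.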
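To avoid repeating this, candidate device paths could be screened before they go into the response file. The sketch below is an assumption-laden illustration: is_slice0 is a hypothetical name, and the matching rule assumes that slice 0 appears either as a trailing s0 (as in cXtYdZs0 names) or as a trailing a on an emcpower pseudo-device, which is the mapping seen for emcpower0a above.

```shell
# is_slice0 PATH   (hypothetical helper)
# Succeeds if PATH looks like a slice-0 device name.
is_slice0() {
    case "$1" in
        *s0|*emcpower[0-9]*a) return 0 ;;   # e.g. c1t0d0s0, emcpower0a
        *) return 1 ;;
    esac
}

# Example: warn about any slice-0 device in a list of candidates.
for dev in /dev/rdsk/emcpower0a /dev/rdsk/emcpower0g; do
    if is_slice0 "$dev"; then
        echo "warning: $dev looks like slice 0"
    fi
done
```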

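The fifth problem was yet another repeat of an earlier mistake. I had clearly run into it before, yet made it again. For details see http://yangtingkun.itpub.net/post/468/407375

When the software installation finishes, root.sh must be run, but this script has two bugs. First, when invoking ./root.sh, there must be no space between the '.' and the '/', otherwise the script falls into an infinite loop. Second, when silent mode is used to create the database, the OUI_SILENT parameter inside root.sh is set incorrectly and has to be changed manually to FALSE before the script is run. Unfortunately I hit both problems again this time; hopefully I will not make the same mistakes the next time I install 11g.

The sixth problem appeared while creating ASM in silent mode: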

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations
 on the following nodes: [ser1", "ser2].

Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1", "ser2] automatically?  If you would like to c
onfigure the listener with different properties, run NetCA before continuing.
Listener creation failed with error: Failed to create a profile for listener "[LISTENER_SER1"]" on node "ser1"", "PRKC-1056 : Failed
 to get the hostname for node ser1"
PRKH-1001 : HASContext Internal Error
  [OCR Error(Native: getHostName:[21])]"..

It is strongly recommended to run NetCA to configure listeners before continuing.  Do you want to continue with the operation?
The operation will be stopped. Re-run DBCA after successfully running NetCA.

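The error message clearly shows that Oracle failed to parse the node names. It turned out that I had forgotten the braces when entering the node list parameter. After changing the parameter to NODELIST={"ser1","ser2"} and installing again, the problem persisted, except that this time the error message also included the braces.

The DBCA response file settings differ somewhat from the OUI response file settings. After changing the parameter to NODELIST=ser1,ser2 and trying again, the error disappeared.

The seventh error was also related to the ASM configuration. The following problem appeared while configuring ASM: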

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent1.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent1.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations
 on the following nodes: [ser1, ser2].

Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1, ser2] automatically?  If you would like to con
figure the listener with different properties, run NetCA before continuing.
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/rdsk/emcpower0g' matches no disks
ORA-15025: could not open disk '/dev/rdsk/emcpower0g'
ORA-15056: additional error message

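Since there was nothing wrong with the path, it looked like a permissions problem, so I changed the ownership of the raw device as root: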

root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx   1 root     root          33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw
root@ser1 # chown oracle:oinstall /dev/rdsk/emcpower0g
root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx   1 root     root          33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw

Although the listed ownership of the raw device did not change after the chown, the error reported by the ASM configuration did change:

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent2.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent2.log
Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2.  Ensure that the listener is running on
this node and the ASM instance is registered to the listener.  Received the following error:

 ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2.  Ensure that the listener is running on
this node and the ASM instance is registered to the listener.  Received the following error:

 ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

It seems the same ownership change also has to be made on node 2:

root@ser2 # chown oracle:oinstall emcpower0g

After rerunning dbca, the error was resolved.
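A note on why the ls -l output looked unchanged after the chown: /dev/rdsk/emcpower0g is a symbolic link, ls -l reports the owner of the link itself, and chown without -h changes the file the link points to. Adding -L to ls shows the target instead. The helper below is a small sketch of that check; device_owner is a made-up name, and it would have to be run on each node separately.

```shell
# device_owner PATH   (hypothetical helper)
# Prints "owner:group" of the file that PATH ultimately refers to,
# following symbolic links via ls -L.
device_owner() {
    ls -lL "$1" | awk '{ print $3 ":" $4 }'
}
```

For example, device_owner /dev/rdsk/emcpower0g on ser1 and on ser2 should both print oracle:oinstall once the ownership change has been applied on both nodes.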

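The eighth problem appeared while creating the database. Because the password for ASM had not been set, the installation prompted for it, but as soon as Enter was pressed the script stopped: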

$ dbca -silent -responseFile /data/database/response/my_db.rsp
Enter ASM SYS user password:
 
$

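After adding ASM_SYS_PASSWORD as one of the parameters and rerunning, this problem disappeared.

This time, however, the run produced no error message at all: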

$ dbca -silent -responseFile /data/database/response/my_db.rsp

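The command simply ended, with no message indicating either failure or success.

The only option was to look in the directory where dbca stores its logs: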

$ cd /data/oracle/cfgtoollogs/dbca/
$ ls -l
total 50
-rw-r-----   1 oracle   oinstall     840 Sep  2 15:33 silent.log
-rw-r-----   1 oracle   oinstall     854 Sep  2 15:40 silent0.log
-rw-r-----   1 oracle   oinstall     581 Sep  2 15:46 silent1.log
-rw-r-----   1 oracle   oinstall     697 Sep  2 15:51 silent2.log
-rw-r-----   1 oracle   oinstall       1 Sep  2 16:06 silent3.log
-rw-r-----   1 oracle   oinstall   19714 Sep  2 17:47 trace.log_OraDbHome1
$ more trace.log_OraDbHome1
[main] [17:47:42:597] [CommandLineArguments.process:639]  CommandLineArguments->process: number of arguments = 3
[main] [17:47:42:605] [CommandLineArguments.loadNodeinfo:3885]  CommandLineArguments:loadNodeinfo: length of m_nodeinfo = 2
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894]  CommandLineArguments->loadNodeinfo: Node is {"ser1"
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894]  CommandLineArguments->loadNodeinfo: Node is "ser2"}
[main] [17:47:42:609] [CommandLineArguments.validateArguments:3372]  CommandLineArguments->process: in Operation Type is Creation/Ge
nerateScripts Mode condition
[main] [17:47:42:609] [OracleHome.hasEELicense:204]  Running script. to determine licensing: /data/oracle/product/11.1/database/bin/b
ndlchk
.
.
.
[Thread-7] [17:47:44:658] [StreamReader.run:65]  OUTPUT>ser1    1
[Thread-7] [17:47:44:661] [StreamReader.run:65]  OUTPUT>ser2    2
[main] [17:47:44:668] [RuntimeExec.runCommand:144]  runCommand: process returns 0
[main] [17:47:44:668] [RuntimeExec.runCommand:161]  RunTimeExec: output>
[main] [17:47:44:668] [RuntimeExec.runCommand:164]  ser1        1
[main] [17:47:44:668] [RuntimeExec.runCommand:164]  ser2        2
[main] [17:47:44:669] [RuntimeExec.runCommand:170]  RunTimeExec: error>
[main] [17:47:44:669] [RuntimeExec.runCommand:192]  Returning from RunTimeExec.runCommand
[main] [17:47:44:669] [ClusterInfo.getClusterNodeMap:960]  Number of nodes=2
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972]  i=0 nodeName=ser1 nodeNum=1
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972]  i=1 nodeName=ser2 nodeNum=2
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2968]  firstNodeNum=1
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2991]  node={"ser1" nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006]  nodeNames[0]={"ser1" instance number=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:2991]  node="ser2"} nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006]  nodeNames[1]="ser2"} instance number=null
Exception in thread "main" java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:415)
        at java.lang.Integer.parseInt(Integer.java:497)
        at oracle.sysman.assistants.util.step.StepContext.getDBInstanceNumbers(StepContext.java:3038)
        at oracle.sysman.assistants.dbca.backend.Host.createOPSConfiguration(Host.java:957)
        at oracle.sysman.assistants.dbca.backend.SilentHost.performOperation(SilentHost.java:186)
        at oracle.sysman.assistants.dbca.backend.Host.startOperation(Host.java:3090)
        at oracle.sysman.assistants.dbca.Dbca.execute(Dbca.java:115)
        at oracle.sysman.assistants.dbca.Dbca.main(Dbca.java:180)

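According to the error log, the problem once again seems to lie with the braces. After removing the braces and retrying, the problem was solved. Oracle's own internal tools do not follow a consistent standard, and there is this obvious difference between OUI and DBCA. Most critically, the comment at the top of Oracle's own dbca response file says: StringList :  {"",""}. All one can say is that on this point Oracle's dbca handling has a bug.

Note that, as with ASM earlier, not only the braces but also the double quotes must be removed.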
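Since the same NODELIST mistake came up twice, a quick pre-flight check of the DBCA response file may be worthwhile. The following is a sketch only: check_nodelist is a hypothetical name, and it assumes a grep that supports -E (on Solaris 10, /usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
# check_nodelist FILE   (hypothetical helper)
# Prints any uncommented NODELIST line that contains braces or double
# quotes and returns non-zero if one is found.
check_nodelist() {
    if grep -E '^[[:space:]]*NODELIST[[:space:]]*=.*[{}"]' "$1"; then
        echo "NODELIST should be a plain comma-separated list, e.g. NODELIST=ser1,ser2" >&2
        return 1
    fi
    return 0
}
```

It could be run as check_nodelist /data/database/response/my_db.rsp before calling dbca -silent.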

 

 

From the ITPUB blog, link: http://blog.itpub.net/8494287/viewspace-1410421/. If you reproduce this article, please credit the source; otherwise legal action will be pursued.
