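Troubleshooting a CRS problem on a customer's Oracle 10.2.0.4 system
After the customer replaced the HBA cards and the Fibre Channel switch ports, they found that the database would not come up. The troubleshooting process follows.
Customer environment: two IBM p570 servers, AIX 6100-04-01-0944, Oracle 10.2.0.4.
Logging in remotely, I found that the file system holding the Oracle software on node 2 was 100% full. Some process had clearly been writing nonstop and had filled up the LV.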
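A minimal sketch of how a full file system like this could be spotted from a script. It assumes POSIX-format `df -Pk` output (six columns, with the capacity in column 5 and the mount point in column 6); the function name `full_filesystems` is hypothetical and not part of any Oracle or AIX tool.

```shell
# Print "mountpoint capacity" for every file system at or above a threshold.
# Reads `df -Pk` style output on stdin; assumed columns:
# Filesystem 1024-blocks Used Available Capacity Mounted-on
full_filesystems() {
    threshold="$1"
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # strip the percent sign
        if (pct + 0 >= t + 0) print $6, $5
    }'
}

# Typical use on a live system (not run here):
#   df -Pk | full_filesystems 95
```

Reading from stdin keeps the parsing separate from the `df` call, so the same function can be run against saved output from either node.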
Looking at the CRS log (crsd.log), almost every entry looked like this:
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3980]=0x0
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3981]=0x0
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3982]=0x0
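This error turned up nothing on Google, so I went to MOS (My Oracle Support). MOS had little on it either, but I did find a few similar issues, described as a 10.2.0.4 bug.
I first checked the CRS alert log and found the key information: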
[crsd(201070)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.215
[crsd(164818)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
So there is a problem with the disk.
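As a rough sketch, the affected OCR devices can be pulled out of the alert log text with `sed`. The function name `inaccessible_ocr_locations` and the `$ALERT_LOG` variable are hypothetical; the message format matched here is the CRS-1006 line shown above.

```shell
# List the distinct OCR locations reported as inaccessible (CRS-1006)
# in CRS alert log text read from stdin.
inaccessible_ocr_locations() {
    sed -n 's/.*CRS-1006:The OCR location \([^ ]*\) is inaccessible.*/\1/p' | sort -u
}

# Typical use (ALERT_LOG is a placeholder for the node's CRS alert log):
#   inaccessible_ocr_locations < "$ALERT_LOG"
```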
On node 2, the attributes, ownership and permissions of hdisk2 and hdisk6 (and of the other shared disks) all looked normal:
crw-rw---- 1 oracle oinstall 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle oinstall 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 oracle oinstall 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 oracle oinstall 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 oracle oinstall 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 oracle oinstall 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root oinstall 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 oracle oinstall 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 oracle oinstall 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 oracle oinstall 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root oinstall 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 oracle oinstall 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 oracle oinstall 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 oracle oinstall 24, 9 Oct 03 09:09 /dev/rhdisk9
Then I checked node 1:
crw-rw---- 1 oracle system 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle system 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 root system 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 root system 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 root system 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 root system 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root system 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 root system 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 root system 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 root system 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root system 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 root system 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 root system 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 root system 24, 9 Oct 03 09:09 /dev/rhdisk9
I changed the owner and group of the disks on node 1 to match node 2:
crw-rw---- 1 oracle oinstall 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle oinstall 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 oracle oinstall 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 oracle oinstall 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 oracle oinstall 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 oracle oinstall 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root oinstall 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 oracle oinstall 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 oracle oinstall 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 oracle oinstall 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root oinstall 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 oracle oinstall 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 oracle oinstall 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 oracle oinstall 24, 9 Oct 03 09:09 /dev/rhdisk9
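Comparing long listings like these by eye is error-prone. The sketch below reduces `ls -l` output for character devices to `device owner:group` pairs so that the results from the two nodes can be diffed. The function name `device_owners` and the output file names in the usage comment are hypothetical.

```shell
# Reduce `ls -l /dev/rhdisk*` output (stdin) to "device owner:group" pairs,
# sorted by device name. Only character-device lines (mode starting
# with "c") are considered.
device_owners() {
    awk '$1 ~ /^c/ { print $NF, $3 ":" $4 }' | sort
}

# Typical use, run once on each node and then compared (not run here):
#   ls -l /dev/rhdisk* | device_owners > /tmp/owners.node1
#   diff /tmp/owners.node1 /tmp/owners.node2
```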
But node 2 still would not come up and kept reporting the same error.
Next I compared the attributes of hdisk2 and hdisk6 on both nodes. Node 2 first:
PCM PCM/friend/otherapdisk Path Control Module False
PR_key_value none Persistant Reserve Key Value True
algorithm fail_over Algorithm True
autorecovery no Path/Ownership Autorecovery True
clr_q no Device CLEARS its Queue on error True
cntl_delay_time 0 Controller Delay Time True
cntl_hcheck_int 0 Controller Health Check Interval True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd inquiry Health Check Command True
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x200400a0b811758c FC Node Name False
pvid none Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 10 Queue DEPTH True
reassign_to 120 REASSIGN time out value True
reserve_policy no_reserve Reserve Policy True
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x10300 SCSI ID False
start_timeout 60 START unit time out value True
unique_id 3E213600A0B800011758C0000C04C4BBE8D500F1815 FAStT03IBMfcp Unique device identifier False
ww_name 0x201500a0b811758c FC World Wide Name False
Now node 1:
PCM PCM/friend/otherapdisk Path Control Module False
PR_key_value none Persistant Reserve Key Value True
algorithm fail_over Algorithm True
autorecovery no Path/Ownership Autorecovery True
clr_q no Device CLEARS its Queue on error True
cntl_delay_time 0 Controller Delay Time True
cntl_hcheck_int 0 Controller Health Check Interval True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd inquiry Health Check Command True
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x200400a0b811758c FC Node Name False
pvid none Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 10 Queue DEPTH True
reassign_to 120 REASSIGN time out value True
reserve_policy single_path Reserve Policy True
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x10300 SCSI ID False
start_timeout 60 START unit time out value True
unique_id 3E213600A0B800011758C0000C04C4BBE8D500F1815 FAStT03IBMfcp Unique device identifier False
ww_name 0x201500a0b811758c FC World Wide Name False
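On node 1, hdisk2 and hdisk6 (the OCR disks) had reserve_policy set to single_path, when shared disks should be set to no_reserve. It then turned out that all of the RAC disks on node 1 were set this way.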
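The two attribute listings differ in a single line, which is easy to miss. A small sketch, assuming `lsattr -El hdiskN` output in the format shown above, that reports only the disks whose reserve_policy is not no_reserve. Both function names are hypothetical.

```shell
# Print the reserve_policy value from `lsattr -El hdiskN` output on stdin.
reserve_policy_of() {
    awk '$1 == "reserve_policy" { print $2 }'
}

# Report every listed disk number whose policy is not no_reserve.
# Calls the AIX lsattr command, so it is only defined here, not invoked.
check_shared_disks() {
    for n in "$@"; do
        policy=$(lsattr -El "hdisk$n" | reserve_policy_of)
        if [ "$policy" != "no_reserve" ]; then
            echo "hdisk$n: $policy"
        fi
    done
}

# Typical use on each node (not run here):
#   check_shared_disks 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```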
I changed them right away,
as the root user:
for i in 2 3 4 5 6 7 8 9 10 11 12 13 14 15
do chdev -l hdisk$i -a reserve_policy=no_reserve
done
hdisk2 and hdisk6 could not be changed because the devices were busy:
0514-062 Cannot perform the requested function because the
specified device is busy.
Removing the disk did not work either:
# rmdev -dl hdisk6
Method error (/usr/lib/methods/ucfgdevice):
0514-062 Cannot perform the requested function because the
specified device is busy.
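One alternative worth noting, which is not what was done in this case: AIX `chdev` accepts a `-P` flag that records the change in the ODM only and applies it the next time the system restarts, which is intended for devices that cannot be changed while in use. The sketch below only prints the commands by default. The function name `set_no_reserve` and the `DEFER` and `RUN` variables are hypothetical, and whether a deferred change suits a given storage driver should be verified first.

```shell
# Print (default) or execute chdev commands that set reserve_policy=no_reserve.
# DEFER=1 adds -P so the change is stored in the ODM and applied at restart.
# RUN=1 executes the commands instead of printing them (AIX only).
set_no_reserve() {
    for n in "$@"; do
        cmd="chdev -l hdisk$n -a reserve_policy=no_reserve"
        if [ "${DEFER:-0}" = "1" ]; then
            cmd="$cmd -P"
        fi
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Dry run for the two busy OCR disks (prints the commands only):
#   DEFER=1 set_no_reserve 2 6
```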
I figured node 2 must be holding the OCR disks, which is why nothing I tried was allowed.
Checking the CRS processes:
oracle 196786 155908 0 09:05:18 - 0:00 /oracle/product/10.2.0/crs/bin/oclsomon.bin
root 103266 102694 1 09:05:17 - 0:47 /oracle/product/10.2.0/crs/bin/crsd.bin reboot
oracle 107362 192550 0 09:05:19 - 0:05 /oracle/product/10.2.0/crs/bin/ocssd.bin
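A plain `grep crs` also matches the `init.crsd` wrapper script and the grep process itself. The sketch below filters `ps -ef` output down to the Clusterware daemon binaries only. The function name `crs_daemons` is hypothetical; the daemon names are the 10g ones seen in this post plus `evmd.bin`.

```shell
# Keep only Oracle Clusterware daemon lines from `ps -ef` output on stdin.
# Matches the *.bin daemons, not the /etc/init.* wrapper scripts.
crs_daemons() {
    grep -E '(crsd|ocssd|evmd|oclsomon)\.bin'
}

# Typical use (not run here); no output means no daemon is running:
#   ps -ef | crs_daemons
```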
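I stopped CRS on node 1, but the CRS processes were still there. Note that ever since the HBA was replaced a few days earlier, stopping CRS manually with crsctl stop crs no longer seemed to work on node 1.
Even after rebooting node 1, I still could not change the disks or stop CRS. So, as root, I disabled CRS autostart and then rebooted both machines.
As the root user on all nodes: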
cd /etc/
# ./init.crs disable crs
After the reboot there were no CRS daemons running at all. I tried changing the disk attributes on node 1 again, and this time it worked:
# ps -ef |grep crs
root 102694 1 0 08:54:41 - 0:00 /bin/sh /etc/init.crsd run
root 151958 180262 0 08:59:54 pts/0 0:00 grep crs
# chdev -l hdisk2 -a reserve_policy=no_reserve
hdisk2 changed
# chdev -l hdisk6 -a reserve_policy=no_reserve
hdisk6 changed
#
Now start CRS on node 2:
# ./crsctl start crs
Check the CRS alert log:
[crsd(201070)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.215
[crsd(164818)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.476
[crsd(164818)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870336 to 169870336. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.477
[crsd(164818)]CRS-1012:The OCR service started on node jxsmdb2.
2014-10-02 22:32:28.751
[crsd(164818)]CRS-1201:CRSD started on node jxsmdb2.
[cssd(70408)]CRS-1603:CSSD on node jxsmdb2 shutdown by user.
2014-10-03 09:05:23.615
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk4. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
2014-10-03 09:05:23.815
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk3. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
2014-10-03 09:05:23.815
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk5. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
[cssd(107362)]CRS-1601:CSSD Reconfiguration complete. Active nodes are jxsmdb2 .
2014-10-03 09:08:44.541
[evmd(99266)]CRS-1401:EVMD started on node jxsmdb2.
2014-10-03 09:08:44.585
[crsd(103266)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870336 to 169870336. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-03 09:08:44.586
[crsd(103266)]CRS-1012:The OCR service started on node jxsmdb2.
2014-10-03 09:08:46.874
[crsd(103266)]CRS-1201:CRSD started on node jxsmdb2.
2014-10-03 09:08:47.163
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:08:47.183
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:43.287
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:43.297
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:45.746
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
Check crsd.log:
2014-10-03 09:05:19.356: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:19.357: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:20.702: [ COMMCRS][261]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:20.702: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:20.702: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:22.041: [ COMMCRS][263]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:22.041: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:22.041: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:23.380: [ COMMCRS][265]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:23.380: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:23.380: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:08:44.482: [ CLSVER][1]32Active Version from OCR:10.2.0.4.0
2014-10-03 09:08:44.482: [ CLSVER][1]32Active Version and Software Version are same
2014-10-03 09:08:44.485: [ CRSMAIN][1]32Initializing OCR
2014-10-03 09:08:44.491: [ OCRRAW][1]proprioo: for disk 0 (/dev/rhdisk2), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.491: [ OCRRAW][1]proprioo: for disk 1 (/dev/rhdisk6), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.574: [ OCRMAS][3352]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 2
2014-10-03 09:08:44.575: [ OCRRAW][3352]proprioo: for disk 0 (/dev/rhdisk2), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.575: [ OCRRAW][3352]proprioo: for disk 1 (/dev/rhdisk6), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.596: [ OCRMAS][3352]th_master: Deleted ver keys from cache (master)
2014-10-03 09:08:44.596: [ CRSD][1]32ENV Logging level for Module: allcomp 0
2014-10-03 09:08:44.597: [ CRSD][1]32ENV Logging level for Module: default 0
2014-10-03 09:08:44.598: [ CRSD][1]32ENV Logging level for Module: COMMCRS 0
2014-10-03 09:08:44.598: [ CRSD][1]32ENV Logging level for Module: COMMNS 0
2014-10-03 09:08:44.599: [ CRSD][1]32ENV Logging level for Module: CRSUI 0
2014-10-03 09:08:44.600: [ CRSD][1]32ENV Logging level for Module: CRSCOMM 0
2014-10-03 09:08:44.600: [ CRSD][1]32ENV Logging level for Module: CRSRTI 0
2014-10-03 09:08:44.601: [ CRSD][1]32ENV Logging level for Module: CRSMAIN 0
2014-10-03 09:08:44.602: [ CRSD][1]32ENV Logging level for Module: CRSPLACE 0
2014-10-03 09:08:44.603: [ CRSD][1]32ENV Logging level for Module: CRSAPP 0
2014-10-03 09:08:44.603: [ CRSD][1]32ENV Logging level for Module: CRSRES 0
2014-10-03 09:08:44.604: [ CRSD][1]32ENV Logging level for Module: CRSOCR 0
2014-10-03 09:08:44.605: [ CRSD][1]32ENV Logging level for Module: CRSTIMER 0
2014-10-03 09:08:44.605: [ CRSD][1]32ENV Logging level for Module: CRSEVT 0
2014-10-03 09:08:44.606: [ CRSD][1]32ENV Logging level for Module: CRSD 0
2014-10-03 09:08:44.607: [ CRSD][1]32ENV Logging level for Module: CLUCLS 0
2014-10-03 09:08:44.607: [ CRSD][1]32ENV Logging level for Module: CLSVER 0
2014-10-03 09:08:44.608: [ CRSD][1]32ENV Logging level for Module: OCRRAW 0
2014-10-03 09:08:44.609: [ CRSD][1]32ENV Logging level for Module: OCROSD 0
2014-10-03 09:08:44.609: [ CRSD][1]32ENV Logging level for Module: CSSCLNT 0
2014-10-03 09:08:44.610: [ CRSD][1]32ENV Logging level for Module: OCRAPI 0
2014-10-03 09:08:44.611: [ CRSD][1]32ENV Logging level for Module: OCRUTL 0
2014-10-03 09:08:44.612: [ CRSD][1]32ENV Logging level for Module: OCRMSG 0
2014-10-03 09:08:44.612: [ CRSD][1]32ENV Logging level for Module: OCRCLI 0
2014-10-03 09:08:44.613: [ CRSD][1]32ENV Logging level for Module: OCRCAC 0
2014-10-03 09:08:44.614: [ CRSD][1]32ENV Logging level for Module: OCRSRV 0
2014-10-03 09:08:44.614: [ CRSD][1]32ENV Logging level for Module: OCRMAS 0
2014-10-03 09:08:44.615: [ CRSMAIN][1]32Filename is /oracle/product/10.2.0/crs/crs/init/jxsmdb2.pid
2014-10-03 09:08:44.651: [ CRSMAIN][1]32Using Authorizer location: /oracle/product/10.2.0/crs/crs/auth/
[ clsdmt][8235]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=jxsmdb2DBG_CRSD))
2014-10-03 09:08:44.667: [ CRSMAIN][1]32Initializing RTI
2014-10-03 09:08:44.719: [CRSTIMER][8749]32Timer Thread Starting.
2014-10-03 09:08:44.740: [ CRSRES][1]32Parameter SECURITY = 1, running in USER Mode
2014-10-03 09:08:44.743: [ CRSMAIN][1]32Initializing EVMMgr
2014-10-03 09:08:44.942: [ COMMCRS][9006]clsc_connect: (1139c41d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2014-10-03 09:08:46.745: [ CRSMAIN][1]32CRSD locked during state recovery, please wait.
2014-10-03 09:08:46.824: [ CRSMAIN][1]32CRSD recovered, unlocked.
2014-10-03 09:08:46.847: [ CRSMAIN][1]32QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2014-10-03 09:08:46.847: [ CRSMAIN][1]32QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2014-10-03 09:08:46.855: [ CRSMAIN][1]32CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2014-10-03 09:08:46.873: [ CRSMAIN][1]32E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=jxsmdb2_priv)(PORT=49896))
2014-10-03 09:08:46.873: [ CRSMAIN][1]32Starting Threads
2014-10-03 09:08:46.874: [ CRSMAIN][10292]32Starting runCommandServer for (UI = 1, E2E = 0). 0
2014-10-03 09:08:46.874: [ CRSMAIN][10549]32Starting runCommandServer for (UI = 1, E2E = 0). 1
2014-10-03 09:08:46.874: [ CRSMAIN][1]32CRS Daemon Started.
2014-10-03 09:08:46.888: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.901: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.911: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.925: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.934: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.942: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.950: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.958: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.966: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.974: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.983: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.991: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.999: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:47.173: [ CRSRES][11834]32startRunnable: setting CLI values
2014-10-03 09:08:47.188: [ CRSRES][11834]32Attempting to start `ora.jxsmdb2.vip` on member `jxsmdb2`
2014-10-03 09:08:47.189: [ CRSRES][11577]32startRunnable: setting CLI values
2014-10-03 09:08:47.199: [ CRSRES][11577]32Attempting to start `ora.jxsmdb2.ASM2.asm` on member `jxsmdb2`
2014-10-03 09:08:49.742: [ CRSRES][11834]32Start of `ora.jxsmdb2.vip` on member `jxsmdb2` succeeded.
2014-10-03 09:08:49.775: [ CRSRES][11834]32startRunnable: setting CLI values
2014-10-03 09:08:49.783: [ CRSRES][11834]32Attempting to start `ora.jxsmdb2.LISTENER_JXSMDB2.lsnr` on member `jxsmdb2`
2014-10-03 09:08:53.948: [ CRSRES][11834]32Start of `ora.jxsmdb2.LISTENER_JXSMDB2.lsnr` on member `jxsmdb2` succeeded.
2014-10-03 09:08:54.410: [ CRSRES][12619]32CRS-1002: Resource 'ora.jxsmdb2.LISTENER_JXSMDB2.lsnr' is already running on member 'jxsmdb2'
2014-10-03 09:09:08.992: [ CRSRES][12625]32startRunnable: setting CLI values
2014-10-03 09:09:08.999: [ CRSRES][12625]32Attempting to start `ora.jxsmdb2.ons` on member `jxsmdb2`
2014-10-03 09:09:11.139: [ CRSRES][12625]32Start of `ora.jxsmdb2.ons` on member `jxsmdb2` succeeded.
2014-10-03 09:09:11.216: [ CRSRES][11577]32Start of `ora.jxsmdb2.ASM2.asm` on member `jxsmdb2` succeeded.
2014-10-03 09:09:11.239: [ CRSRES][11577]32startRunnable: setting CLI values
2014-10-03 09:09:11.244: [ CRSRES][11577]32Attempting to start `ora.jxsmk.jxsmk2.inst` on member `jxsmdb2`
2014-10-03 09:09:43.269: [ CRSRES][11577]32Start of `ora.jxsmk.jxsmk2.inst` on member `jxsmdb2` succeeded.
2014-10-03 09:09:43.277: [ CRSRES][12894]32Skip online resource: ora.jxsmdb2.ons
2014-10-03 09:09:43.319: [ CRSRES][13151]32startRunnable: setting CLI values
2014-10-03 09:09:43.345: [ CRSRES][12637]32startRunnable: setting CLI values
2014-10-03 09:09:43.349: [ CRSRES][13151]32Attempting to start `ora.jxsmk.db` on member `jxsmdb2`
2014-10-03 09:09:43.358: [ CRSRES][11610]32startRunnable: setting CLI values
2014-10-03 09:09:43.365: [ CRSRES][12637]32Attempting to start `ora.jxsmdb2.gsd` on member `jxsmdb2`
2014-10-03 09:09:43.371: [ CRSRES][11610]32Attempting to start `ora.jxsmdb1.vip` on member `jxsmdb2`
2014-10-03 09:09:43.916: [ CRSRES][13151]32Start of `ora.jxsmk.db` on member `jxsmdb2` succeeded.
2014-10-03 09:09:44.378: [ CRSRES][12637]32Start of `ora.jxsmdb2.gsd` on member `jxsmdb2` succeeded.
2014-10-03 09:09:44.416: [ CRSRES][13668]32CRS-1002: Resource 'ora.jxsmk.db' is already running on member 'jxsmdb2'
2014-10-03 09:09:45.730: [ CRSRES][11610]32Start of `ora.jxsmdb1.vip` on member `jxsmdb2` succeeded.
Check ocssd.log:
jxsmdb2->cd cssd
jxsmdb2->tail -f ocssd.log
[ CSSD]2014-10-03 09:05:23.603 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2014-10-03 09:05:23.692 [2829] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=jxsmdb2_priv)(PORT=49895))
[ CSSD]2014-10-03 09:05:23.699 [2829] >TRACE: clssnmconnect: connecting to node(1), con(1112d8b10), flags 0x0003
[ CSSD]2014-10-03 09:05:23.700 [2829] >TRACE: clssnmDiscHelper: jxsmdb1, node(1) connection failed, con (1112d8b10), probe(0)
[ CSSD]2014-10-03 09:05:23.741 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_2))
[ CSSD]2014-10-03 09:05:23.741 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
[ CSSD]2014-10-03 09:05:23.752 [3857] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=25)(HOST=191.191.191.101)(PORT=32823))
[ CSSD]2014-10-03 09:05:23.804 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(3) wrtcnt(78639) LATS(190241056) Disk lastSeqNo(78639)
[ CSSD]2014-10-03 09:05:30.781 [4628] >TRACE: clssnmRcfgMgrThread: Local Join
[ CSSD]2014-10-03 09:08:44.082 [4628] >WARNING: clssnmLocalJoinEvent: takeover succ
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: Initiating sync 1
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (27000)ms
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSendSync: syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmHandleSync: diskTimeout set to (27000)ms
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[jxsmdb2] seq[1] sync[1]
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: done, msg type(13)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmCheckDskInfo: diskTimeout set to (200000)ms
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmEvict: Start
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSendUpdate: syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmWaitForAcks: Ack message type(15), ackCount(1)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 1, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 2, state (2/3) unique (1412298321/1412298321) prevConuni(0) birth (1/1) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >USER: clssnmHandleUpdate: SYNC(1) from node(2) completed
[ CSSD]2014-10-03 09:08:44.083 [2829] >USER: clssnmHandleUpdate: NODE 2 (jxsmdb2) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmWaitForAcks: done, msg type(15)
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmDoSyncUpdate: Sync 1 complete!
[ CSSD]2014-10-03 09:08:44.101 [1] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmReconfigThread: started for reconfig (1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >USER: NMEVENT_RECONFIG [00][00][00][04]
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmEstablishConnections: 1 nodes in cluster incarn 1
[ CSSD]2014-10-03 09:08:44.105 [3857] >TRACE: clssgmPeerListener: connects done (1/1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmEstablishMasterNode: MASTER for 1 is node(2) birth(1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 1 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 2
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmReconfigThread: completed for reconfig(1), with status(1)
[ CSSD]2014-10-03 09:08:44.266 [3086] >TRACE: clssgmCommonAddMember: clsomon joined (2/0x1000000/#CSS_CLSSOMON
Check the CRS resources:
crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE OFFLINE
ora....B1.lsnr application ONLINE OFFLINE
ora....db1.gsd application ONLINE OFFLINE
ora....db1.ons application ONLINE OFFLINE
ora....db1.vip application ONLINE ONLINE jxsmdb2
ora....SM2.asm application ONLINE ONLINE jxsmdb2
ora....B2.lsnr application ONLINE ONLINE jxsmdb2
ora....db2.gsd application ONLINE ONLINE jxsmdb2
ora....db2.ons application ONLINE ONLINE jxsmdb2
ora....db2.vip application ONLINE ONLINE jxsmdb2
ora.jxsmk.db application ONLINE ONLINE jxsmdb2
ora....k1.inst application ONLINE OFFLINE
ora....k2.inst application ONLINE ONLINE jxsmdb2
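To pick out what is still down, the listing can be filtered for resources whose target is ONLINE but whose state is not. This is a minimal sketch that assumes the five-column `crs_stat -t` layout shown above; the function name `offline_resources` is hypothetical. Note that `crs_stat -t` abbreviates resource names, so the output carries the same abbreviations.

```shell
# From `crs_stat -t` output on stdin, print the names of resources whose
# Target column is ONLINE but whose State column is not ONLINE.
offline_resources() {
    awk '$3 == "ONLINE" && $4 != "ONLINE" { print $1 }'
}

# Typical use (not run here):
#   crs_stat -t | offline_resources
```

Against the listing above this would leave only the node 1 resources (ASM, listener, GSD, ONS and the jxsmk1 instance), which is expected while CRS on node 1 is still down.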
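The database was finally up on node 2.
In hindsight, the bug described on MOS was probably being triggered by the OCR being inaccessible. The patch recommended on MOS presumably applies when the disks, hardware and operating system are all healthy.
Start CRS on node 1: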
->tail -f al*
[cssd(143604)]CRS-1605:CSSD voting file is online: /dev/rhdisk5. Details in /oracle/product/10.2.0/crs/log/jxsmdb1/cssd/ocssd.log.
[cssd(143604)]CRS-1601:CSSD Reconfiguration complete. Active nodes are jxsmdb1 jxsmdb2 .
2014-09-30 17:39:11.803
[crsd(139286)]CRS-1012:The OCR service started on node jxsmdb1.
2014-09-30 17:39:12.848
[evmd(151694)]CRS-1401:EVMD started on node jxsmdb1.
2014-09-30 17:39:15.807
[crsd(139286)]CRS-1201:CRSD started on node jxsmdb1.
2014-10-02 00:05:49.042
[crsd(159746)]CRS-1011:OCR cannot determine that the OCR content contains the latest updates. Details in /oracle/product/10.2.0/crs/log/jxsmdb1/crsd/crsd.log.
Terminated
As you can see, CRS is up.
From the ITPUB blog: http://blog.itpub.net/26175573/viewspace-1290649/. If you repost this article, please credit the source; otherwise legal action may be taken.