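Troubleshooting a CRS problem on a customer's Oracle 10.2.0.4 system
After the customer replaced the HBA cards and the ports on the fibre channel switch, the database was found not to have come back up. The troubleshooting process follows.
Customer environment: two IBM p570 servers, OS level 6100-04-01-0944, Oracle 10.2.0.4.
Logging in remotely, I found that the file system holding the Oracle software on node 2 was already 100% full. Some process must have been writing furiously and filled the LV.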
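A quick way to confirm which file system is full is to filter `df` output by usage. The following is only a sketch: the function name `full_filesystems` and the 95% default are made up for illustration, and it assumes the POSIX output format produced by `df -Pk`.

```shell
#!/bin/sh
# Print "<mount point> <capacity>" for file systems at or above a threshold.
# Reads POSIX-format `df -Pk` output on stdin:
#   Filesystem 1024-blocks Used Available Capacity Mounted on
# full_filesystems is a hypothetical helper name, not an AIX or Oracle tool.
full_filesystems() {
    threshold="${1:-95}"
    awk -v limit="$threshold" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)            # "100%" -> "100"
            if (pct + 0 >= limit + 0) print $6, $5
        }'
}

# Typical use on the affected node (not run here):
#   df -Pk | full_filesystems 95
#   du -sk /oracle/* | sort -n | tail   # then drill into the largest directory
```

The `/oracle` path in the usage comment is taken from the log paths quoted in this post. Mount points that contain spaces would need a more careful parser.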
Looking at the CRS log (crsd.log), almost every entry was the following:
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3980]=0x0
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3981]=0x0
2014-10-02 21:54:15.523: [ OCRRAW][1]proprdc_propr_fcl: proprhandle_fcl->propr_fcl_page[3982]=0x0
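Google turned up nothing for this message, so I went to MOS (My Oracle Support). There was not much there either, but I found a few similar issues that were attributed to a 10.2.0.4 bug.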
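To check whether a single component really is flooding crsd.log, the lines can be counted per component tag. This is a rough sketch: `count_by_component` is a hypothetical helper, and it assumes the `[ TAG][thread]` layout seen in the crsd.log excerpts in this post.

```shell
#!/bin/sh
# Count log lines per component tag, i.e. the text inside the first [...]
# on each line, and print "<count> <tag>" with the busiest tag first.
# count_by_component is a hypothetical helper name.
count_by_component() {
    awk '
        match($0, /\[ *[A-Za-z]+\]/) {
            tag = substr($0, RSTART + 1, RLENGTH - 2)
            gsub(/ /, "", tag)           # "[ OCRRAW]" -> "OCRRAW"
            n[tag]++
        }
        END { for (t in n) print n[t], t }' | sort -rn
}

# Typical use (not run here):
#   count_by_component < /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log
```

If one tag such as OCRRAW dominates the counts, that supports the suspicion that this message is what filled the file system.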
First I looked at the CRS alert log, which held the key information:
[crsd(201070)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.215
[crsd(164818)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
So there is a problem with the disk.
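The CRS-1006 entries can be pulled out of the alert log mechanically. The sketch below assumes the exact message wording shown in the alert log excerpt; `inaccessible_ocr_locations` and `ALERT_LOG` are hypothetical names.

```shell
#!/bin/sh
# List each OCR location reported as inaccessible (CRS-1006) together with
# the number of times it was reported. Reads a CRS alert log on stdin.
# inaccessible_ocr_locations is a hypothetical helper name.
inaccessible_ocr_locations() {
    sed -n 's/.*CRS-1006:The OCR location \([^ ]*\) is inaccessible.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use (ALERT_LOG is a placeholder for the alert log path on the node):
#   inaccessible_ocr_locations < "$ALERT_LOG"
```

Each output line has the form `<device> <count>`, which makes it easy to see whether only one OCR device is affected or both.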
On node 2, the owner, group, and permissions of the disks, including hdisk2 and hdisk6, all looked normal:
crw-rw---- 1 oracle oinstall 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle oinstall 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 oracle oinstall 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 oracle oinstall 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 oracle oinstall 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 oracle oinstall 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root oinstall 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 oracle oinstall 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 oracle oinstall 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 oracle oinstall 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root oinstall 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 oracle oinstall 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 oracle oinstall 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 oracle oinstall 24, 9 Oct 03 09:09 /dev/rhdisk9
Then node 1:
crw-rw---- 1 oracle system 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle system 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 root system 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 root system 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 root system 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 root system 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root system 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 root system 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 root system 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 root system 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root system 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 root system 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 root system 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 root system 24, 9 Oct 03 09:09 /dev/rhdisk9
I changed the owner and group of the disks on node 1 to match node 2:
crw-rw---- 1 oracle oinstall 24, 10 Oct 03 09:55 /dev/rhdisk10
crw-rw---- 1 oracle oinstall 24, 11 Oct 03 09:55 /dev/rhdisk11
crw-rw---- 1 oracle oinstall 24, 12 Oct 03 09:48 /dev/rhdisk12
crw-rw---- 1 oracle oinstall 24, 13 Oct 03 09:48 /dev/rhdisk13
crw-rw---- 1 oracle oinstall 24, 14 Oct 03 09:47 /dev/rhdisk14
crw-rw---- 1 oracle oinstall 24, 15 Oct 03 09:45 /dev/rhdisk15
crw-rw---- 1 root oinstall 24, 2 Oct 03 09:55 /dev/rhdisk2
crw-rw---- 1 oracle oinstall 24, 3 Oct 03 09:55 /dev/rhdisk3
crw-rw---- 1 oracle oinstall 24, 4 Oct 03 09:55 /dev/rhdisk4
crw-rw---- 1 oracle oinstall 24, 5 Oct 03 09:55 /dev/rhdisk5
crw-rw---- 1 root oinstall 24, 6 Oct 03 09:55 /dev/rhdisk6
crw-rw---- 1 oracle oinstall 24, 7 Oct 03 09:55 /dev/rhdisk7
crw-rw---- 1 oracle oinstall 24, 8 Oct 03 09:30 /dev/rhdisk8
crw-rw---- 1 oracle oinstall 24, 9 Oct 03 09:09 /dev/rhdisk9
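One way to copy ownership from the healthy node is to turn its listing into `chown` commands and review them before running anything. This is a sketch only: `chown_cmds_from_listing` is a hypothetical helper, and it assumes the same hdisk number refers to the same LUN on both nodes, which should be verified first (for example by comparing `unique_id`).

```shell
#!/bin/sh
# Turn an `ls -l /dev/rhdisk*` listing (stdin) into chown commands that
# reproduce its owner and group. The commands are printed, never executed.
# chown_cmds_from_listing is a hypothetical helper name.
chown_cmds_from_listing() {
    # Character devices only: field 3 is the owner, field 4 the group,
    # and the last field is the device path.
    awk '$1 ~ /^c/ { print "chown " $3 ":" $4 " " $NF }'
}

# Typical use (not run here):
#   on the reference node:  ls -l /dev/rhdisk* | chown_cmds_from_listing > fix_owner.sh
#   on the other node:      review fix_owner.sh, then run it as root
```

Printing the commands instead of executing them keeps a reviewable record and avoids touching local disks such as rootvg members by accident.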
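But node 2 still would not come up and kept reporting the same error.
Next I compared the attributes of hdisk2 and hdisk6 on node 1 and node 2. Node 2 first: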
PCM PCM/friend/otherapdisk Path Control Module False
PR_key_value none Persistant Reserve Key Value True
algorithm fail_over Algorithm True
autorecovery no Path/Ownership Autorecovery True
clr_q no Device CLEARS its Queue on error True
cntl_delay_time 0 Controller Delay Time True
cntl_hcheck_int 0 Controller Health Check Interval True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd inquiry Health Check Command True
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x200400a0b811758c FC Node Name False
pvid none Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 10 Queue DEPTH True
reassign_to 120 REASSIGN time out value True
reserve_policy no_reserve Reserve Policy True
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x10300 SCSI ID False
start_timeout 60 START unit time out value True
unique_id 3E213600A0B800011758C0000C04C4BBE8D500F1815 FAStT03IBMfcp Unique device identifier False
ww_name 0x201500a0b811758c FC World Wide Name False
Then node 1:
PCM PCM/friend/otherapdisk Path Control Module False
PR_key_value none Persistant Reserve Key Value True
algorithm fail_over Algorithm True
autorecovery no Path/Ownership Autorecovery True
clr_q no Device CLEARS its Queue on error True
cntl_delay_time 0 Controller Delay Time True
cntl_hcheck_int 0 Controller Health Check Interval True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd inquiry Health Check Command True
hcheck_interval 60 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
location Location Label True
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x200400a0b811758c FC Node Name False
pvid none Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 10 Queue DEPTH True
reassign_to 120 REASSIGN time out value True
reserve_policy single_path Reserve Policy True
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x10300 SCSI ID False
start_timeout 60 START unit time out value True
unique_id 3E213600A0B800011758C0000C04C4BBE8D500F1815 FAStT03IBMfcp Unique device identifier False
ww_name 0x201500a0b811758c FC World Wide Name False
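On node 1, hdisk2 and hdisk6 (the OCR disks) have reserve_policy set to single_path, although by rights these disks should be shared (no_reserve, as on node 2). It then turned out that every RAC disk on node 1 was set this way.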
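Checking the reserve policy disk by disk is easy to script. The sketch below assumes the attribute listings come from `lsattr -El hdiskN`, which is what their layout suggests; `reserve_policy_of` is a hypothetical helper.

```shell
#!/bin/sh
# Print "<disk> <reserve_policy>" given a disk name as the first argument
# and that disk's `lsattr -El <disk>` output on stdin.
# reserve_policy_of is a hypothetical helper name.
reserve_policy_of() {
    disk="$1"
    awk -v d="$disk" '$1 == "reserve_policy" { print d, $2 }'
}

# Typical use on each node (not run here):
#   for i in 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#   do lsattr -El hdisk$i | reserve_policy_of hdisk$i
#   done
```

Running the same loop on both nodes and comparing the two outputs shows at a glance which disks are not set to no_reserve.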
Fix it right away.
As the root user:
for i in 2 3 4 5 6 7 8 9 10 11 12 13 14 15
do chdev -l hdisk$i -a reserve_policy=no_reserve
done
hdisk2 and hdisk6 could not be changed because the devices were busy:
0514-062 Cannot perform the requested function because the
specified device is busy.
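When some disks fail with 0514-062, it helps to have the loop report exactly which ones were changed. This is a minimal sketch: `set_no_reserve` and the `CHDEV` override are hypothetical names introduced so the loop can be rehearsed with a stub instead of the real `chdev`.

```shell
#!/bin/sh
# Apply reserve_policy=no_reserve to the given hdisk numbers and report
# which ones failed (for example with 0514-062, device busy).
# CHDEV can be overridden so the loop can be rehearsed without touching devices.
CHDEV="${CHDEV:-chdev}"

# set_no_reserve is a hypothetical helper name.
set_no_reserve() {
    failed=""
    for i in "$@"; do
        if "$CHDEV" -l "hdisk$i" -a reserve_policy=no_reserve >/dev/null 2>&1; then
            echo "hdisk$i: changed"
        else
            echo "hdisk$i: FAILED"
            failed="$failed hdisk$i"
        fi
    done
    [ -z "$failed" ]    # non-zero exit status if any disk failed
}

# Typical use as root (not run here):
#   set_no_reserve 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```

AIX `chdev` also accepts `-P`, which records the change in the ODM only so that it takes effect the next time the device is configured, typically after a reboot. The post does not use it, so treat it as an untested alternative for busy devices.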
Removing the disk did not work either:
# rmdev -dl hdisk6
Method error (/usr/lib/methods/ucfgdevice):
0514-062 Cannot perform the requested function because the
specified device is busy.
I figured node 2 must be holding the OCR disks, which is why nothing I tried was allowed.
Checking the CRS processes:
oracle 196786 155908 0 09:05:18 - 0:00 /oracle/product/10.2.0/crs/bin/oclsomon.bin
root 103266 102694 1 09:05:17 - 0:47 /oracle/product/10.2.0/crs/bin/crsd.bin reboot
oracle 107362 192550 0 09:05:19 - 0:05 /oracle/product/10.2.0/crs/bin/ocssd.bin
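To tell real Clusterware daemons apart from wrapper scripts and the `grep` itself, the `ps -ef` output can be filtered on the daemon binary names. A sketch, assuming the 10.2 binary names seen in this post plus `evmd.bin`; `crs_daemons` is a hypothetical helper.

```shell
#!/bin/sh
# From `ps -ef` output on stdin, print "<pid> <binary>" for Clusterware
# daemon binaries (crsd.bin, ocssd.bin, evmd.bin, oclsomon.bin).
# crs_daemons is a hypothetical helper name.
crs_daemons() {
    awk '{
        # The command normally starts at field 8; scanning onward also copes
        # with a two-word STIME such as "Oct 03".
        for (i = 8; i <= NF; i++)
            if ($i ~ /\/(crsd|ocssd|evmd|oclsomon)\.bin$/) { print $2, $i; break }
    }'
}

# Typical use (not run here):
#   ps -ef | crs_daemons
```

Empty output means none of these daemon binaries is running, even if an `/etc/init.crsd run` shell wrapper is still listed by `ps`.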
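I stopped CRS on node 1, but the CRS processes were still there. Note that ever since the HBA was replaced a few days earlier, stopping CRS manually with crsctl stop crs no longer seemed to work on node 1.
Even after rebooting node 1 I still could not change the disks or stop CRS, so as root I simply disabled CRS autostart and then rebooted both machines.
As the root user on all nodes: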
cd /etc/
# ./init.crs disable crs
After the reboot there were no CRS daemon processes at all, and this time changing the disk attributes on node 1 worked.
# ps -ef |grep crs
root 102694 1 0 08:54:41 - 0:00 /bin/sh /etc/init.crsd run
root 151958 180262 0 08:59:54 pts/0 0:00 grep crs
# chdev -l hdisk2 -a reserve_policy=no_reserve
hdisk2 changed
# chdev -l hdisk6 -a reserve_policy=no_reserve
hdisk6 changed
#
Now start CRS on node 2:
# ./crsctl start crs
Check the CRS alert log:
[crsd(201070)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.215
[crsd(164818)]CRS-1006:The OCR location /dev/rhdisk2 is inaccessible. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.476
[crsd(164818)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870336 to 169870336. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-02 22:32:28.477
[crsd(164818)]CRS-1012:The OCR service started on node jxsmdb2.
2014-10-02 22:32:28.751
[crsd(164818)]CRS-1201:CRSD started on node jxsmdb2.
[cssd(70408)]CRS-1603:CSSD on node jxsmdb2 shutdown by user.
2014-10-03 09:05:23.615
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk4. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
2014-10-03 09:05:23.815
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk3. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
2014-10-03 09:05:23.815
[cssd(107362)]CRS-1605:CSSD voting file is online: /dev/rhdisk5. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/cssd/ocssd.log.
[cssd(107362)]CRS-1601:CSSD Reconfiguration complete. Active nodes are jxsmdb2 .
2014-10-03 09:08:44.541
[evmd(99266)]CRS-1401:EVMD started on node jxsmdb2.
2014-10-03 09:08:44.585
[crsd(103266)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870336 to 169870336. Details in /oracle/product/10.2.0/crs/log/jxsmdb2/crsd/crsd.log.
2014-10-03 09:08:44.586
[crsd(103266)]CRS-1012:The OCR service started on node jxsmdb2.
2014-10-03 09:08:46.874
[crsd(103266)]CRS-1201:CRSD started on node jxsmdb2.
2014-10-03 09:08:47.163
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:08:47.183
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:43.287
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:43.297
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
2014-10-03 09:09:45.746
[crsd(103266)]CRS-1205:Auto-start failed for the CRS resource . Details in jxsmdb2.
Check crsd.log:
2014-10-03 09:05:19.356: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:19.357: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:20.702: [ COMMCRS][261]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:20.702: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:20.702: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:22.041: [ COMMCRS][263]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:22.041: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:22.041: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:05:23.380: [ COMMCRS][265]clsc_connect: (1106704d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
2014-10-03 09:05:23.380: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2014-10-03 09:05:23.380: [ CRSRTI][1]32CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2014-10-03 09:08:44.482: [ CLSVER][1]32Active Version from OCR:10.2.0.4.0
2014-10-03 09:08:44.482: [ CLSVER][1]32Active Version and Software Version are same
2014-10-03 09:08:44.485: [ CRSMAIN][1]32Initializing OCR
2014-10-03 09:08:44.491: [ OCRRAW][1]proprioo: for disk 0 (/dev/rhdisk2), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.491: [ OCRRAW][1]proprioo: for disk 1 (/dev/rhdisk6), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.574: [ OCRMAS][3352]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 2
2014-10-03 09:08:44.575: [ OCRRAW][3352]proprioo: for disk 0 (/dev/rhdisk2), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.575: [ OCRRAW][3352]proprioo: for disk 1 (/dev/rhdisk6), id match (1), my id set (1551842756,1866535888) total id sets (1), 1st set (1551842756,1866535888), 2nd set (0,0) my votes (1), total votes (2)
2014-10-03 09:08:44.596: [ OCRMAS][3352]th_master: Deleted ver keys from cache (master)
2014-10-03 09:08:44.596: [ CRSD][1]32ENV Logging level for Module: allcomp 0
2014-10-03 09:08:44.597: [ CRSD][1]32ENV Logging level for Module: default 0
2014-10-03 09:08:44.598: [ CRSD][1]32ENV Logging level for Module: COMMCRS 0
2014-10-03 09:08:44.598: [ CRSD][1]32ENV Logging level for Module: COMMNS 0
2014-10-03 09:08:44.599: [ CRSD][1]32ENV Logging level for Module: CRSUI 0
2014-10-03 09:08:44.600: [ CRSD][1]32ENV Logging level for Module: CRSCOMM 0
2014-10-03 09:08:44.600: [ CRSD][1]32ENV Logging level for Module: CRSRTI 0
2014-10-03 09:08:44.601: [ CRSD][1]32ENV Logging level for Module: CRSMAIN 0
2014-10-03 09:08:44.602: [ CRSD][1]32ENV Logging level for Module: CRSPLACE 0
2014-10-03 09:08:44.603: [ CRSD][1]32ENV Logging level for Module: CRSAPP 0
2014-10-03 09:08:44.603: [ CRSD][1]32ENV Logging level for Module: CRSRES 0
2014-10-03 09:08:44.604: [ CRSD][1]32ENV Logging level for Module: CRSOCR 0
2014-10-03 09:08:44.605: [ CRSD][1]32ENV Logging level for Module: CRSTIMER 0
2014-10-03 09:08:44.605: [ CRSD][1]32ENV Logging level for Module: CRSEVT 0
2014-10-03 09:08:44.606: [ CRSD][1]32ENV Logging level for Module: CRSD 0
2014-10-03 09:08:44.607: [ CRSD][1]32ENV Logging level for Module: CLUCLS 0
2014-10-03 09:08:44.607: [ CRSD][1]32ENV Logging level for Module: CLSVER 0
2014-10-03 09:08:44.608: [ CRSD][1]32ENV Logging level for Module: OCRRAW 0
2014-10-03 09:08:44.609: [ CRSD][1]32ENV Logging level for Module: OCROSD 0
2014-10-03 09:08:44.609: [ CRSD][1]32ENV Logging level for Module: CSSCLNT 0
2014-10-03 09:08:44.610: [ CRSD][1]32ENV Logging level for Module: OCRAPI 0
2014-10-03 09:08:44.611: [ CRSD][1]32ENV Logging level for Module: OCRUTL 0
2014-10-03 09:08:44.612: [ CRSD][1]32ENV Logging level for Module: OCRMSG 0
2014-10-03 09:08:44.612: [ CRSD][1]32ENV Logging level for Module: OCRCLI 0
2014-10-03 09:08:44.613: [ CRSD][1]32ENV Logging level for Module: OCRCAC 0
2014-10-03 09:08:44.614: [ CRSD][1]32ENV Logging level for Module: OCRSRV 0
2014-10-03 09:08:44.614: [ CRSD][1]32ENV Logging level for Module: OCRMAS 0
2014-10-03 09:08:44.615: [ CRSMAIN][1]32Filename is /oracle/product/10.2.0/crs/crs/init/jxsmdb2.pid
2014-10-03 09:08:44.651: [ CRSMAIN][1]32Using Authorizer location: /oracle/product/10.2.0/crs/crs/auth/
[ clsdmt][8235]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=jxsmdb2DBG_CRSD))
2014-10-03 09:08:44.667: [ CRSMAIN][1]32Initializing RTI
2014-10-03 09:08:44.719: [CRSTIMER][8749]32Timer Thread Starting.
2014-10-03 09:08:44.740: [ CRSRES][1]32Parameter SECURITY = 1, running in USER Mode
2014-10-03 09:08:44.743: [ CRSMAIN][1]32Initializing EVMMgr
2014-10-03 09:08:44.942: [ COMMCRS][9006]clsc_connect: (1139c41d0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2014-10-03 09:08:46.745: [ CRSMAIN][1]32CRSD locked during state recovery, please wait.
2014-10-03 09:08:46.824: [ CRSMAIN][1]32CRSD recovered, unlocked.
2014-10-03 09:08:46.847: [ CRSMAIN][1]32QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2014-10-03 09:08:46.847: [ CRSMAIN][1]32QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2014-10-03 09:08:46.855: [ CRSMAIN][1]32CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2014-10-03 09:08:46.873: [ CRSMAIN][1]32E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=jxsmdb2_priv)(PORT=49896))
2014-10-03 09:08:46.873: [ CRSMAIN][1]32Starting Threads
2014-10-03 09:08:46.874: [ CRSMAIN][10292]32Starting runCommandServer for (UI = 1, E2E = 0). 0
2014-10-03 09:08:46.874: [ CRSMAIN][10549]32Starting runCommandServer for (UI = 1, E2E = 0). 1
2014-10-03 09:08:46.874: [ CRSMAIN][1]32CRS Daemon Started.
2014-10-03 09:08:46.888: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.901: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.911: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.925: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.934: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.942: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.950: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.958: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.966: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.974: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.983: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.991: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:46.999: [ CRSRES][1]32 startup = 1
2014-10-03 09:08:47.173: [ CRSRES][11834]32startRunnable: setting CLI values
2014-10-03 09:08:47.188: [ CRSRES][11834]32Attempting to start `ora.jxsmdb2.vip` on member `jxsmdb2`
2014-10-03 09:08:47.189: [ CRSRES][11577]32startRunnable: setting CLI values
2014-10-03 09:08:47.199: [ CRSRES][11577]32Attempting to start `ora.jxsmdb2.ASM2.asm` on member `jxsmdb2`
2014-10-03 09:08:49.742: [ CRSRES][11834]32Start of `ora.jxsmdb2.vip` on member `jxsmdb2` succeeded.
2014-10-03 09:08:49.775: [ CRSRES][11834]32startRunnable: setting CLI values
2014-10-03 09:08:49.783: [ CRSRES][11834]32Attempting to start `ora.jxsmdb2.LISTENER_JXSMDB2.lsnr` on member `jxsmdb2`
2014-10-03 09:08:53.948: [ CRSRES][11834]32Start of `ora.jxsmdb2.LISTENER_JXSMDB2.lsnr` on member `jxsmdb2` succeeded.
2014-10-03 09:08:54.410: [ CRSRES][12619]32CRS-1002: Resource 'ora.jxsmdb2.LISTENER_JXSMDB2.lsnr' is already running on member 'jxsmdb2'
2014-10-03 09:09:08.992: [ CRSRES][12625]32startRunnable: setting CLI values
2014-10-03 09:09:08.999: [ CRSRES][12625]32Attempting to start `ora.jxsmdb2.ons` on member `jxsmdb2`
2014-10-03 09:09:11.139: [ CRSRES][12625]32Start of `ora.jxsmdb2.ons` on member `jxsmdb2` succeeded.
2014-10-03 09:09:11.216: [ CRSRES][11577]32Start of `ora.jxsmdb2.ASM2.asm` on member `jxsmdb2` succeeded.
2014-10-03 09:09:11.239: [ CRSRES][11577]32startRunnable: setting CLI values
2014-10-03 09:09:11.244: [ CRSRES][11577]32Attempting to start `ora.jxsmk.jxsmk2.inst` on member `jxsmdb2`
2014-10-03 09:09:43.269: [ CRSRES][11577]32Start of `ora.jxsmk.jxsmk2.inst` on member `jxsmdb2` succeeded.
2014-10-03 09:09:43.277: [ CRSRES][12894]32Skip online resource: ora.jxsmdb2.ons
2014-10-03 09:09:43.319: [ CRSRES][13151]32startRunnable: setting CLI values
2014-10-03 09:09:43.345: [ CRSRES][12637]32startRunnable: setting CLI values
2014-10-03 09:09:43.349: [ CRSRES][13151]32Attempting to start `ora.jxsmk.db` on member `jxsmdb2`
2014-10-03 09:09:43.358: [ CRSRES][11610]32startRunnable: setting CLI values
2014-10-03 09:09:43.365: [ CRSRES][12637]32Attempting to start `ora.jxsmdb2.gsd` on member `jxsmdb2`
2014-10-03 09:09:43.371: [ CRSRES][11610]32Attempting to start `ora.jxsmdb1.vip` on member `jxsmdb2`
2014-10-03 09:09:43.916: [ CRSRES][13151]32Start of `ora.jxsmk.db` on member `jxsmdb2` succeeded.
2014-10-03 09:09:44.378: [ CRSRES][12637]32Start of `ora.jxsmdb2.gsd` on member `jxsmdb2` succeeded.
2014-10-03 09:09:44.416: [ CRSRES][13668]32CRS-1002: Resource 'ora.jxsmk.db' is already running on member 'jxsmdb2'
2014-10-03 09:09:45.730: [ CRSRES][11610]32Start of `ora.jxsmdb1.vip` on member `jxsmdb2` succeeded.
Check ocssd.log:
jxsmdb2->cd cssd
jxsmdb2->tail -f ocssd.log
[ CSSD]2014-10-03 09:05:23.603 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2014-10-03 09:05:23.692 [2829] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=jxsmdb2_priv)(PORT=49895))
[ CSSD]2014-10-03 09:05:23.699 [2829] >TRACE: clssnmconnect: connecting to node(1), con(1112d8b10), flags 0x0003
[ CSSD]2014-10-03 09:05:23.700 [2829] >TRACE: clssnmDiscHelper: jxsmdb1, node(1) connection failed, con (1112d8b10), probe(0)
[ CSSD]2014-10-03 09:05:23.741 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_2))
[ CSSD]2014-10-03 09:05:23.741 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_jxsmdb2_crs))
[ CSSD]2014-10-03 09:05:23.752 [3857] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=25)(HOST=191.191.191.101)(PORT=32823))
[ CSSD]2014-10-03 09:05:23.804 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(3) wrtcnt(78639) LATS(190241056) Disk lastSeqNo(78639)
[ CSSD]2014-10-03 09:05:30.781 [4628] >TRACE: clssnmRcfgMgrThread: Local Join
[ CSSD]2014-10-03 09:08:44.082 [4628] >WARNING: clssnmLocalJoinEvent: takeover succ
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: Initiating sync 1
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (27000)ms
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSendSync: syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmHandleSync: diskTimeout set to (27000)ms
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[jxsmdb2] seq[1] sync[1]
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[ CSSD]2014-10-03 09:08:44.082 [2829] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitForAcks: done, msg type(13)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmCheckDskInfo: diskTimeout set to (200000)ms
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmEvict: Start
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2014-10-03 09:08:44.082 [4628] >TRACE: clssnmSendUpdate: syncSeqNo(1)
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmWaitForAcks: Ack message type(15), ackCount(1)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 1, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmUpdateNodeState: node 2, state (2/3) unique (1412298321/1412298321) prevConuni(0) birth (1/1) (old/new)
[ CSSD]2014-10-03 09:08:44.083 [2829] >USER: clssnmHandleUpdate: SYNC(1) from node(2) completed
[ CSSD]2014-10-03 09:08:44.083 [2829] >USER: clssnmHandleUpdate: NODE 2 (jxsmdb2) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2014-10-03 09:08:44.083 [2829] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmWaitForAcks: done, msg type(15)
[ CSSD]2014-10-03 09:08:44.083 [4628] >TRACE: clssnmDoSyncUpdate: Sync 1 complete!
[ CSSD]2014-10-03 09:08:44.101 [1] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmReconfigThread: started for reconfig (1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >USER: NMEVENT_RECONFIG [00][00][00][04]
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmEstablishConnections: 1 nodes in cluster incarn 1
[ CSSD]2014-10-03 09:08:44.105 [3857] >TRACE: clssgmPeerListener: connects done (1/1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmEstablishMasterNode: MASTER for 1 is node(2) birth(1)
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 1 with 1 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 2
[ CSSD]2014-10-03 09:08:44.105 [4885] >TRACE: clssgmReconfigThread: completed for reconfig(1), with status(1)
[ CSSD]2014-10-03 09:08:44.266 [3086] >TRACE: clssgmCommonAddMember: clsomon joined (2/0x1000000/#CSS_CLSSOMON
Check the CRS resources:
crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE OFFLINE
ora....B1.lsnr application ONLINE OFFLINE
ora....db1.gsd application ONLINE OFFLINE
ora....db1.ons application ONLINE OFFLINE
ora....db1.vip application ONLINE ONLINE jxsmdb2
ora....SM2.asm application ONLINE ONLINE jxsmdb2
ora....B2.lsnr application ONLINE ONLINE jxsmdb2
ora....db2.gsd application ONLINE ONLINE jxsmdb2
ora....db2.ons application ONLINE ONLINE jxsmdb2
ora....db2.vip application ONLINE ONLINE jxsmdb2
ora.jxsmk.db application ONLINE ONLINE jxsmdb2
ora....k1.inst application ONLINE OFFLINE
ora....k2.inst application ONLINE ONLINE jxsmdb2
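Resources that should be online but are not can be picked out of the `crs_stat -t` table automatically. A small sketch, assuming the five-column layout shown above; `offline_resources` is a hypothetical helper, and the names it prints are the abbreviated ones that `crs_stat -t` displays.

```shell
#!/bin/sh
# From `crs_stat -t` output on stdin, print the resources whose Target is
# ONLINE but whose State is OFFLINE.
# offline_resources is a hypothetical helper name.
offline_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Typical use (not run here):
#   crs_stat -t | offline_resources
```

Against the table above this would list the node 1 resources (ASM, listener, GSD, ONS, and the instance), which is expected while CRS on node 1 is still down.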
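The database finally came up on node 2.
Looking back, the bug described on MOS was most likely triggered here by the OCR being inaccessible. The patch recommended on MOS presumably applies when the disks, hardware, and operating system are all healthy.
Start CRS on node 1: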
->tail -f al*
[cssd(143604)]CRS-1605:CSSD voting file is online: /dev/rhdisk5. Details in /oracle/product/10.2.0/crs/log/jxsmdb1/cssd/ocssd.log.
[cssd(143604)]CRS-1601:CSSD Reconfiguration complete. Active nodes are jxsmdb1 jxsmdb2 .
2014-09-30 17:39:11.803
[crsd(139286)]CRS-1012:The OCR service started on node jxsmdb1.
2014-09-30 17:39:12.848
[evmd(151694)]CRS-1401:EVMD started on node jxsmdb1.
2014-09-30 17:39:15.807
[crsd(139286)]CRS-1201:CRSD started on node jxsmdb1.
2014-10-02 00:05:49.042
[crsd(159746)]CRS-1011:OCR cannot determine that the OCR content contains the latest updates. Details in /oracle/product/10.2.0/crs/log/jxsmdb1/crsd/crsd.log.
Terminated
You can see that CRS has come up.
Source: ITPUB blog, link: http://blog.itpub.net/26175573/viewspace-1290649/. If you repost this article, please credit the source; otherwise legal action may be pursued.