Incorrect storage device permissions prevent CRS from starting

Posted by aaqwsh on 2011-05-11

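Today I ran a stress test on a 3-node test cluster, and it caused one node to hang.

Another node also reported I/O errors.

After a reboot, CRS would not come up on the node (rac3):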

 

1 Check whether CRS is running

 

[root@rac3 log]# ps -ef | grep -i crs

root     28137     1  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.crsd run

root     31789 30006  0 11:29 pts/1    00:00:00 grep -i crs

[root@rac3 log]#

[root@rac3 log]#

[root@rac3 log]# crsctl check crs

Failure 1 contacting CSS daemon

Cannot communicate with CRS

Cannot communicate with EVM

[root@rac3 log]#

[root@rac3 log]# ps -ef |egrep "crsd.bin|ocssd.bin|evmd.bin|oprocd"

root     31938 30006  0 11:30 pts/1    00:00:00 egrep crsd.bin|ocssd.bin|evmd.bin|oprocd

[root@rac3 log]#

[root@rac3 log]#
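The egrep above matches only its own command line, which is easy to misread as a hit. A minimal sketch of the same check that filters out grep's own entry and names the daemons that are missing; the function name `missing_crs_daemons` is made up for illustration, and the daemon list is an assumption taken from the egrep pattern used above:

```shell
# missing_crs_daemons: read `ps -ef` output on stdin and print the name of
# each expected Clusterware daemon that does not appear in it.
# (Hypothetical helper; the three daemon names come from the egrep above.)
missing_crs_daemons() {
    # Drop grep/egrep entries so the check never matches itself.
    ps_output=$(grep -v grep || true)
    for daemon in crsd.bin ocssd.bin evmd.bin; do
        case "$ps_output" in
            *"/$daemon"*) ;;            # found as the last path component
            *) echo "$daemon" ;;        # not running
        esac
    done
}

# Usage on a live node:
#   ps -ef | missing_crs_daemons
```

On the node in this state it would list all three daemons, since only the init.* wrapper scripts are running.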

 

 

2 Check that the OCR and voting disk devices are configured correctly

Check the OCR device configuration file, /etc/oracle/ocr.loc

 

[root@rac3 log]# crsctl query css votedisk

 0.     0    /dev/raw/raw3

 1.     0    /dev/raw/raw4

 2.     0    /dev/raw/raw5

 

located 3 votedisk(s).

[root@rac3 log]# ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          2

         Total space (kbytes)     :     106564

         Used space (kbytes)      :       5444

         Available space (kbytes) :     101120

         ID                       : 1393012097

         Device/File Name         : /dev/raw/raw1

                                    Device/File integrity check succeeded

         Device/File Name         : /dev/raw/raw2

                                    Device/File integrity check succeeded

 

         Cluster registry integrity check succeeded

 

[root@rac3 log]# crsctl stop crs

Stopping resources. This could take several minutes.

Error while stopping resources. Possible cause: CRSD is down.

[root@rac3 log]# crsctl start crs

Attempting to start CRS stack

The CRS stack will be started shortly

[root@rac3 log]#

[root@rac3 log]# ps -ef |grep d.bin

root       607 30006  0 11:39 pts/1    00:00:00 grep d.bin

[root@rac3 log]# ps -ef |grep d.bin

root       609 30006  0 11:39 pts/1    00:00:00 grep d.bin

[root@rac3 log]# ps -ef |grep d.bin

root       618 30006  0 11:39 pts/1    00:00:00 grep d.bin

[root@rac3 log]# ps -ef |grep d.bin

root       620 30006  0 11:39 pts/1    00:00:00 grep d.bin

[root@rac3 log]# ps -ef |grep d.bin

root       896 30006  0 11:41 pts/1    00:00:00 grep d.bin

 

 

[root@rac3 mapper]# cat /etc/oracle/ocr.loc

ocrconfig_loc=/dev/raw/raw1

ocrmirrorconfig_loc=/dev/raw/raw2

local_only=FALSE
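Note that ocrcheck was run as root here, so it succeeds whether or not other users can open the devices. A small sketch that pulls the device paths out of ocr.loc so their ownership can be listed directly; `ocr_devices` is a hypothetical helper name, and the two key names are the ones shown in the file above:

```shell
# ocr_devices: print the OCR and OCR mirror device paths from an ocr.loc file.
# $1 is the file to read; it defaults to /etc/oracle/ocr.loc.
# (Hypothetical helper, not part of the Oracle tooling.)
ocr_devices() {
    sed -n \
        -e 's/^ocrconfig_loc=//p' \
        -e 's/^ocrmirrorconfig_loc=//p' \
        "${1:-/etc/oracle/ocr.loc}"
}

# Usage: list owner, group and mode of each OCR device
#   ocr_devices | xargs ls -l
```

Had I listed the devices at this point, the root:disk ownership found in step 5 would have shown up right away.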

 

3 Check crsd.log: nothing has been written since the reboot (the last entry is from 10:38, before the hang):

[oracle@rac3 rac3]$ cd crsd/

[oracle@rac3 crsd]$ ls -al

total 76

drwxr-x---  2 root oinstall  4096 May  3 19:03 .

drwxr-xr-t  8 root oinstall  4096 May  3 19:03 ..

-rw-r--r--  1 root root     53171 May 11 10:38 crsd.log

[oracle@rac3 crsd]$ tail -n 50 crsd.log

2011-05-11 10:15:09.400: [  CRSRES][1484962144]0startRunnable: setting CLI values

2011-05-11 10:16:29.767: [  OCRUTL][1252067680]u_freem: mem passed is null

2011-05-11 10:32:25.140: [  CRSEVT][1487063392]0CAAMonitorHandler :: 0:Could not join /home/oracle/oracle/product/10.2.0/crs/bin/racgwrap(check)

category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

 

2011-05-11 10:32:25.140: [  CRSEVT][1487063392]0CAAMonitorHandler :: 0:Action Script. /home/oracle/oracle/product/10.2.0/crs/bin/racgwrap(check) timed out for ora.rac3.vip!

(timeout=60)

2011-05-11 10:32:25.140: [  CRSAPP][1487063392]0CheckResource error for ora.rac3.vip error code = -2

2011-05-11 10:38:14.731: [  CRSEVT][1499654496]0CAAMonitorHandler :: 0:Could not join /home/oracle/oracle/product/10.2.0/crs/bin/racgwrap(check)

category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

 

2011-05-11 10:38:25.335: [  CRSEVT][1499654496]0CAAMonitorHandler :: 0:Action Script. /home/oracle/oracle/product/10.2.0/crs/bin/racgwrap(check) timed out for ora.rac3.vip!

(timeout=60)

2011-05-11 10:38:25.335: [  CRSAPP][1499654496]0CheckResource error for ora.rac3.vip error code = -2

2011-05-11 10:38:33.140: [  CRSRES][1503856992]0In stateChanged, ora.rac.rac3.inst target is ONLINE

2011-05-11 10:38:33.141: [  CRSRES][1503856992]0ora.rac.rac3.inst on rac3 went OFFLINE unexpectedly

2011-05-11 10:38:33.141: [  CRSRES][1503856992]0StopResource: setting CLI values

2011-05-11 10:38:38.143: [  CRSRES][1503856992]0Attempting to stop `ora.rac.rac3.inst` on member `rac3`
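To confirm that the log really stops before the reboot, the last timestamp can be extracted and compared with the start time of the init scripts (11:17 in the ps output). A minimal sketch, assuming the `YYYY-MM-DD HH:MM:SS.mmm:` line prefix seen above; `last_log_time` is a made-up name:

```shell
# last_log_time: print the timestamp (to the second) of the last
# timestamped line in the log file given as $1.
# (Hypothetical helper; assumes the crsd.log line format shown above.)
last_log_time() {
    sed -n 's/^\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}\)\..*/\1/p' "$1" |
        tail -n 1
}

# Usage: timestamps in this format sort as plain strings, so a string
# comparison against the boot time is enough.
#   [ "$(last_log_time crsd.log)" \< "2011-05-11 11:17:00" ] && echo "no entries since reboot"
```

An empty log after the reboot means crsd.bin never got far enough to write anything, so the cause has to be looked for earlier in the startup chain.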

 

4 The process list shows several /etc/init.d/init.cssd startcheck processes, so check /tmp

[root@rac3 log]# ps -ef

root     28128     1  0 11:17 ?        00:00:00 /usr/bin/gdm-binary -nodaemon

root     28132     1  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.evmd run

root     28135     1  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.cssd fatal

root     28137     1  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.crsd run

root     28678 28132  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck

root     28982 28135  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck

root     29026 28128  0 11:17 ?        00:00:00 /usr/bin/gdm-binary -nodaemon

root     29068 29026  0 11:17 ?        00:00:05 /usr/X11R6/bin/X :0 -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7

root     29201 28137  0 11:17 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck

 

[root@rac3 mapper]# cd /tmp

[root@rac3 tmp]# ls -altr

-rw-r--r--   1 oracle oinstall  148 May 11 12:34 crsctl.29201

-rw-r--r--   1 oracle oinstall  148 May 11 12:34 crsctl.28982

-rw-r--r--   1 oracle oinstall  148 May 11 12:34 crsctl.28678

[root@rac3 tmp]# cat crsctl.28678

OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [Permission denied] [13]

[root@rac3 tmp]#

[root@rac3 tmp]#
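The file suffixes (28678, 28982, 29201) are the PIDs of the three startcheck processes, and all three files are 148 bytes, so they most likely hold the same message. A sketch that prints each distinct message once; `startcheck_messages` is a hypothetical name, and the `crsctl.<pid>` naming is taken from the listing above:

```shell
# startcheck_messages: print the distinct lines found in the crsctl.<pid>
# files of a directory ($1, default /tmp).
# (Hypothetical helper; the file naming follows the listing above.)
startcheck_messages() {
    dir=${1:-/tmp}
    for f in "$dir"/crsctl.*; do
        # Skip the unexpanded pattern when no such file exists.
        if [ -f "$f" ]; then
            cat "$f"
        fi
    done | sort -u
}

# Usage:
#   startcheck_messages /tmp
```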

 

5 Judging from the error above, this is most likely a permissions problem:

[root@rac3 tmp]# cd /dev/raw

[root@rac3 raw]# ls -al

total 0

drwxr-xr-x   2 root root    180 May 11 11:17 .

drwxr-xr-x  11 root root   7620 May 11 11:17 ..

crw-rw----   1 root disk 162, 1 May 11 11:17 raw1

crw-rw----   1 root disk 162, 2 May 11 11:17 raw2

crw-rw----   1 root disk 162, 3 May 11 11:17 raw3

crw-rw----   1 root disk 162, 4 May 11 11:17 raw4

crw-rw----   1 root disk 162, 5 May 11 11:17 raw5

crw-rw----   1 root disk 162, 6 May 11 11:17 raw6

crw-rw----   1 root disk 162, 7 May 11 11:17 raw7

[root@rac3 raw]# chown -R oracle:dba /dev/raw

[root@rac3 raw]# chmod -R 777 /dev/raw

[root@rac3 raw]#
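The recursive chown and chmod touch all seven devices and the directory at once, so it helps to record the state before and after. A minimal sketch that prints mode, owner and group in one line per path; `perm_report` is a made-up name and it relies only on the usual `ls -ld` column layout:

```shell
# perm_report: print "<mode> <owner>:<group> <path>" for each argument,
# using the first, third and fourth columns of `ls -ld`.
# (Hypothetical helper for before/after comparison.)
perm_report() {
    for path in "$@"; do
        ls -ld "$path" | awk -v p="$path" '{ print $1, $3 ":" $4, p }'
    done
}

# Usage:
#   perm_report /dev/raw/raw* > /tmp/raw_perms.before
#   ... apply the chown/chmod ...
#   perm_report /dev/raw/raw* > /tmp/raw_perms.after
#   diff /tmp/raw_perms.before /tmp/raw_perms.after
```

The device nodes are stamped 11:17, the same minute the init scripts started, which suggests they are recreated at every boot. If that is the case (the check above does not prove it), a manual chown would be lost on the next reboot and the ownership would have to be set by whatever creates the nodes.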

 

6 Verify the permissions, then start CRS

[root@rac3 raw]# ls -al

total 0

drwxrwxrwx   2 oracle dba     180 May 11 11:17 .

drwxr-xr-x  11 root   root   7620 May 11 11:17 ..

crwxrwxrwx   1 oracle dba  162, 1 May 11 11:17 raw1

crwxrwxrwx   1 oracle dba  162, 2 May 11 11:17 raw2

crwxrwxrwx   1 oracle dba  162, 3 May 11 11:17 raw3

crwxrwxrwx   1 oracle dba  162, 4 May 11 11:17 raw4

crwxrwxrwx   1 oracle dba  162, 5 May 11 11:17 raw5

crwxrwxrwx   1 oracle dba  162, 6 May 11 11:17 raw6

crwxrwxrwx   1 oracle dba  162, 7 May 11 11:17 raw7

[root@rac3 raw]#

 

 

[root@rac3 raw]# crsctl stop crs

Stopping resources. This could take several minutes.

Successfully stopped CRS resources.

Stopping CSSD.

Shutting down CSS daemon.

Shutdown request successfully issued.

[root@rac3 raw]# crsctl start crs

Attempting to start CRS stack

The CRS stack will be started shortly

 

 

[root@rac3 raw]# crsctl check crs

CSS appears healthy

CRS appears healthy

EVM appears healthy

  

[root@rac3 raw]# crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora.rac.db     application    ONLINE    ONLINE    rac2       

ora....c1.inst application    ONLINE    ONLINE    rac1       

ora....c2.inst application    ONLINE    ONLINE    rac2       

ora....c3.inst application    ONLINE    ONLINE    rac3       

ora....SM1.asm application    ONLINE    ONLINE    rac1       

ora....C1.lsnr application    ONLINE    ONLINE    rac1       

ora.rac1.gsd   application    ONLINE    ONLINE    rac1       

ora.rac1.ons   application    ONLINE    ONLINE    rac1       

ora.rac1.vip   application    ONLINE    ONLINE    rac1       

ora....SM2.asm application    ONLINE    ONLINE    rac2       

ora....C2.lsnr application    ONLINE    ONLINE    rac2       

ora.rac2.gsd   application    ONLINE    ONLINE    rac2       

ora.rac2.ons   application    ONLINE    ONLINE    rac2       

ora.rac2.vip   application    ONLINE    ONLINE    rac2       

ora....SM3.asm application    ONLINE    ONLINE    rac3       

ora....C3.lsnr application    ONLINE    ONLINE    rac3       

ora.rac3.gsd   application    ONLINE    ONLINE    rac3       

ora.rac3.ons   application    ONLINE    ONLINE    rac3       

ora.rac3.vip   application    ONLINE    ONLINE    rac3       

[root@rac3 raw]# ps -ef |grep d.bin

oracle   11568 11563  0 12:38 ?        00:00:00 /home/oracle/oracle/product/10.2.0/crs/bin/evmd.bin

root     11773 10601  0 12:38 ?        00:00:02 /home/oracle/oracle/product/10.2.0/crs/bin/crsd.bin reboot

root     12310 11853  0 12:38 ?        00:00:00 /home/oracle/oracle/product/10.2.0/crs/bin/oprocd.bin run -t 1000 -m 10000 -hsi 5:10:50:75:90 -f

oracle   12454 11890  0 12:38 ?        00:00:01 /home/oracle/oracle/product/10.2.0/crs/bin/ocssd.bin

root     22908 30006  0 13:03 pts/1    00:00:00 grep d.bin
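`crsctl start crs` returns immediately ("will be started shortly"), which is why I kept rerunning ps by hand in step 2. A sketch of a polling loop that waits until all three components report healthy; `wait_for_crs` is a hypothetical name, and it assumes the three "appears healthy" lines printed by `crsctl check crs` above:

```shell
# wait_for_crs: run a health check repeatedly until it prints three
# "appears healthy" lines, or give up.
#   $1  maximum number of attempts
#   $2  seconds to sleep between attempts
#   $3+ check command (defaults to: crsctl check crs)
# Returns 0 once healthy, 1 if the attempts run out.
# (Hypothetical helper; the output format is the one shown in step 6.)
wait_for_crs() {
    attempts=$1
    delay=$2
    shift 2
    [ $# -gt 0 ] || set -- crsctl check crs
    i=0
    while [ "$i" -lt "$attempts" ]; do
        out=$("$@" 2>&1 || true)
        healthy=$(printf '%s\n' "$out" | grep -c 'appears healthy' || true)
        if [ "$healthy" -eq 3 ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: check every 10 seconds for up to 5 minutes
#   crsctl start crs && wait_for_crs 30 10 && crs_stat -t
```

The check command is passed in as arguments so the loop can be tried out with a stand-in command on a machine without Clusterware.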

 

Source: ITPUB blog, http://blog.itpub.net/758322/viewspace-695008/. Please credit the source when reposting; otherwise legal action may be pursued.
