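Troubleshooting a problem with the ocssd process:
Yesterday a database user reported that on the database server, /var/log/messages was being written to every 5 seconds, with entries like these: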
Feb 27 08:11:44 bj su(pam_unix)[7692]: session opened for user oracle by (uid=0)
Feb 27 08:11:44 bj su(pam_unix)[7692]: session closed for user oracle
Feb 27 08:11:44 bj logger: Failure in CSS initialization opening OCR.
Feb 27 08:11:49 bj su(pam_unix)[7731]: session opened for user oracle by (uid=0)
Feb 27 08:11:49 bj su(pam_unix)[7731]: session closed for user oracle
Feb 27 08:11:49 bj logger: Failure in CSS initialization opening OCR.
I checked the same file on another 10g database server:
Feb 28 10:02:27 bj sshd(pam_unix)[5985]: session opened for user lisa by (uid=502)
Feb 28 10:06:40 bj sshd(pam_unix)[5985]: session closed for user lisa
Feb 28 15:31:17 bj sshd(pam_unix)[6115]: session opened for user lisa by (uid=502)
Feb 28 15:32:09 bj sshd(pam_unix)[6115]: session closed for user lisa
Mar 1 10:19:54 bj sshd(pam_unix)[15042]: session opened for user lisa by (uid=502)
Mar 1 10:54:29 bj su(pam_unix)[15086]: session opened for user root by lisa(uid=502)
Mar 1 10:54:33 bj su(pam_unix)[15119]: session opened for user oracle by lisa(uid=0)
Mar 1 12:12:30 bj su(pam_unix)[15189]: session opened for user root by lisa(uid=501)
These entries record user logins and su activity. The number in square brackets is the process ID, and the uid value in parentheses is the user ID.
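To make a repeating entry like the one on the problem server stand out, the `logger:` lines can be tallied by message text. The following is only a minimal sketch: the function name `count_logger_messages` is hypothetical, and it assumes the usual syslog layout (month, day, time, host, program tag, message) seen in the excerpts above.

```shell
# Count how often each distinct "logger:" message appears in a syslog file.
# Usage: count_logger_messages /var/log/messages
count_logger_messages() {
    # Fields 1-4 are month, day, time and host; field 5 is the program tag.
    awk '$5 == "logger:" {
             msg = $0
             sub(/^.* logger: /, "", msg)   # keep only the message text
             count[msg]++
         }
         END { for (m in count) printf "%d\t%s\n", count[m], m }' "$1"
}
```

A message with a very high count over a short time span is a candidate for the kind of loop described here. The output order of different messages is not defined.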
I checked the bdump and udump directories and the alert.log file on the problem server and found no abnormal entries.
Checking the system processes:
[oracle@db1 udump]$ ps -ef | grep css
root 5716 1 0 Jan11 ? 00:00:00 /bin/sh /etc/init.d/init.cssd run
root 5721 5716 0 Jan11 ? 00:13:17 /bin/sh /etc/init.d/init.cssd startcheck
oracle 17210 5844 0 14:02 pts/2 00:00:00 grep css
System processes on the healthy database server:
[root@bj log]# ps -ef | grep css
root 4669 1 0 2004 ? 00:00:00 /bin/su oracle -c exec /home/oracle/product/10.1.0/db_1/bin/ocssd
oracle 4771 4669 0 2004 ? 00:25:53 /home/oracle/product/10.1.0/db_1/bin/ocssd.bin
root 15278 15225 0 14:05 pts/0 00:00:00 grep css
I then looked at /etc/init.d/init.cssd but got nothing out of it; the script is long and I did not read it closely.
The Oracle documentation says this about CSS:
Oracle Cluster Synchronization Services (CSS) is a daemon process that is configured by the root.sh script when you install Oracle Database 10g for the first time. It is configured to start every time the system boots. This daemon process is required to enable synchronization between Oracle ASM and database instances. It must be running if an Oracle database is using ASM for database file storage.
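In short, CSS is a background daemon that is installed by default, starts automatically at boot, and keeps ASM and database instances synchronized; it is required if ASM is used.
That was half a relief: this database does not use ASM, so if all else failed I could simply stop the daemon.
I then read the "Reconfiguring Oracle Cluster Synchronization Services" part of the Oracle documentation, excerpted below:
1. Identifying Oracle Database 10g Oracle Homes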
To identify all of the Oracle Database 10g Oracle home directories, enter one of the following commands:
$ more /etc/oratab
This is the result on my server:
[root@bj log]# more /etc/oratab
#
# This file is used by ORACLE utilities. It is created by root.sh
# and updated by the Database Configuration Assistant when creating
# a database.
# A colon, ':', is used as the field terminator. A new line terminates
# the entry. Lines beginning with a pound sign, '#', are comments.
#
# Entries are of the form:
# $ORACLE_SID:$ORACLE_HOME::
#
# The first and second fields are the system identifier and home
# directory of the database respectively. The third filed indicates
# to the dbstart utility that the database should , "Y", or should not,
# "N", be brought up at system boot time.
#
# Multiple entries with the same $ORACLE_SID are not allowed.
#
#
# *:/home/oracle/product/10.1.0/db_1:N
$ORACLE_SID:/home/oracle/product/10.1.0/db_1:N
*:/home/oracle/product/10.1.0/db_1:N
$ORACLE_SID:/home/oracle/product/10.1.0/db_1:N
From the output, identify any Oracle home directories where Oracle Database 10g is installed. Oracle homes that contain Oracle Database 10g typically have paths similar to the following. However, they might use different paths.
/mount_point/app/oracle/product/10.1.0/db_n
If there is only one Oracle home directory that contains Oracle Database 10g, see the "Deleting the Oracle CSS Daemon Configuration" section for information about deleting the Oracle CSS daemon configuration.
If you identify more than one Oracle Database 10g Oracle home directory, see the following section for information about reconfiguring the Oracle CSS daemon.
2. Reconfiguring the Oracle CSS Daemon
To reconfigure the Oracle CSS daemon so that it runs from an Oracle home that you are not removing, follow these steps:
In all Oracle home directories on the system, stop all Oracle ASM instances and any Oracle Database instances that use ASM for database file storage.
Switch user to root.
Depending on your operating system, enter one of the following commands to identify the Oracle home directory being used to run the CSS daemon:
# more /etc/oracle/ocr.loc
The output from this command is similar to the following:
ocrconfig_loc=/u01/app/oracle/product/10.1.0/db_1/cdata/localhost/local.ocr
local_only=TRUE
This is the result on my server:
[root@bj log]# more /etc/oracle/ocr.loc
ocrconfig_loc=/home/oracle/product/10.1.0/db_1/cdata/localhost/local.ocr
local_only=TRUE
The ocrconfig_loc parameter specifies the location of the Oracle Cluster Registry (OCR) used by the CSS daemon. The path up to the cdata directory is the Oracle home directory where the CSS daemon is running (/u01/app/oracle/product/10.1.0/db_1 in this example).
Note:
If the value for the local_only parameter is FALSE, Oracle CRS is installed on this system. See the Oracle Real Application Clusters Installation and Configuration Guide for information about removing RAC or CRS.
If this Oracle home directory is not the Oracle home that you want to remove, you can continue to the "Removing Oracle Software" section.
Change directory to the Oracle home directory for an Oracle Database 10g installation that you are not removing.
Set the ORACLE_HOME environment variable to specify the path to this Oracle home directory:
Bourne, Bash, or Korn shell:
# ORACLE_HOME=/u01/app/oracle/product/10.1.0/db_2;
# export ORACLE_HOME
C shell:
# setenv ORACLE_HOME /u01/app/oracle/product/10.1.0/db_2
Enter the following command to reconfigure the CSS daemon to run from this Oracle home:
# $ORACLE_HOME/bin/localconfig reset $ORACLE_HOME
The script stops the Oracle CSS daemon, reconfigures it in the new Oracle home, and then restarts it. When the system boots, the CSS daemon starts automatically from the new Oracle home.
To remove the original Oracle home directory, see the "Removing Oracle Software" section.
3. Deleting the Oracle CSS Daemon Configuration
To delete the Oracle CSS daemon configuration, follow these steps:
Note:
Delete the CSS daemon configuration only if you are certain that no other Oracle Database 10g installation requires it.
Remove any databases or ASM instances associated with this Oracle home. See the preceding sections for information about how to complete these tasks.
Switch user to root.
Change directory to the Oracle home directory that you are removing.
Set the ORACLE_HOME environment variable to specify the path to this Oracle home directory:
Bourne, Bash, or Korn shell:
# ORACLE_HOME=/u01/app/oracle/product/10.1.0/db_1;
# export ORACLE_HOME
C shell:
# setenv ORACLE_HOME /u01/app/oracle/product/10.1.0/db_1
Enter the following command to delete the CSS daemon configuration from this Oracle home:
# $ORACLE_HOME/bin/localconfig delete
The script stops the Oracle CSS daemon, then deletes its configuration. When the system boots, the CSS daemon no longer starts.
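So I could try reconfiguring or deleting the CSS daemon configuration. Both operations must be run as root, however, and I did not have the root password for the failing server, nor was I confident enough to try.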
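The ocr.loc check from the excerpt (the Oracle home is the part of ocrconfig_loc before the cdata directory, and local_only shows whether CRS is involved) could be scripted roughly as follows. This is a sketch only: the function names `css_home_from_ocrloc` and `ocr_local_only` are hypothetical, and the file is assumed to use the two-line `key=value` layout shown above.

```shell
# Print the Oracle home implied by ocrconfig_loc, i.e. the path up to /cdata/.
# Usage: css_home_from_ocrloc /etc/oracle/ocr.loc
css_home_from_ocrloc() {
    sed -n 's|^ocrconfig_loc=\(.*\)/cdata/.*$|\1|p' "$1"
}

# Print the value of local_only. Per the excerpt, FALSE indicates that
# Oracle CRS is installed on the system.
# Usage: ocr_local_only /etc/oracle/ocr.loc
ocr_local_only() {
    sed -n 's|^local_only=||p' "$1"
}
```

Reading the file does not require root, so this part of the diagnosis can be done even without the root password; only localconfig itself needs root.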
So I started checking my other two servers with 10g installed.
First server:
[lisa@localhost lisa]$ ps -ef | grep css
lisa 3336 3294 0 14:39 pts/0 00:00:00 grep css
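No CSS process at all, heh.
Someone online suggested that removing the last line of this file gets rid of the ocssd.bin process, though this is not recommended: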
[lisa@localhost lisa]$ cat /etc/inittab
......
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1
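I checked /etc/oracle/ocr.loc and /etc/oratab and found nothing wrong; both matched the configuration on the healthy server.
The log file shows an attempt to run crsstart every 5 minutes, which I take to be an attempt to start the ocssd process: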
[root@localhost log]# tail messages
Mar 1 14:39:39 localhost logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 14:39:39 localhost logger: Could not access /etc/oracle/scls_scr/localhost.localdomain/root/crsstart.
Mar 1 14:39:39 localhost logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 14:39:39 localhost logger: Could not access /etc/oracle/scls_scr/localhost.localdomain/root/crsstart.
Mar 1 14:39:39 localhost logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 14:39:39 localhost logger: Could not access /etc/oracle/scls_scr/localhost.localdomain/root/crsstart.
Mar 1 14:39:39 localhost logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 14:39:39 localhost logger: Could not access /etc/oracle/scls_scr/localhost.localdomain/root/crsstart.
Mar 1 14:39:39 localhost init: Id "h1" respawning too fast: disabled for 5 minutes
Mar 1 14:41:48 localhost su(pam_unix)[3482]: session opened for user root by lisa(uid=502)
The /etc/oracle/scls_scr/ directory has no localhost.localdomain subdirectory.
Checking the environment variables:
[root@localhost scls_scr]# env
HOSTNAME=localhost.localdomain
The wrong HOSTNAME was probably the cause, so I set about changing it.
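The mismatch found here (no directory under /etc/oracle/scls_scr/ named after the current hostname) is easy to test for. A minimal sketch, assuming the directory name must match the hostname exactly as it appears in the log message; the function names are hypothetical:

```shell
# Return 0 if BASE contains a directory named exactly HOST.
# Usage: scls_dir_matches_host /etc/oracle/scls_scr "$(hostname)"
scls_dir_matches_host() {
    [ -d "$1/$2" ]
}

# Print a short report; on a mismatch, list what is actually there so the
# old hostname is easy to spot.
# Usage: report_scls_dirs [BASE] [HOST]
report_scls_dirs() {
    base=${1:-/etc/oracle/scls_scr}
    host=${2:-$(hostname)}
    if scls_dir_matches_host "$base" "$host"; then
        echo "OK: $base has a directory for $host"
    else
        echo "MISMATCH: no directory for $host under $base; found:"
        ls "$base" 2>/dev/null
    fi
}
```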
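I ran into some trouble changing the HOSTNAME, so at that point I was ready to give up and commented out the last line of /etc/inittab in an attempt to stop the process from starting.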
[lisa@localhost lisa]$ cat /etc/inittab
......
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
#h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1
After commenting out that line, however, the log writes did not stop as I had hoped; the problem persisted as before.
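One possible explanation, not confirmed in this write-up, is that SysV init only picks up edits to /etc/inittab once it is told to re-read the file (for example with `telinit q` as root) or at the next reboot. The sketch below, with the hypothetical function name `cssd_inittab_state`, just reports the state of the init.cssd entry in a given file:

```shell
# Report whether FILE has an init.cssd entry: prints active, commented or absent.
# Usage: cssd_inittab_state /etc/inittab
cssd_inittab_state() {
    if grep -q '^[^#]*init\.cssd' "$1"; then
        echo active          # an entry with no '#' before it
    elif grep -q '^[[:space:]]*#.*init\.cssd' "$1"; then
        echo commented       # the entry exists but is commented out
    else
        echo absent
    fi
}
# After editing /etc/inittab, SysV init re-reads it on "telinit q" (run as root).
```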
Running the file directly gives a permission error, whether as root or as oracle:
[root@localhost log]# /etc/oracle/scls_scr/*/root/crsstart
bash: /etc/oracle/scls_scr/****/root/crsstart: Permission denied
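In the end, with guidance from our network administrator, I managed to change the HOSTNAME (embarrassing, I know).
Checking the log file again: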
[root@bj34 log]# tail messages
Mar 1 15:23:01 localhost su(pam_unix)[8583]: session opened for user oracle by (uid=0)
Mar 1 15:23:01 localhost su(pam_unix)[8583]: session closed for user oracle
Mar 1 15:23:01 localhost logger: Failed 3 to bind listening endpoint: (ADDRESS=(PROTOCOL=tcp)(HOST=**))
Mar 1 15:23:05 localhost su(pam_unix)[8623]: session opened for user root by lisa(uid=502)
Mar 1 15:23:06 localhost su(pam_unix)[8655]: session opened for user oracle by (uid=0)
Mar 1 15:23:06 localhost su(pam_unix)[8655]: session closed for user oracle
Mar 1 15:23:06 localhost logger: Failed 3 to bind listening endpoint: (ADDRESS=(PROTOCOL=tcp)(HOST=**))
Mar 1 15:23:11 localhost su(pam_unix)[8695]: session opened for user oracle by (uid=0)
Mar 1 15:23:11 localhost su(pam_unix)[8695]: session closed for user oracle
Mar 1 15:23:11 localhost logger: Failed 3 to bind listening endpoint: (ADDRESS=(PROTOCOL=tcp)(HOST=**))
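There was still an error, but a different one, now written every 5 seconds. It looked like another hostname configuration problem.
Checking the system processes: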
[root@bj34 log]# ps -ef | grep css
root 4722 1 0 15:14 ? 00:00:00 /bin/su -l oracle -c exec /home/oracle/product/10.1.0/db_1/bin/ocssd
oracle 9834 4722 0 15:25 ? 00:00:00 /home/oracle/product/10.1.0/db_1/bin/ocssd.bin
root 9957 9921 0 15:27 pts/0 00:00:00 grep css
The processes, at least, now looked right.
This time I edited /etc/hosts, and that fixed it: nothing more was written to the log file.
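The exact change made to /etc/hosts is not recorded here. Assuming the missing piece was an entry for the new hostname, a check along these lines would show whether a name is present in the file; the function name `host_in_hosts_file` is hypothetical:

```shell
# Return 0 if NAME appears as a hostname or alias on a non-comment line of FILE.
# Usage: host_in_hosts_file "$(hostname)" /etc/hosts
host_in_hosts_file() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                                   # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

Field 1 of each line is the IP address, so only fields 2 and later are compared against the name.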
Second server:
System processes:
[root@bj72 root]# ps -ef | grep css
root 5498 1 0 15:50 ? 00:00:00 /bin/sh /etc/init.d/init.cssd run
root 5501 5498 0 15:50 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 5515 5465 0 15:51 pts/0 00:00:00 grep css
[root@bj72 lisa]# tail /var/log/messages
Mar 1 15:45:50 bj72 logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 15:45:50 bj72 logger: Could not access /etc/oracle/scls_scr/****/root/crsstart.
Mar 1 15:45:50 bj72 logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 15:45:50 bj72 logger: Could not access /etc/oracle/scls_scr/****/root/crsstart.
Mar 1 15:45:50 bj72 logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 15:45:50 bj72 logger: Could not access /etc/oracle/scls_scr/****/root/crsstart.
Mar 1 15:45:50 bj72 logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 15:45:50 bj72 logger: Could not access /etc/oracle/scls_scr/****/root/crsstart.
Mar 1 15:45:50 bj72 init: Id "h1" respawning too fast: disabled for 5 minutes
Mar 1 15:47:55 bj72 su(pam_unix)[5464]: session opened for user root by lisa(uid=502)
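This server also reported errors, but the situation was different.
There is no **** directory under /etc/oracle/scls_scr/, and the environment variables show that **** is the HOSTNAME. This server was moved to another machine room after the database had been installed and running for some time, and its IP and HOSTNAME were changed, which is presumably the cause.
I could not change the HOSTNAME this time, so I renamed the **** directory (the one left over under the old hostname) to the new HOSTNAME. Five minutes later: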
[root@bj72 root]# tail /var/log/messages
Mar 1 15:45:50 bj72 logger: Oracle Cluster Ready Services disabled by corrupt install
Mar 1 15:45:50 bj72 logger: Could not access /etc/oracle/scls_scr/****/root/crsstart.
Mar 1 15:45:50 bj72 init: Id "h1" respawning too fast: disabled for 5 minutes
Mar 1 15:47:55 bj72 su(pam_unix)[5464]: session opened for user root by lisa(uid=502)
Mar 1 15:50:51 bj72 su(pam_unix)[5505]: session opened for user oracle by (uid=0)
Mar 1 15:50:51 bj72 su(pam_unix)[5505]: session closed for user oracle
Mar 1 15:51:51 bj72 su(pam_unix)[5498]: session opened for user oracle by (uid=0)
Mar 1 15:51:52 bj72 su(pam_unix)[5498]: session closed for user oracle
Mar 1 15:51:52 bj72 su(pam_unix)[5532]: session opened for user oracle by (uid=0)
Mar 1 15:51:52 bj72 su(pam_unix)[5532]: session closed for user oracle
The problem changed but was still there. Nothing I tried worked, so I decided to stop the process.
First:
[root@bj72 etc]# cp inittab.no_cssd inittab
This seems to amount to simply deleting the last line, and the log writes continued.
Then I ran:
[root@bj72 bin]# $ORACLE_HOME/bin/localconfig reset $ORACLE_HOME
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Oracle Cluster Registry for cluster has been initialized
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Adding to inittab
/home/oracle/product/10.1.0/db_1/bin/localconfig: line 1: /bin/cp: No such file or directory
Checking the status of new Oracle init process...
Expecting the CRS daemons to be up within 600 seconds.
Giving up: Oracle CSS stack appears NOT to be running.
Oracle CSS service would not start as installed
Automatic Storage Management(ASM) cannot be used until Oracle CSS service is started
Checking the processes again, they were gone:
[root@bj72 bin]# ps -ef | grep css
root 6279 5465 0 16:30 pts/0 00:00:00 grep css
The log file shows the reconfiguration taking place, and nothing was written afterwards:
[root@bj72 bin]# tail /var/log/messages
Mar 1 16:17:19 bj72 su(pam_unix)[6125]: session opened for user oracle by (uid=0)
Mar 1 16:17:20 bj72 su(pam_unix)[6125]: session closed for user oracle
Mar 1 16:17:20 bj72 su(pam_unix)[6156]: session opened for user oracle by (uid=0)
Mar 1 16:17:20 bj72 su(pam_unix)[6156]: session closed for user oracle
Mar 1 16:18:20 bj72 su(pam_unix)[6149]: session opened for user oracle by (uid=0)
Mar 1 16:18:21 bj72 su(pam_unix)[6149]: session closed for user oracle
Mar 1 16:18:21 bj72 su(pam_unix)[6178]: session opened for user oracle by (uid=0)
Mar 1 16:18:21 bj72 su(pam_unix)[6178]: session closed for user oracle
Mar 1 16:19:09 bj72 lisa: (Oracle CSSD will be run out of init)
Mar 1 16:19:09 bj72 init: Re-reading inittab
Judging from these three servers, the user's problem was most likely caused by a HOSTNAME change, so I suggested the user run the following as root:
$ORACLE_HOME/bin/localconfig reset $ORACLE_HOME
The result:
[root@db1 oracle]# $ORACLE_HOME/bin/localconfig reset $ORACLE_HOME
nThe following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /home1/oracle/product/10.1.0/db_1
Failure at scls_scr_create with code 1
Internal Error Information:
Category: 1234
Operation: scls_scr_create
Location: mkdir
Other: Unable to make user dir
Dep: 2
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Oracle Cluster Registry for cluster has been initialized
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Usage: /etc/init.d/init.cssd {start|stop|run|fatal|startcheck|activatevg}
Adding to inittab
Checking the status of new Oracle init process...
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
db1
CSS is active on all nodes.
Oracle CSS service is installed and running under init(1M)
Checking the processes:
[oracle@db1 oracle]$ ps -ef | grep css
root 5716 1 0 Jan11 ? 00:00:00 /bin/su -l oracle -c exec /home1/oracle/product/10.1.0/db_1/bin/ocssd
oracle 9933 5716 0 17:15 ? 00:00:00 /home1/oracle/product/10.1.0/db_1/bin/ocssd.bin
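Nothing more was written to the log file, and with that the problem was solved.
To sum up: if the HOSTNAME is changed after the database server has been installed, the ocssd process fails to start, because the directory it starts from has the machine name hard-coded into it. Running
$ORACLE_HOME/bin/localconfig reset $ORACLE_HOME
to reconfigure it fixes the problem.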
From the ITPUB blog, link: http://blog.itpub.net/51862/viewspace-180537/. If you repost, please credit the source; otherwise legal action may be taken.