DBCA建庫導致已有資料庫出現ORA-27140錯誤
透過DBCA在伺服器上建立一個額外的例項,結束後以外發現,原有的資料庫例項出現了異常。
簡單描述一下系統環境,Oracle 11.2.0.2 for Linux x86-64,隨後又安裝了GRID,建立了ASM磁碟組。原有的資料庫有部分表空間和REDO存放在ASM磁碟組中。
為了測試,透過DBCA建立了額外的例項,事實上,透過DBCA又先後建立了兩個資料庫例項,第二個資料庫例項建立後一切正常,將第二個例項關閉,透過DBCA建立第三個資料庫例項時,導致伺服器上的第一個例項出現了異常。
檢查例項1上的告警日誌,可以發現大量下面的錯誤資訊:
2011-08-30 10:20:54.502000 +08:00
Process W000 died, see its trace file
2011-08-30 10:20:58.518000 +08:00
Process W000 died, see its trace file
2011-08-30 10:21:02.524000 +08:00
Process W000 died, see its trace file
2011-08-30 10:21:06.527000 +08:00
Process W000 died, see its trace file
Errors in file
/u01/app/oracle/diag/rdbms/fhacdb1/fhacdb1/trace/fhacdb1_j000_19030.trc:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid
= 1200 (dba)
2011-08-30 10:21:07.536000 +08:00
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file
/u01/app/oracle/diag/rdbms/fhacdb1/fhacdb1/trace/fhacdb1_cjq0_12308.trc:
Errors in file
/u01/app/oracle/diag/rdbms/fhacdb1/fhacdb1/trace/fhacdb1_j000_19036.trc:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid
= 1200 (dba)
資料庫並非簡單的報錯而已,實際上即使是SQLPLUS / AS SYSDBA登入資料庫,進行任何操作也會顯示沒有連線到資料庫,因此對於例項1來說,只能執行SHUTDOWN ABORT關閉例項並重新啟動。
查詢METALINK發現不少類似的情況,導致這個錯誤產生的原因就是ORA_CRS_HOME/bin下的oracle可執行檔案或者ORACLE_HOME/bin下的oracle可執行檔案發生修改所致。雖然從隨後的ORA-2730X資訊中無法判斷是GRID使用者下還是ORACLE使用者下的oracle檔案被修改,但是可以確定的是,oracle可執行檔案的組資訊被修改,原本應該是oinstall,而現在變成了dba。
分別檢查ORA_CRS_HOME/bin和ORACLE_HOME/bin下的oracle檔案,發現是ORACLE目錄下的ORACLE_HOME/bin/oracle發生了改變:
[oracle@dbserver1 ~]$ cd $ORACLE_HOME/bin
[oracle@dbserver1 bin]$ ls -l oracle
-rwsr-s--x 1 oracle dba 228881982 Jul 26 16:55 oracle
[oracle@dbserver1 bin]$ ls -l |grep dba
-rwsr-s--x 1 oracle dba 228881982
Jul 26 16:55 oracle
-rwxr-xr-x 1 oracle oinstall 32372
Jul 26 16:55 osdbagrp
-rwxr-xr-x 1 oracle oinstall 35677
Aug 20 2010 osdbagrp0
可以看到,只有oracle檔案變成了dba組,bin目錄下所有其他檔案仍然是oinstall組。為了對比,在另外一個伺服器上,有一臺同時搭建的相同環境的oracle,檢查oracle檔案的組資訊:
[oracle@dbserver2 ~]$ cd $ORACLE_HOME/bin
[oracle@dbserver2 bin]$ ls -l oracle
-rwsr-s--x. 1 oracle oinstall 217733745 Apr 27 14:02 oracle
[oracle@dbserver2 bin]$ ls -l |grep dba
-rwxr-xr-x. 1 oracle oinstall 32372
Apr 27 14:02 osdbagrp
-rwxr-xr-x. 1 oracle oinstall 35677
Aug 20 2010 osdbagrp0
在這個伺服器上,oracle檔案的組屬性仍然是oinstall,這說明節點1上的oracle組屬性確實被修改了。
前面提到了,利用DBCA新增第二個例項的時候,資料庫並未出現問題,而新增第三個例項的時候,才導致oracle組資訊變化,那麼同樣是dbca的操作,為什麼有所區別呢。對比第二個和第三個例項的區別,問題應該和是否使用ASM有關。第二個例項的所有檔案都儲存在本地硬碟,而第三個例項的所有檔案都儲存在ASM磁碟組中,甚至會和例項1共用一些磁碟組,應該就是這個差別導致了問題的產生。
檢查ASM告警日誌:
2011-08-30 10:21:08.853000 +08:00
NOTE: client al32utf8:al32utf8 registered, osid 19095, mbr 0x1
2011-08-30 10:53:27.195000 +08:00
NOTE: client al32utf8:al32utf8 deregistered
2011-08-30 10:53:31.248000 +08:00
Starting background process ASMB
ASMB started with pid=42, OS id=23251
NOTE: client +ASM:+ASM registered, osid 23253, mbr 0x1
2011-08-30 10:53:36.804000 +08:00
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
2011-08-30 10:53:51.372000 +08:00
NOTE: client al32utf8:al32utf8 registered, osid 23441, mbr 0x1
可以看到,就是第三個例項註冊到ASM的時刻,在例項1上出現了錯誤。應該就是這個時刻,oracle檔案的組資訊發生了變化。
檢查一下例項3上的告警日誌:
2011-08-30 10:20:53.722000 +08:00
Adjusting the default value of parameter parallel_max_servers
from 960 to 135 due to the value of parameter processes (150)
Starting ORACLE instance (normal)
2011-08-30 10:20:55.149000 +08:00
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 3
2011-08-30 10:20:56.823000 +08:00
Using LOG_ARCHIVE_DEST_1 parameter default value as
/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
Autotune of undo retention is turned on.
IMODE=BR
ILAT =29
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in client-side pfile
/u01/app/oracle/admin/al32utf8/pfile/init.ora on machine dbserver1
System parameters with non-default values:
processes = 150
memory_target = 6432M
db_block_size = 8192
compatible = "11.2.0.0.0"
db_create_file_dest = "+DATA"
undo_tablespace = "UNDOTBS1"
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=al32utf8XDB)"
local_listener = "LISTENER_AL32UTF8"
audit_file_dest = "/u01/app/oracle/admin/al32utf8/adump"
audit_trail = "DB"
db_name = "al32utf8"
open_cursors = 300
diagnostic_dest = "/u01/app/oracle"
2011-08-30 10:21:06.647000 +08:00
PMON started with pid=2, OS id=19032
PSP0 started with pid=3, OS id=19034
2011-08-30 10:21:07.721000 +08:00
VKTM started with pid=4, OS id=19038 at elevated priority
VKTM running at (1)millisec precision with DBRM quantum (100)ms
GEN0 started with pid=5, OS id=19042
DIAG started with pid=6, OS id=19044
DBRM started with pid=7, OS id=19046
DIA0 started with pid=8, OS id=19048
MMAN started with pid=9, OS id=19050
DBW0 started with pid=10, OS id=19052
DBW1 started with pid=11, OS id=19054
DBW2 started with pid=12, OS id=19056
LGWR started with pid=13, OS id=19058
CKPT started with pid=14, OS id=19060
SMON started with pid=15, OS id=19062
RECO started with pid=16, OS id=19064
MMON started with pid=17, OS id=19066
MMNL started with pid=18, OS id=19068
starting up 1 dispatcher(s) for network address
'(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = /u01/app/oracle
CREATE DATABASE "al32utf8"
MAXINSTANCES 8
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
DATAFILE SIZE 1024M AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED
EXTENT MANAGEMENT LOCAL
SYSAUX DATAFILE SIZE 2048M AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED
SMALLFILE DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE SIZE 8192M AUTOEXTEND ON
NEXT 640K MAXSIZE UNLIMITED
SMALLFILE UNDO TABLESPACE "UNDOTBS1" DATAFILE SIZE 8192M AUTOEXTEND
ON NEXT 5120K MAXSIZE UNLIMITED
CHARACTER SET AL32UTF8
NATIONAL CHARACTER SET AL16UTF16
LOGFILE GROUP 1 SIZE 1024M,
GROUP 2 SIZE 1024M,
GROUP 3 SIZE 1024M
USER SYS IDENTIFIED BY *USER SYSTEM IDENTIFIED BY
Starting background process ASMB
ASMB started with pid=22, OS id=19092
2011-08-30 10:21:08.901000 +08:00
Starting background process RBAL
RBAL started with pid=23, OS id=19105
NOTE: initiating MARK startup
Starting background process MARK
MARK started with pid=24, OS id=19108
NOTE: MARK has subscribed
2011-08-30 10:21:12.325000 +08:00
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
Database mounted in Exclusive Mode
Lost write protection disabled
ERROR: failed to establish dependency between database al32utf8 and diskgroup
resource ora.DATA.dg
2011-08-30 10:21:36.862000 +08:00
Successful mount of redo thread 1, with mount id 3103441300
Assigning activation ID 3103441300 (0xb8fac194)
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: +DATA/al32utf8/onlinelog/group_1.265.760530073
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
processing ?/rdbms/admin/dcore.bsq
create tablespace SYSTEM datafile SIZE 1024M AUTOEXTEND ON NEXT 10240K MAXSIZE
UNLIMITED
EXTENT MANAGEMENT LOCAL online
2011-08-30 10:21:44.431000 +08:00
Completed: create tablespace SYSTEM datafile SIZE 1024M AUTOEXTEND ON NEXT
10240K MAXSIZE UNLIMITED
可以確認例項1出現問題的時刻,正是例項3啟動並在ASM磁碟組中建立表空間的時刻。
瞭解了問題的原委後,這個問題並不難解決,只需要重啟例項1,就可以解決了。但是這個問題對於包含ASM的產品環境而言,存在很大的隱患,使用DBCA建立新庫,會影響現有正在執行的資料庫,這是一個埋藏比較深,且有可能造成嚴重後果的bug。
這個錯誤應該只會影響11.2以上版本,11.1以前的ASM在ORACLE_HOME目錄下,應該不會導致這個問題的產生。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/4227/viewspace-707005/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- DBCA建庫出現CHMOD NOT FOUND錯誤
- ORACLE 使用DBCA安裝資料庫出現錯誤Oracle資料庫
- 資料庫升級導致ORA-918錯誤資料庫
- 11.2.0.3 ASM例項出現ORA-4031錯誤導致資料庫歸檔失敗ASM資料庫
- 手工建庫與dbca建庫
- 資料庫增加SGA,導致ORA-27102: out of memory錯誤資料庫
- dbca靜默silent建立資料庫_空間不足出錯資料庫
- 【shmmax】由於shmmax設定過小導致dbca建庫無法完成HMM
- SPFILE 錯誤導致資料庫無法啟動(ORA-01565)資料庫
- 【oracle資料庫資料恢復】誤操作導致的資料庫誤刪除的資料恢復案例Oracle資料庫資料恢復
- oracle 9i DBCA建庫報錯Oracle
- 資料庫啟動出現ORA-27037錯誤資料庫
- 關閉資料庫出現ORA-21779錯誤資料庫
- solaris上建立oracle資料庫出現:out of memory 錯誤Oracle資料庫
- 啟動資料庫出現ORA-9925錯誤資料庫
- 【資料庫資料恢復】磁碟空間不足導致sql server錯誤的資料恢復資料庫資料恢復SQLServer
- SQL Server資料庫出現邏輯錯誤的資料恢復SQLServer資料庫資料恢復
- 資料庫主機名錯誤導致客戶端TNS-12541資料庫客戶端
- 又一例SPFILE設定錯誤導致資料庫無法啟動資料庫
- 9.2 STANDBY資料庫出現ORA-16009錯誤資料庫
- 10.2.0.1資料庫exp出現Ora-07445錯誤資料庫
- 啟動資料庫出現ORA-27123錯誤資料庫
- 資料庫啟動出現ORA-27102錯誤資料庫
- DBCA建庫報錯ORA-119
- Oracle GoldenGate導致IMP出現ORACLE 32588錯誤OracleGo
- 建庫的過程中DBCA報錯,錯誤號為ORA-12547
- 使用DBCA建庫後,出現ORA-00333 redo log read error block 49849 count 7982錯誤ErrorBloC
- 手工建庫後 sql developer連線不上新資料庫 出現 ora-12514錯誤SQLDeveloper資料庫
- 誤刪出資料檔案,透過dbca無法刪除資料庫問題資料庫
- 使用DBCA建立資料庫時,報告錯誤:ORA-12547: TNS: lost contact錯誤資料庫
- 誤修改ORACLE_HOME導致無法登陸資料庫Oracle資料庫
- [oracle]undo表空間出錯,導致資料庫例項無法開啟Oracle資料庫
- Oracle9i mount資料庫出現ORA-32700錯誤Oracle資料庫
- AIX系統啟動資料庫出現ORA-27504錯誤AI資料庫
- sqlplus直連資料庫出現ORA-27504錯誤SQL資料庫
- SWITCHOVER RAC資料庫出現ORA-600(kcctrdf_2)錯誤資料庫
- 做standby 資料庫時,出現ORA-12560 錯誤:資料庫
- 資料庫啟動出現ORA-30036錯誤資料庫