RAC node hung: an Oracle bug drove CPU usage so high that cluster fencing could not take effect

Posted by pennymeng on 2020-07-20

Problem description

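1) Node RAC1 hung. An Oracle bug drove CPU usage up; the cluster then tried to fence (evict) the node, but the CPU load was so high that the eviction could not complete.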


Root cause

1) Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868 - superseded (Doc ID 21286665.8)

The log is shown in the screenshot:

 

 


Solution

1) Ran shutdown abort on the primary database and moved the workload to the standby database, which ran normally.

2) Went through the database logs to locate the problem.

3) Identified the cause: Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868

 

4) Downloaded the patch that fixes the bug:

p22502456_112040_Linux-x86-64.zip


Upgrade OPatch first: download the latest OPatch version, p6880880_112000_Linux-x86-64.zip.


Patching procedure:

The database must be shut down before the patch is applied.

1. Download the patch, upload it to the $ORACLE_HOME/OPatch directory, and unzip it there.

2. [oracle@xxx OPatch]$ pwd

/home/app/oracle/product/11.2.0/OPatch

[oracle@xxx OPatch]$ cd 22502456/

[oracle@xxx 22502456]$ ../OPatch/opatch apply


3. Verify: ../OPatch/opatch lsinventory

4. Alternatively, log in to the database, load the patch's SQL changes, and then query to verify:

@?/rdbms/admin/catbundle.sql psu apply

select * from dba_registry_history;
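As a small aid for the verification step, the following sketch derives the patch number from the zip file name and looks for it in inventory output. The helper names patch_id_from_zip and inventory_has_patch are hypothetical, and the check assumes that opatch lsinventory lists each applied patch on a line beginning with "Patch <number>"; adjust the pattern if your OPatch version prints something different.

```shell
# Sketch only: helper names are illustrative, not part of OPatch.

# Extract the numeric patch ID from a zip name such as
# p22502456_112040_Linux-x86-64.zip -> 22502456
patch_id_from_zip() {
  local name
  name=$(basename "$1")
  name=${name#p}                 # drop the leading "p"
  printf '%s\n' "${name%%_*}"    # keep everything before the first "_"
}

# Read inventory text on stdin; succeed if the given patch ID is listed.
# Assumption: applied patches appear as lines starting with "Patch <id>".
inventory_has_patch() {
  grep -Eq "^Patch[[:space:]]+$1([[:space:]]|:|\$)"
}
```

A possible use, run as the oracle user: opatch lsinventory | inventory_has_patch "$(patch_id_from_zip p22502456_112040_Linux-x86-64.zip)" && echo applied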


After the primary database had been patched, starting it with startup failed with the following errors:

ORA-15025: could not open disk "/dev/asm_ssd3"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

ORA-15025: could not open disk "/dev/asm_ssd4"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

ORA-15025: could not open disk "/dev/asm_ssd5"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

SUCCESS: diskgroup SSDDATA was dismounted

ERROR: diskgroup SSDDATA was not mounted

ORA-15025: could not open disk "/dev/asm_ssd2"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

ORA-15025: could not open disk "/dev/asm_ssd3"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

ORA-15025: could not open disk "/dev/asm_ssd4"

ORA-27041: unable to open file

Linux-x86_64 Error: 13: Permission denied

Additional information: 9

ORA-15025: could not open disk "/dev/asm_ssd5"

ORA-27041: unable to open file


The database has to be started with the cluster commands (srvctl) after logging in as the grid user:

[grid@orcl1 ~]$ srvctl status database -d orcl

Instance orcl1 is running on node orcl1

Instance orcl2 is running on node orcl2


srvctl status instance -d orcl -i orcl1


srvctl stop instance -d orcl -i orcl1

srvctl start instance -d orcl -i orcl1
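The "Permission denied" errors above mean that the process starting the instance could not open the ASM devices. A minimal sketch for checking this from the shell follows; the function names show_perms and can_open_rw are hypothetical, and stat -c assumes GNU coreutils on Linux. Run it as the OS user that fails to start the instance and again as the user that succeeds, then compare the results.

```shell
# Sketch only: helper names are illustrative.

# Print "path owner:group mode" for each path, or flag it as missing.
show_perms() {
  local p
  for p in "$@"; do
    if [ -e "$p" ]; then
      stat -c '%n %U:%G %a' "$p"   # GNU stat format: name, owner:group, octal mode
    else
      printf '%s missing\n' "$p"
    fi
  done
}

# Succeed only if the current user can read and write every path given.
can_open_rw() {
  local p
  for p in "$@"; do
    [ -r "$p" ] && [ -w "$p" ] || return 1
  done
}
```

For example: show_perms /dev/asm_ssd2 /dev/asm_ssd3 /dev/asm_ssd4 /dev/asm_ssd5, followed by can_open_rw with the same paths to see whether the current user would hit the same error.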


After both RAC nodes had been patched, the Data Guard (DG) archive log shipping reported errors:


SUCCESS: diskgroup SSDDATA was dismounted

ERROR: diskgroup SSDDATA was not mounted

ORA-00210: cannot open the specified control file

ORA-00202: control file: '+SSDDATA/shdbrac/controlfile/current.257.946912997'

ORA-17503: ksfdopn:2 Failed to open file +SSDDATA/orcl/controlfile/current.257.946912997

ORA-15001: diskgroup "SSDDATA" does not exist or is not mounted

...skipping...

      returning error ORA-16191

------------------------------------------------------------

PING[ARC2]: Heartbeat failed to connect to standby 'stbdb'. Error is 16191.

Error 1017 received logging on to the standby

------------------------------------------------------------

Check that the primary and standby are using a password file

and remote_login_passwordfile is set to SHARED or EXCLUSIVE, 

and that the SYS password is same in the password files.

      returning error ORA-16191


The DG standby database was shut down; after it was restarted, the following errors appeared:

ORA-16136 signalled during: alter database recover managed standby database cancel...

alter database open read only

AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access

Fri Apr 03 21:22:56 2020

Beginning Standby Crash Recovery.

Serial Media Recovery started

Managed Standby Recovery starting Real Time Apply

Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf

Media Recovery Waiting for thread 2 sequence 29011

Fri Apr 03 21:23:53 2020

Standby Crash Recovery aborted due to error 1013.

Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/orcl_ora_11251.trc:

ORA-01013: user requested cancel of current operation

Recovery interrupted!

Some recovered datafiles maybe left media fuzzy

Media recovery may continue but open resetlogs may fail

Completed Standby Crash Recovery.

Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:

ORA-10458: standby database requires recovery

ORA-01196: file 1 is inconsistent due to a failed media recovery session

ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'

ORA-10458 signalled during: alter database open read only...

alter database open

Beginning Standby Crash Recovery.

Serial Media Recovery started

Managed Standby Recovery starting Real Time Apply

Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf

Media Recovery Waiting for thread 2 sequence 29011

Fri Apr 03 21:24:12 2020

Standby Crash Recovery aborted due to error 1013.

Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:

ORA-01013: user requested cancel of current operation

Recovery interrupted!

Some recovered datafiles maybe left media fuzzy

Media recovery may continue but open resetlogs may fail

Completed Standby Crash Recovery.

Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:

ORA-10458: standby database requires recovery

ORA-01196: file 1 is inconsistent due to a failed media recovery session

ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'

ORA-10458 signalled during: alter database open...

Shutting down instance (abort)

License high water mark = 7

USER (ospid: 11251): terminating the instance

Fri Apr 03 21:24:16 2020

opiodr aborting process unknown ospid (11280) as a result of ORA-1092

Fri Apr 03 21:24:16 2020

ORA-1092 : opitsk aborting process

Instance terminated by USER, pid = 11251

Fri Apr 03 21:24:19 2020

Instance shutdown complete

Fri Apr 03 21:25:38 2020

Starting ORACLE instance (normal)
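With excerpts as long as the one above, it can help to tally which ORA- codes recur before reading the log line by line. The sketch below is one way to do that; ora_summary is a hypothetical helper name, and it only counts codes written in the "ORA-<digits>" form (messages such as "Error 1017" are not matched).

```shell
# Sketch only: count each distinct ORA- code in the given log file(s),
# most frequent first.
ora_summary() {
  grep -ohE 'ORA-[0-9]+' "$@" | sort | uniq -c | sort -rn
}
```

It takes one or more alert log or trace file paths as arguments, for example the standby's alert log.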


On inspection, the cause turned out to be an incorrect password file.


After the password file under $ORACLE_HOME/dbs on node 1 was copied to node 2 and to the DG standby, the problem was resolved.
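To confirm that a copy really matches the file on node 1, the digests can be compared. This is a minimal sketch, assuming GNU sha256sum is available; file_digest and same_digest are hypothetical helper names, and the host names and paths in the comments are placeholders. On Linux the password file is conventionally named orapw<ORACLE_SID>, so the copy normally has to be renamed for the target instance. Note that this only detects byte-identical copies: two password files created separately with the same SYS password will still have different digests.

```shell
# Sketch only: helper names, hosts and paths are illustrative.

# Print just the SHA-256 digest of a file.
file_digest() {
  sha256sum "$1" | awk '{print $1}'
}

# Succeed if both digests are non-empty and equal.
same_digest() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (placeholders, adjust to your environment):
#   local_sum=$(file_digest "$ORACLE_HOME/dbs/orapworcl1")
#   remote_sum=$(ssh orcl2 sha256sum /path/to/dbs/orapworcl2 | awk '{print $1}')
#   same_digest "$local_sum" "$remote_sum" && echo "identical copy"
```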


Source: ITPUB blog, link: http://blog.itpub.net/69976387/viewspace-2705730/. If you repost this article, please credit the source; otherwise legal action may be taken.
