RAC node hang: an Oracle bug drove CPU usage so high that the cluster could not fence the node
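Problem description
1) Node RAC1 hung. An Oracle bug pushed CPU usage up, and the cluster then tried to fence (evict) the node, but CPU usage was so high that the fencing could not complete.
Root cause
1) Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868 - superseded (Doc ID 21286665.8)
The log is shown in the screenshot: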
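Solution
1) Run shutdown abort on the primary and move the business workload to the standby, which then ran normally.
2) Go through the database logs to look for the problem.
3) Identify the cause: Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868
4) Download the patch that fixes the bug: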
p22502456_112040_Linux-x86-64.zip
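Upgrade OPatch: download the latest OPatch release, p6880880_112000_Linux-x86-64.zip
Patching procedure:
The database must be shut down before the patch is applied.
1. Download the patch, upload it to the $ORACLE_HOME/OPatch directory, and unzip it there for later use.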
2. [oracle@xxx OPatch]$ pwd
/home/app/oracle/product/11.2.0/OPatch
[oracle@xxx OPatch]$ cd 22502456/
[oracle@xxx 22502456]$ ../OPatch/opatch apply
3. Verify: ../OPatch/opatch lsinventory
4. Alternatively, connect to the database, load the patch's SQL changes, and then query to verify:
@?/rdbms/admin/catbundle.sql psu apply
select * from dba_registry_history;
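The verification in step 3 can be scripted rather than read by eye. The following is a minimal sketch, assuming that `opatch lsinventory` prints one line of the form `Patch <id> : applied on <date>` per installed patch. That line format is an assumption about the inventory output, and the function name `patch_applied` is made up for this example.

```shell
#!/bin/sh
# Sketch: check whether a patch number is reported as applied in the text
# printed by `opatch lsinventory`.
# Assumption: installed patches appear as "Patch <id> : applied on <date>".
# Adjust the pattern if your OPatch version prints the line differently.

# patch_applied PATCH_ID   (reads the lsinventory text on stdin)
# Exit status 0 when the patch is listed, non-zero otherwise.
patch_applied() {
    grep -Eq "^Patch[[:space:]]+$1[[:space:]]*:[[:space:]]*applied on"
}

# Example, run from the patch directory used in the steps above:
#   ../OPatch/opatch lsinventory | patch_applied 22502456 \
#       && echo "22502456 is in the inventory"
```

The function only inspects text on standard input, so it can also be run against a saved copy of the inventory output from each node.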
2. After the primary was patched, running startup to open the database failed with the following errors:
ORA-15025: could not open disk "/dev/asm_ssd3"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd4"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd5"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
SUCCESS: diskgroup SSDDATA was dismounted
ERROR: diskgroup SSDDATA was not mounted
ORA-15025: could not open disk "/dev/asm_ssd2"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd3"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd4"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd5"
ORA-27041: unable to open file
The database has to be started with the cluster commands after logging in as the grid user:
[grid@orcl1 ~]$ srvctl status database -d orcl
Instance orcl1 is running on node orcl1
Instance orcl2 is running on node orcl2
srvctl status instance -d orcl -i orcl1
srvctl stop instance -d orcl -i orcl1
srvctl start instance -d orcl -i orcl1
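The ORA-15025 lines above all end in "Error: 13: Permission denied", meaning the operating system refused the database process access to the ASM devices. Before restarting through srvctl it can help to list exactly which devices were refused. A minimal sketch follows; the function name `denied_asm_disks` and the `ALERT_LOG` variable are made up for this example, and the pattern assumes the message text is exactly as it appears in the log excerpt above.

```shell
#!/bin/sh
# Sketch: list every disk path named in ORA-15025 lines of an alert log
# (read on stdin), one path per line, with duplicates removed.
denied_asm_disks() {
    sed -n 's/^ORA-15025: could not open disk "\(.*\)"$/\1/p' | sort -u
}

# Example (ALERT_LOG is a hypothetical variable holding the alert log path):
#   denied_asm_disks < "$ALERT_LOG" | xargs ls -l
#   ls -l "$ORACLE_HOME/bin/oracle"
```

Comparing the owner and group of the listed devices with those of the oracle binary shows whether the two still agree after patching. A group mismatch on the binary is a commonly reported cause of this error, but the original write-up does not confirm that it was the cause here.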
After both RAC nodes had been patched, archive log shipping to the Data Guard (DG) standby reported errors:
SUCCESS: diskgroup SSDDATA was dismounted
ERROR: diskgroup SSDDATA was not mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+SSDDATA/shdbrac/controlfile/current.257.946912997'
ORA-17503: ksfdopn:2 Failed to open file +SSDDATA/orcl/controlfile/current.257.946912997
ORA-15001: diskgroup "SSDDATA" does not exist or is not mounted
...skipping...
returning error ORA-16191
------------------------------------------------------------
PING[ARC2]: Heartbeat failed to connect to standby 'stbdb'. Error is 16191.
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
The DG standby database was shut down, and after the restart the following errors appeared:
ORA-16136 signalled during: alter database recover managed standby database cancel...
alter database open read only
AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access
Fri Apr 03 21:22:56 2020
Beginning Standby Crash Recovery.
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf
Media Recovery Waiting for thread 2 sequence 29011
Fri Apr 03 21:23:53 2020
Standby Crash Recovery aborted due to error 1013.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/orcl_ora_11251.trc:
ORA-01013: user requested cancel of current operation
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed Standby Crash Recovery.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'
ORA-10458 signalled during: alter database open read only...
alter database open
Beginning Standby Crash Recovery.
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf
Media Recovery Waiting for thread 2 sequence 29011
Fri Apr 03 21:24:12 2020
Standby Crash Recovery aborted due to error 1013.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-01013: user requested cancel of current operation
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed Standby Crash Recovery.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'
ORA-10458 signalled during: alter database open...
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 11251): terminating the instance
Fri Apr 03 21:24:16 2020
opiodr aborting process unknown ospid (11280) as a result of ORA-1092
Fri Apr 03 21:24:16 2020
ORA-1092 : opitsk aborting process
Instance terminated by USER, pid = 11251
Fri Apr 03 21:24:19 2020
Instance shutdown complete
Fri Apr 03 21:25:38 2020
Starting ORACLE instance (normal)
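In the excerpt above, standby recovery stops at "Media Recovery Waiting for thread 2 sequence 29011" each time before it is cancelled with ORA-01013. Pulling those lines out of a long alert log shows which archived logs the standby is still missing. A minimal sketch, where the function name `recovery_waits` and the `ALERT_LOG` variable are made up for this example:

```shell
#!/bin/sh
# Sketch: print the "<thread> <sequence>" pairs that media recovery reported
# waiting for, read from an alert log on stdin, with duplicates removed.
recovery_waits() {
    sed -n 's/^Media Recovery Waiting for thread \([0-9][0-9]*\) sequence \([0-9][0-9]*\).*$/\1 \2/p' | sort -u
}

# Example (ALERT_LOG is a hypothetical variable holding the alert log path):
#   recovery_waits < "$ALERT_LOG"
```

If the reported sequence never arrives on the standby, it is worth checking whether the primary can still log on to it, since the earlier ORA-16191 heartbeat errors indicate that log shipping was being rejected.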
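On inspection, the cause turned out to be an incorrect password file.
Copying the password file under $ORACLE_HOME/dbs on node 1 to node 2 and to the DG standby resolved the problem.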
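After copying, it is easy to confirm that every copy of the password file really is identical. The following is a minimal sketch using the POSIX `cksum` utility. The function name `pwfiles_match`, the file names `orapworcl1`, `orapworcl2` and `orapwstbdb` (derived from the instance names seen in the logs), and the host name `stbhost` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: succeed only when all given files have the same checksum and size.
# Note: plain POSIX sh, so the helper variables are not function-local.

# pwfiles_match FILE...
# Exit status: 0 = all identical, 1 = a difference (or no files given),
#              2 = a file could not be read.
pwfiles_match() {
    first=""
    for f in "$@"; do
        # cksum reading stdin prints "<checksum> <size>" without a file name
        sum=$(cksum < "$f") || return 2
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            return 1
        fi
    done
    [ -n "$first" ]
}

# Example (file and host names are assumptions; fetch the remote copies first):
#   scp orcl2:"$ORACLE_HOME/dbs/orapworcl2"   /tmp/orapworcl2
#   scp stbhost:"$ORACLE_HOME/dbs/orapwstbdb" /tmp/orapwstbdb
#   pwfiles_match "$ORACLE_HOME/dbs/orapworcl1" /tmp/orapworcl2 /tmp/orapwstbdb \
#       && echo "password files are identical"
```

Comparing checksums avoids printing or diffing the file contents, which matters because the file holds the SYS credentials.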
From the ITPUB blog, link: http://blog.itpub.net/69976387/viewspace-2705730/. If you repost this article, please credit the source; otherwise legal liability will be pursued.