[RAC] An incorrect system time setting shut down the cluster resources and database on one RAC node
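In the afternoon a colleague called to say that the database on the second node of the all-in-one appliance (RAC) could not be reached, and asked me to take a look. I logged in and checked the resource status from the first node: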
[grid@pwjkdb01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....PWJK.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora.DBFS_DG.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    pwjkdb01
ora....N1.lsnr ora....er.type ONLINE    OFFLINE
ora....PWJK.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora.asm        ora.asm.type   ONLINE    ONLINE    pwjkdb01
ora.cvu        ora.cvu.type   ONLINE    OFFLINE
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    pwjkdb01
ora.oc4j       ora.oc4j.type  ONLINE    OFFLINE
ora.ons        ora.ons.type   ONLINE    ONLINE    pwjkdb01
ora.pwdata.db  ora....se.type ONLINE    ONLINE    pwjkdb01
ora....rv1.svc ora....ce.type ONLINE    ONLINE    pwjkdb01
ora....rv2.svc ora....ce.type ONLINE    ONLINE    pwjkdb01
ora....SM1.asm application    ONLINE    ONLINE    pwjkdb01
ora....01.lsnr application    ONLINE    ONLINE    pwjkdb01
ora....b01.gsd application    OFFLINE   OFFLINE
ora....b01.ons application    ONLINE    ONLINE    pwjkdb01
ora....b01.vip ora....t1.type ONLINE    ONLINE    pwjkdb01
ora....b02.vip ora....t1.type ONLINE    OFFLINE
ora.scan1.vip  ora....ip.type ONLINE    OFFLINE
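Picking out the problem rows in such a listing by eye is error-prone. As a minimal sketch (the function name `offline_resources` is hypothetical and not part of Oracle Clusterware), the filter below prints the resources whose Target is ONLINE but whose State is OFFLINE, assuming the five-column `crs_stat -t` layout of Name, Type, Target, State, Host:

```shell
# Hypothetical helper: read `crs_stat -t` output on stdin and print the
# names of resources that should be running (Target ONLINE) but are not
# (State OFFLINE). Assumes the column order Name, Type, Target, State, Host.
offline_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Usage on a cluster node (as the grid user):
#   crs_stat -t | offline_resources
```

Resources such as `ora.gsd`, whose Target is itself OFFLINE, are deliberately ignored because they are not expected to run.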
On node 2, no Oracle PMON process was running:
[root@pwjkdb02 ~]# ps -ef |grep pmon
root 6679 1 0 2015 ? 00:01:42 /usr/bin/perl -w /opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl -server
root 63055 62491 0 16:45 pts/1 00:00:00 grep pmon
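A plain `grep pmon` also matches the hardware-monitoring Perl script, whose path contains `compmon`, and the `grep` command itself. A stricter filter avoids these false matches. The sketch below assumes that instance background processes follow the usual `ora_pmon_<SID>` and `asm_pmon_<SID>` naming; the function name `oracle_pmon` is hypothetical:

```shell
# Hypothetical helper: read `ps -ef` output on stdin and keep only Oracle
# database/ASM PMON background processes (ora_pmon_<SID>, asm_pmon_<SID>).
# Lines that merely contain "pmon" somewhere else are dropped.
oracle_pmon() {
    awk '/(ora|asm)_pmon_/'
}

# Usage:
#   ps -ef | oracle_pmon
```

An empty result means that neither the database instance nor the ASM instance is up on that node.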
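Because the business applications depended on it, I checked the system time and uptime and then tried to start the cluster resources on the second node: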
[root@pwjkdb02 bin]# ./crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@pwjkdb02 bin]# ./crsctl start cluster
CRS-2672: Attempting to start 'ora.asm' on 'pwjkdb02'
CRS-2676: Start of 'ora.asm' on 'pwjkdb02' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'pwjkdb02'
CRS-2676: Start of 'ora.crsd' on 'pwjkdb02' succeeded
[root@pwjkdb02 bin]#
Node 2 started normally, as shown below:
[grid@pwjkdb02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....PWJK.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora.DBFS_DG.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    pwjkdb01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    pwjkdb02
ora....PWJK.dg ora....up.type ONLINE    ONLINE    pwjkdb01
ora.asm        ora.asm.type   ONLINE    ONLINE    pwjkdb01
ora.cvu        ora.cvu.type   ONLINE    ONLINE    pwjkdb02
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    pwjkdb01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    pwjkdb02
ora.ons        ora.ons.type   ONLINE    ONLINE    pwjkdb01
ora.pwdata.db  ora....se.type ONLINE    ONLINE    pwjkdb01
ora....rv1.svc ora....ce.type ONLINE    ONLINE    pwjkdb01
ora....rv2.svc ora....ce.type ONLINE    ONLINE    pwjkdb01
ora....SM1.asm application    ONLINE    ONLINE    pwjkdb01
ora....01.lsnr application    ONLINE    ONLINE    pwjkdb01
ora....b01.gsd application    OFFLINE   OFFLINE
ora....b01.ons application    ONLINE    ONLINE    pwjkdb01
ora....b01.vip ora....t1.type ONLINE    ONLINE    pwjkdb01
ora....SM2.asm application    ONLINE    ONLINE    pwjkdb02
ora....02.lsnr application    ONLINE    ONLINE    pwjkdb02
ora....b02.gsd application    OFFLINE   OFFLINE
ora....b02.ons application    ONLINE    ONLINE    pwjkdb02
ora....b02.vip ora....t1.type ONLINE    ONLINE    pwjkdb02
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    pwjkdb02
After the restart, I reviewed some of the logs.
Clusterware alert log:
tail -100f alertpwjkdb02.log

4016-01-02 16:26:00.736
[/u01/app/11.2.0.3/grid/bin/oraagent.bin(48051)]CRS-5011:Check of resource "pwdata" failed: details at "(:CLSN00007:)" in "/u01/app/11.2.0.3/grid/log/pwjkdb02/agent/crsd/oraagent_oracle/oraagent_oracle.log"
4016-01-02 16:26:01.262
[crsd(9981)]CRS-2765:Resource 'ora.pwdata.db' has failed on server 'pwjkdb02'.
4016-01-02 16:26:01.329
[crsd(9981)]CRS-2765:Resource 'ora.pwdata.pwdatasrv2.svc' has failed on server 'pwjkdb02'.
4016-01-02 16:26:01.329
[crsd(9981)]CRS-2771:Maximum restart attempts reached for resource 'ora.pwdata.pwdatasrv2.svc'; will not restart.
4016-01-02 16:26:01.510
[/u01/app/11.2.0.3/grid/bin/oraagent.bin(8988)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0.3/grid/log/pwjkdb02/agent/ohasd/oraagent_grid/oraagent_grid.log"
4016-01-02 16:26:01.614
[ohasd(6722)]CRS-2765:Resource 'ora.asm' has failed on server 'pwjkdb02'.
4016-01-02 16:26:01.663
[/u01/app/11.2.0.3/grid/bin/oraagent.bin(8988)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0.3/grid/log/pwjkdb02/agent/ohasd/oraagent_grid/oraagent_grid.log"
……
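To summarize which resources the alert log reports as failed, the `CRS-2765` lines can be pulled out with a short filter. This is only a sketch that relies on the message wording seen in the excerpt; `failed_resources` is a hypothetical name:

```shell
# Hypothetical helper: read a Clusterware alert log on stdin and print
# "<resource> <server>" for every CRS-2765 "has failed" message.
# Other messages (CRS-5011, CRS-2771, timestamps) are ignored.
failed_resources() {
    sed -n "s/.*CRS-2765:Resource '\([^']*\)' has failed on server '\([^']*\)'.*/\1 \2/p"
}

# Usage:
#   failed_resources < alertpwjkdb02.log | sort -u
```

For the excerpt here, this yields `ora.pwdata.db`, `ora.pwdata.pwdatasrv2.svc`, and `ora.asm`, all on `pwjkdb02`.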
ASM alert log:
tail -600f alert_+ASM2.log |more

Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
Tue Jan 12 00:39:25 2016
Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
Sat Jan 02 16:26:00 4016
Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
Sat Jan 02 16:26:00 4016
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_pmon_9913.trc:
ORA-01513: invalid current time returned by operating system
PMON (ospid: 9913): terminating the instance due to error 1513
Sat Jan 02 16:26:01 4016
System state dump requested by (instance=2, osid=9913 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_diag_9923.trc
Dumping diagnostic data in directory=[cdmp_19740622152201], requested by (instance=2, osid=9913 (PMON)), summary=[abnormal instance termination].
Sat Jan 02 16:26:01 4016
ORA-1092 : opitsk aborting process
Sat Jan 02 16:26:01 4016
License high water mark = 24
Instance terminated by PMON, pid = 9913
USER (ospid: 49268): terminating the instance
I then checked the shell command history on that node:
vi .bash_profile

su - oracle
#1454306838
sar 1 5
#1454315175
date
#1454315199
date 010216264016.00
#64565627161
date
#64565627177
date 0102162716.00
#1451723220
date
#1451723223
date
#1451723225
date
#1451723227
date
#1451723233
date
#1451723250
date
#1451723302
date
#1451723310
date
#1451723315
date
#1451723323
su - oracle
#1451723446
date
#1451723534
date 0201163016.00
#1454315401
date
#1454315505
date 0201163416.00
#1454315642
date
#1454315647
date
#1454315648
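The lines beginning with `#` are the epoch timestamps that bash writes to the history file when `HISTTIMEFORMAT` is set; each one records when the command on the following line was run. As a hedged sketch, assuming GNU `date` (for `-d @SECONDS`) and a 64-bit `time_t`, they can be turned into readable times as follows. `decode_history` is a hypothetical name:

```shell
# Hypothetical helper: read a bash history file on stdin and replace each
# "#<epoch>" marker with a readable UTC timestamp. Command lines pass
# through unchanged. Requires GNU date for the "-d @SECONDS" form.
decode_history() {
    while IFS= read -r line; do
        case "$line" in
            '#'[0-9]*)
                # ${line#?} strips the leading "#" to leave the epoch value
                date -u -d "@${line#?}" '+# %Y-%m-%d %H:%M:%S UTC'
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Usage:
#   decode_history < ~/.bash_history
```

Here `#1454315199` decodes to 2016-02-01 08:26:39 UTC and `#64565627161` to 4016-01-02 08:26:01 UTC. That lines up with the 16:26 entries in the logs if the server clock runs at UTC+8, which is an inference rather than something the source states.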
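In the history above there is one telling entry: date 010216264016.00. Together with the alert logs and the other cluster logs, it confirms that a change to the operating system time shut down the cluster stack on RAC node 2. When we spoke on the phone, the colleague explained that they had noticed the system clock was 5 minutes slow and corrected it directly at the operating system level. (Take great care when changing the OS time, especially while a database is running, so that business applications are not affected.) Being unfamiliar with the command, they set the clock to the year 4016, which caused the problems described above.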
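The argument that `date` accepts for setting the clock has the form `MMDDhhmm[[CC]YY][.ss]`, so `010216264016.00` reads as month 01, day 02, 16:26, year 4016. A pre-flight check along the following lines could catch such a slip before the clock is touched. This is a minimal sketch, not the procedure used in this incident; `explain_date_arg` and its one-year tolerance are assumptions made for illustration:

```shell
# Hypothetical helper: show how a MMDDhhmm[[CC]YY][.ss] argument would be
# interpreted by `date`, WITHOUT changing the clock. Returns 2 if the
# argument is malformed and 1 if the year is more than one year away from
# the reference year (second parameter, default: the current year).
explain_date_arg() {
    arg=$1
    ref_year=${2:-$(date +%Y)}
    digits=${arg%%.*}                 # part before the optional ".ss"
    case "$arg" in
        *.*) secs=${arg#*.} ;;
        *)   secs=00 ;;
    esac
    case "$digits" in
        ''|*[!0-9]*) echo "malformed argument: $arg" >&2; return 2 ;;
    esac
    month=$(printf '%s' "$digits" | cut -c1-2)
    day=$(printf '%s' "$digits" | cut -c3-4)
    hour=$(printf '%s' "$digits" | cut -c5-6)
    minute=$(printf '%s' "$digits" | cut -c7-8)
    yy=$(printf '%s' "$digits" | cut -c9-)
    case "${#digits}" in
        8)  year=$ref_year ;;         # no year field: the year is kept
        10) # POSIX rule for two-digit years: 69-99 -> 19YY, 00-68 -> 20YY
            if [ "${yy#0}" -ge 69 ]; then year=19$yy; else year=20$yy; fi ;;
        12) year=$yy ;;
        *)  echo "malformed argument: $arg" >&2; return 2 ;;
    esac
    echo "$year-$month-$day $hour:$minute:$secs"
    if [ "$year" -gt $((ref_year + 1)) ] || [ "$year" -lt $((ref_year - 1)) ]; then
        echo "WARNING: year $year is far from $ref_year" >&2
        return 1
    fi
}
```

With 2016 as the reference year, `explain_date_arg 010216264016.00 2016` prints `4016-01-02 16:26:00`, warns on stderr, and returns 1, whereas `explain_date_arg 0201163016.00 2016` prints `2016-02-01 16:30:00` and returns 0.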
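Every operation carries risk. Before acting, we should prepare a plan, an operating procedure, a contingency plan, and a risk assessment; never change a production system on the basis of assumptions.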
Source: ITPUB blog, link: http://blog.itpub.net/29487349/viewspace-1991263/. If you reprint this article, please credit the source; otherwise legal liability will be pursued.