備庫歸檔刪除策略失效的問題分析
最近碰到了一個有些奇怪的問題,自己當時排查問題時間緊,沒有細細琢磨,今天抽空看了下,終於發現了端倪。
首先是在早晨收到了報警郵件,報警郵件內容如下:
ZABBIX-監控系統:
在主庫使用dg broker進行快速驗證。發現這個時候dg broker也驗證失敗了,問題看來還是得趕緊修復了。
DGMGRL> show configuration;
Configuration - sol_dg
Protection Mode: MaxPerformance
Databases:
sol - Primary database
Error: ORA-16778: redo transport error for one or more databases
ssol1 - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
DGMGRL> DGMGRL> exit
好在自當時也是冷靜處理,沒有稀裡糊塗解決,要不很可能釀成一個苦果,後面說。
至於這個錯誤,可以透過verbose的方式看到更多明細的資訊。
可以看到主庫端驗證出現了問題。
DGMGRL> show database verbose sol
Database - sol
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
sol
Error: ORA-16737: the redo transport service for standby database "ssol1" has an error
Properties:
DGConnectIdentifier = 'sol'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'sol'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.2.65.111)(PORT=1535))(CONNECT_DATA=(SERVICE_NAME=sol_DGMGRL)(INSTANCE_NAME=sol)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
ERROR
檢視備庫的verbose資訊,一切正常。
這個時候來看看主庫的資料庫日誌資訊,發現在歸檔的時候出現了問題、
Thu Oct 29 00:11:26 2015
LNS: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
LNS: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Error 270 for archive log file 18 to 'ssol1'
Errors in file /U01/app/oracle/diag/rdbms/sol/sol/trace/sol_nsa2_9780.trc:
ORA-00270: error creating archive log
ARC1: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
ARC1: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
FAL[server, ARC1]: Error 270 creating remote archivelog file 'ssol1'
FAL[server, ARC1]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance sol - Archival Error. Archiver continuing.
Thu Oct 29 00:16:54 2015
ARC1: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
ARC1: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
FAL[server, ARC1]: Error 270 creating remote archivelog file 'ssol1'
FAL[server, ARC1]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance sol - Archival Error. Archiver continuing.
Thu Oct 29 00:16:56 2015
對於這個問題,之前碰到過備庫的空間問題導致的歸檔問題。
這次檢視了一下,發現這次的問題有些不同,單純檢視磁碟空間,沒有發現磁碟空間問題。
[oracle@statg2 yangjr]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.8G 2.3G 5.2G 31% /
/dev/sda6 7.8G 4.4G 3.0G 60% /var
/dev/sda5 7.8G 2.0G 5.5G 27% /usr
/dev/sda1 122M 12M 104M 10% /boot
tmpfs 90G 276M 90G 1% /dev/shm
/dev/shm 90G 276M 90G 1% /tmp
/dev/sda7 158G 85G 65G 57% /U01
/dev/sdd1 1.7T 1.4T 204G 88% /data
/dev/sde1 1.6T 1.3T 219G 86% /other
/dev/sdf1 296G 92G 189G 33% /U01/app/oracle/oradata/sol
備庫檢視 verbose的資訊,一切都是正常的,
DGMGRL> show database verbose ssol1
Database - ssol1
Role: PHYSICAL STANDBY
Intended State: READ-ONLY
Transport Lag: 12 minutes 26 seconds
Apply Lag: 5 days 16 hours 52 minutes 38 seconds
Real Time Query: OFF
Instance(s):
sol
Properties:
DGConnectIdentifier = 'ssol1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = '+ARCH/sol/datafile, /U02/app/oracle/oradata/sol, +DATA/sol/datafile, /U02/app/oracle/oradata/sol, /data/oracle/oradata/sol, /U02/app/oracle/oradata/sol, /other/app/oracle/oradata/sol, /U02/app/oracle/oradata/sol, /U01/app/oracle/oradata/sol, /U02/app/oracle/oradata/sol'
LogFileNameConvert = '/U01/app/oracle/oradata/sol/, /U02/app/oracle/oradata/sol/'
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'sol'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.127.133.23)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=ssol1_DGMGRL)(INSTANCE_NAME=sol)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
SUCCESS
但是檢視備庫的資料庫日誌,發現原來是閃回區滿了。
Creating archive destination file : /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_29/o1_mf_1_459848_%u_.arc (648904 blocks)
Thu Oct 29 00:16:54 2015
Errors in file /U02/app/oracle/diag/rdbms/ssol1/sol/trace/sol_rfs_5518.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 107374182400 bytes is 100.00% used, and has 0 remaining bytes available.
可以明顯看到是歸檔所在的閃回區爆滿導致的,那麼系統層面也設定了crontab,應該也可以清理掉吧。
檢視了crontab使用的備庫歸檔刪除策略是保留了近4天的歸檔
crosscheck archivelog all;
delete expired archivelog all;
DELETE ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE-4';
問題處理要緊,為了保留現場,同是為了後續繼續跟進問題,沒有武斷的把歸檔都手工刪除,就變通解決了問題,調整了閃回區的大小,本來比如100G,我擴大的150G這樣幾天內還不會出現問題,所以簡單擴充套件之後,dg broker校驗就又恢復了正常。
但是這個時候有存在兩個疑問,11g開始閃回區中的備庫歸檔在超過80%的使用率的時候,會自動刪除,為什麼沒有刪除。
crontab在系統級透過呼叫rman來刪除過期的歸檔,為什麼也沒有清理。
對於這兩個問題,自己排除了rman的配置影響,排除了其它引數的干擾,最後檢視dg broker的日誌,發現裡面的內容是大量的字樣:
11/01/2015 21:52:36
Read-Only state no longer supported
順著錯誤開始讓上找,一致找到錯誤開始丟擲,已經過去了快1個星期了。
05/20/2015 09:43:44
version check on database ssol1 detected stale metadata,
requesting update from primary database
Creating process RSM0
10/23/2015 11:11:50
Command EDIT DATABASE ssol1 SET STATE = READ-ONLY completed
10/23/2015 11:11:52
Read-Only state no longer supported
10/23/2015 11:12:52
對於這個問題一看到這兒就恍然大悟了,原來是之前做恢復測試的時候把資料庫置為read-only,但是竟然忘了置為Online狀態。
馬上置為Online狀態。
DGMGRL> edit database ssol1 set state='ONLINE';
Succeeded.
這個時候備庫的日誌就開始啟動了MRP,開始應用日誌了。
Sun Nov 01 22:00:16 2015
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (sol)
Sun Nov 01 22:00:16 2015
MRP0 started with pid=28, OS id=18517
MRP0: Background Managed Standby Recovery process started (sol)
started logmerger process
Sun Nov 01 22:00:21 2015
Managed Standby Recovery starting Real Time Apply
Sun Nov 01 22:00:35 2015
Parallel Media Recovery started with 16 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Sun Nov 01 22:00:36 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459657_c2lwdgrl_.arc
Sun Nov 01 22:00:36 2015
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
這個時候歸檔已經從主庫傳輸到了備庫,但是沒有清理掉。
比如檢視最近的歸檔檔案情況,可以看到,從23號開始的歸檔都保留了下來。
4.0K ./2015_10_22
3.2G ./2015_10_23
9.9G ./2015_10_24
20G ./2015_10_25
11G ./2015_10_26
17G ./2015_10_27
36G ./2015_10_28
9.7G ./2015_10_29
11G ./2015_10_30
9.8G ./2015_10_31
20G ./2015_11_01
那麼歸檔開始應用了,怎麼看看80%的閥值開始生效了呢,一個招數,把閃回區大小改小。可以看到改小之後,馬上回發現空間開始收緊,這個時候已經部署的歸檔就會置為expired的。然後開始做delete操作。
Sun Nov 01 22:26:03 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459697_c2qnysqs_.arc
Sun Nov 01 22:26:06 2015
ALTER SYSTEM SET db_recovery_file_dest_size='120G' SCOPE=BOTH;
Sun Nov 01 22:26:06 2015
Errors in file /U02/app/oracle/diag/rdbms/ssol1/sol/trace/sol_m000_24860.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 128849018880 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
************************************************************************
Sun Nov 01 22:26:27 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459698_c2qo28z1_.arc
Sun Nov 01 22:26:31 2015
ALTER SYSTEM SET db_recovery_file_dest_size='140G' SCOPE=BOTH;
Sun Nov 01 22:26:32 2015
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459657_c2lwdgrl_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459658_c2mfz45j_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459659_c2mzkpgg_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459660_c2nnnod5_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459661_c2nnnod6_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459662_c2npf92y_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459663_c2npgdvq_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459664_c2npgzr5_.arc
Sun Nov 01 22:26:44 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459699_c2qo592h_.arc
當然為了不影響接收歸檔還是簡單驗證後又改了回去,先把歸檔接收完再慢慢清理。
所以透過這個可以發現其實很多時候看起來奇怪的現象的原因其實都很簡單常規。但是這些細節是否能夠合理利用起來,如果回過頭來看,最開始dg broker檢查的時候已經間接提示了。
DGMGRL> show database verbose ssol1
Database - ssol1
Role: PHYSICAL STANDBY
Intended State: READ-ONLY
Transport Lag: 12 minutes 26 seconds
Apply Lag: 5 days 16 hours 52 minutes 38 seconds
Real Time Query: OFF
所以對於這些日誌資訊還是需要好好斟酌。剩下的就是等待歸檔都應用完成之後開始檢查清理是否完成了。
首先是在早晨收到了報警郵件,報警郵件內容如下:
ZABBIX-監控系統:
------------------------------------
報警內容: DG_issue
------------------------------------
報警級別: PROBLEM
------------------------------------
監控專案: dg_issue:2015-10-29
00:34:55.0Fetch Archive LogErrorFAL[server, ARC1]: Error 270 creating remote
archivelog file 'ssol1'
------------------------------------
報警時間:2015.10.29-00:36:48
在主庫使用dg broker進行快速驗證。發現這個時候dg broker也驗證失敗了,問題看來還是得趕緊修復了。
DGMGRL> show configuration;
Configuration - sol_dg
Protection Mode: MaxPerformance
Databases:
sol - Primary database
Error: ORA-16778: redo transport error for one or more databases
ssol1 - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
DGMGRL> DGMGRL> exit
好在自當時也是冷靜處理,沒有稀裡糊塗解決,要不很可能釀成一個苦果,後面說。
至於這個錯誤,可以透過verbose的方式看到更多明細的資訊。
可以看到主庫端驗證出現了問題。
DGMGRL> show database verbose sol
Database - sol
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
sol
Error: ORA-16737: the redo transport service for standby database "ssol1" has an error
Properties:
DGConnectIdentifier = 'sol'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'sol'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.2.65.111)(PORT=1535))(CONNECT_DATA=(SERVICE_NAME=sol_DGMGRL)(INSTANCE_NAME=sol)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
ERROR
檢視備庫的verbose資訊,一切正常。
這個時候來看看主庫的資料庫日誌資訊,發現在歸檔的時候出現了問題、
Thu Oct 29 00:11:26 2015
LNS: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
LNS: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Error 270 for archive log file 18 to 'ssol1'
Errors in file /U01/app/oracle/diag/rdbms/sol/sol/trace/sol_nsa2_9780.trc:
ORA-00270: error creating archive log
ARC1: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
ARC1: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
FAL[server, ARC1]: Error 270 creating remote archivelog file 'ssol1'
FAL[server, ARC1]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance sol - Archival Error. Archiver continuing.
Thu Oct 29 00:16:54 2015
ARC1: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (270)
ARC1: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
FAL[server, ARC1]: Error 270 creating remote archivelog file 'ssol1'
FAL[server, ARC1]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance sol - Archival Error. Archiver continuing.
Thu Oct 29 00:16:56 2015
對於這個問題,之前碰到過備庫的空間問題導致的歸檔問題。
這次檢視了一下,發現這次的問題有些不同,單純檢視磁碟空間,沒有發現磁碟空間問題。
[oracle@statg2 yangjr]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.8G 2.3G 5.2G 31% /
/dev/sda6 7.8G 4.4G 3.0G 60% /var
/dev/sda5 7.8G 2.0G 5.5G 27% /usr
/dev/sda1 122M 12M 104M 10% /boot
tmpfs 90G 276M 90G 1% /dev/shm
/dev/shm 90G 276M 90G 1% /tmp
/dev/sda7 158G 85G 65G 57% /U01
/dev/sdd1 1.7T 1.4T 204G 88% /data
/dev/sde1 1.6T 1.3T 219G 86% /other
/dev/sdf1 296G 92G 189G 33% /U01/app/oracle/oradata/sol
備庫檢視 verbose的資訊,一切都是正常的,
DGMGRL> show database verbose ssol1
Database - ssol1
Role: PHYSICAL STANDBY
Intended State: READ-ONLY
Transport Lag: 12 minutes 26 seconds
Apply Lag: 5 days 16 hours 52 minutes 38 seconds
Real Time Query: OFF
Instance(s):
sol
Properties:
DGConnectIdentifier = 'ssol1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = '+ARCH/sol/datafile, /U02/app/oracle/oradata/sol, +DATA/sol/datafile, /U02/app/oracle/oradata/sol, /data/oracle/oradata/sol, /U02/app/oracle/oradata/sol, /other/app/oracle/oradata/sol, /U02/app/oracle/oradata/sol, /U01/app/oracle/oradata/sol, /U02/app/oracle/oradata/sol'
LogFileNameConvert = '/U01/app/oracle/oradata/sol/, /U02/app/oracle/oradata/sol/'
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'sol'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.127.133.23)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=ssol1_DGMGRL)(INSTANCE_NAME=sol)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
SUCCESS
但是檢視備庫的資料庫日誌,發現原來是閃回區滿了。
Creating archive destination file : /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_29/o1_mf_1_459848_%u_.arc (648904 blocks)
Thu Oct 29 00:16:54 2015
Errors in file /U02/app/oracle/diag/rdbms/ssol1/sol/trace/sol_rfs_5518.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 107374182400 bytes is 100.00% used, and has 0 remaining bytes available.
可以明顯看到是歸檔所在的閃回區爆滿導致的,那麼系統層面也設定了crontab,應該也可以清理掉吧。
檢視了crontab使用的備庫歸檔刪除策略是保留了近4天的歸檔
crosscheck archivelog all;
delete expired archivelog all;
DELETE ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE-4';
問題處理要緊,為了保留現場,同是為了後續繼續跟進問題,沒有武斷的把歸檔都手工刪除,就變通解決了問題,調整了閃回區的大小,本來比如100G,我擴大的150G這樣幾天內還不會出現問題,所以簡單擴充套件之後,dg broker校驗就又恢復了正常。
但是這個時候有存在兩個疑問,11g開始閃回區中的備庫歸檔在超過80%的使用率的時候,會自動刪除,為什麼沒有刪除。
crontab在系統級透過呼叫rman來刪除過期的歸檔,為什麼也沒有清理。
對於這兩個問題,自己排除了rman的配置影響,排除了其它引數的干擾,最後檢視dg broker的日誌,發現裡面的內容是大量的字樣:
11/01/2015 21:52:36
Read-Only state no longer supported
順著錯誤開始讓上找,一致找到錯誤開始丟擲,已經過去了快1個星期了。
05/20/2015 09:43:44
version check on database ssol1 detected stale metadata,
requesting update from primary database
Creating process RSM0
10/23/2015 11:11:50
Command EDIT DATABASE ssol1 SET STATE = READ-ONLY completed
10/23/2015 11:11:52
Read-Only state no longer supported
10/23/2015 11:12:52
對於這個問題一看到這兒就恍然大悟了,原來是之前做恢復測試的時候把資料庫置為read-only,但是竟然忘了置為Online狀態。
馬上置為Online狀態。
DGMGRL> edit database ssol1 set state='ONLINE';
Succeeded.
這個時候備庫的日誌就開始啟動了MRP,開始應用日誌了。
Sun Nov 01 22:00:16 2015
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (sol)
Sun Nov 01 22:00:16 2015
MRP0 started with pid=28, OS id=18517
MRP0: Background Managed Standby Recovery process started (sol)
started logmerger process
Sun Nov 01 22:00:21 2015
Managed Standby Recovery starting Real Time Apply
Sun Nov 01 22:00:35 2015
Parallel Media Recovery started with 16 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Sun Nov 01 22:00:36 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459657_c2lwdgrl_.arc
Sun Nov 01 22:00:36 2015
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE
這個時候歸檔已經從主庫傳輸到了備庫,但是沒有清理掉。
比如檢視最近的歸檔檔案情況,可以看到,從23號開始的歸檔都保留了下來。
4.0K ./2015_10_22
3.2G ./2015_10_23
9.9G ./2015_10_24
20G ./2015_10_25
11G ./2015_10_26
17G ./2015_10_27
36G ./2015_10_28
9.7G ./2015_10_29
11G ./2015_10_30
9.8G ./2015_10_31
20G ./2015_11_01
那麼歸檔開始應用了,怎麼看看80%的閥值開始生效了呢,一個招數,把閃回區大小改小。可以看到改小之後,馬上回發現空間開始收緊,這個時候已經部署的歸檔就會置為expired的。然後開始做delete操作。
Sun Nov 01 22:26:03 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459697_c2qnysqs_.arc
Sun Nov 01 22:26:06 2015
ALTER SYSTEM SET db_recovery_file_dest_size='120G' SCOPE=BOTH;
Sun Nov 01 22:26:06 2015
Errors in file /U02/app/oracle/diag/rdbms/ssol1/sol/trace/sol_m000_24860.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 128849018880 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
************************************************************************
Sun Nov 01 22:26:27 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459698_c2qo28z1_.arc
Sun Nov 01 22:26:31 2015
ALTER SYSTEM SET db_recovery_file_dest_size='140G' SCOPE=BOTH;
Sun Nov 01 22:26:32 2015
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459657_c2lwdgrl_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459658_c2mfz45j_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459659_c2mzkpgg_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459660_c2nnnod5_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_23/o1_mf_1_459661_c2nnnod6_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459662_c2npf92y_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459663_c2npgdvq_.arc
Deleted Oracle managed file /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_24/o1_mf_1_459664_c2npgzr5_.arc
Sun Nov 01 22:26:44 2015
Media Recovery Log /U02/app/oracle/flash_recovery_area/SSOL1/archivelog/2015_10_25/o1_mf_1_459699_c2qo592h_.arc
當然為了不影響接收歸檔還是簡單驗證後又改了回去,先把歸檔接收完再慢慢清理。
所以透過這個可以發現其實很多時候看起來奇怪的現象的原因其實都很簡單常規。但是這些細節是否能夠合理利用起來,如果回過頭來看,最開始dg broker檢查的時候已經間接提示了。
DGMGRL> show database verbose ssol1
Database - ssol1
Role: PHYSICAL STANDBY
Intended State: READ-ONLY
Transport Lag: 12 minutes 26 seconds
Apply Lag: 5 days 16 hours 52 minutes 38 seconds
Real Time Query: OFF
所以對於這些日誌資訊還是需要好好斟酌。剩下的就是等待歸檔都應用完成之後開始檢查清理是否完成了。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/23718752/viewspace-1820448/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- Rman Crosscheck刪除失效歸檔-備份恢復ROS
- Rman Crosscheck刪除失效歸檔ROS
- 11g主庫歸檔自動刪除的小問題分析
- Oracle Data Guard 主庫歸檔檔案刪除策略Oracle
- Oracle Data Guard 主庫 歸檔檔案 刪除策略Oracle
- Oracle Data Guard 主庫 歸檔檔案 刪除策略--續Oracle
- rman刪除舊的歸檔日誌問題
- 【轉載】Oracle Data Guard 主庫 歸檔檔案 刪除策略Oracle
- 利用sqlplus指令碼刪除備庫歸檔SQL指令碼
- 備庫自動刪除已應用的歸檔日誌
- 【轉載】Oracle Data Guard 備庫 歸檔檔案 刪除指令碼Oracle指令碼
- dg刪除備庫已經應用的歸檔日誌指令碼指令碼
- DELETE OBSOLETE不刪除歸檔日誌以及歸檔的備份集delete
- 11gR2 RMAN歸檔日誌刪除策略
- ORACLE刪除歸檔Oracle
- 手工rm刪除歸檔日誌對備份歸檔日誌的影響
- 歸檔日誌的刪除
- 誤刪歸檔日誌除導致備份歸檔日誌失敗
- dataguard 由於主庫引數未配置歸檔刪除策略導致庫歸檔丟失ORA-16016
- Mysql跨庫主從熱備失效問題MySql
- ORACLE NBU調取oracle rman指令碼備份歸檔不自動刪除歸檔Oracle指令碼
- 強制刪除歸檔檔案
- 刪除歸檔日誌檔案
- 刪除歸檔,保留最近的5個歸檔日誌
- 定期刪除歸檔檔案的指令碼指令碼
- 無備份恢復(歸檔模式)已刪除資料檔案模式
- 刪除歸檔物理檔案備份失敗!ORA-19625
- oracle刪除歸檔日誌Oracle
- Oracle歸檔日誌刪除Oracle
- oracle rman 刪除過期的歸檔Oracle
- 生產上刪除歸檔的命令
- rman刪除指定時間的歸檔
- 記錄一個由於倉庫層錯誤導致軟刪除失效的問題
- 使用RMAN安全刪除歸檔檔案
- Docker刪除倉庫中的映象問題Docker
- Oralce資料庫關閉歸檔日誌並且刪除歸檔日誌資料庫
- 如何刪除控制檔案中過去rman備份到磁帶的歸檔備份集
- [20221121]rman刪除歸檔日誌問題.txt