5分鐘,徹底精通Oracle DG切換

chenoracle發表於2022-09-11

由於部落格排版問題,詳細資訊見我的微信公眾號"IT小Chen"

沒錯,就是標題黨,5分鐘就想精通Oracle DG切換了嗎?恐怕原理知識都掌握不了,更別說實操部分了,不要相信這些鬼話,太容易得到的知識你也不會去珍惜。


先看幾個思考題?


1.主庫切換為備庫,成功執行如下命令後,原主庫例項是什麼狀態?


alter database commit to switchover to physical standby with session shutdown;


A.shutdown B.nomount C.mount D.read only 


2.備庫切換為主庫,成功執行如下命令後,原備庫例項是什麼狀態?


alter database commit to switchover to primary with session shutdown;


A.shutdown B.nomount C.mount D.oread write 


3.配置了dg broker以後,沒有開啟FSFO,是否還可以正常透過之前的SQL語句進行dg切換呢?


A.可以 B.不可以


4.配置了dg broker並且啟用了FSFO,是否還可以正常透過之前的SQL語句進行dg切換呢?


A.可以 B.不可以


5.配置了dg broker並且啟用了FSFO,主庫執行shutdown immediate後,會自動進行主備切換嗎?


A.會 B.不會


本篇文章主要講解如下內容:


一:透過sqlplus進行switchover
二:透過dg broker進行switchover
三:透過dg broker進行自動failover
四:透過keepalived進行vip自動切換

一:透過sqlplus工具進行switchover


主機資訊


172.16.6.137 cjc-db-01 

172.16.6.138 cjc-db-02

OS:Redhat 7.9

DB:Oracle 11.2.0.4.0 單機

切換前檢查


1.檢查資料庫基本資訊


set lin 200 pages 100
col FLASHBACK_ON for a10
col current_scn for 99999999999999
col open_mode for a10
col SWITCHOVER_STATUS for a20
col PROTECTION_MODE for a20
col name for a10
select name,current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


主庫:


NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

---------- --------------- -------------------- ---------------- --- ---------- ---------- --------------------

CJC    978784 MAXIMUM PERFORMANCEPRIMARY  YES NO READ WRITE TO STANDBY

從庫:


NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

---------- --------------- -------------------- ---------------- --- ---------- ---------- --------------------

CJC    971317 MAXIMUM PERFORMANCEPHYSICAL STANDBY YES NO READ ONLY  NOT ALLOWED

從庫的OPEN_MODE為READ ONLY,切換前需要開啟MRP。


ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;


SWITCHOVER_STATUS說明:


NOT ALLOWED             當前的資料庫不是帶有備用資料庫的主資料庫

PREPARING DICTIONARY    該邏輯備用資料庫正在向一個主資料庫和其他備用資料庫傳送它的重做資料,以便為切換做準備

PREPARING SWITCHOVER    接受用於切換的重做資料時,邏輯備用配置會使用它

RECOVERY NEEDED         備用資料庫還沒有接收到切換請求

SESSIONS ACTIVE         在主資料庫中存在活動的SQL會話;在繼續執行之前必須斷開這些會話

SWITCHOVER PENDING      適用於那些已收到主資料庫切換請求但是還沒有處理該請求的備用資料庫

SWITCHOVER LATENT       切換沒有完成並返回到主資料庫

TO LOGICAL STANDBY      主資料庫已經收到了來自邏輯備用資料庫的完整的字典

TO PRIMARY              該備用資料庫可以轉換為主資料庫

TO STANDBY              該主資料庫可以轉換為備用資料庫

2.檢查DG引數


set linesize 500 pages 100
col value for a70 
col name for a30 
select name, value 
from v$parameter 
where name in ('db_name','db_unique_name', 
'log_archive_config', 
'log_archive_dest_1','log_archive_dest_2', 
'log_archive_dest_state_1', 'log_archive_dest_3', 
'log_archive_dest_state_3', 
'log_archive_dest_state_2', 
'remote_login_passwordfile', 
'log_archive_format', 
'log_archive_max_processes', 
'fal_server','db_file_name_convert', 
'log_file_name_convert', 
'standby_file_management');


主庫:


NAME       VALUE

------------------------------ ----------------------------------------------------------------------

db_file_name_convert       cjc2, cjc1

log_file_name_convert       cjc2, cjc1

log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc1

log_archive_dest_2       SERVICE=cjc2 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_

               NAME=cjc2



log_archive_dest_3

log_archive_dest_state_1       ENABLE

log_archive_dest_state_2       ENABLE

log_archive_dest_state_3       enable

fal_server               cjc2

log_archive_config       DG_CONFIG=(cjc1,cjc2)

log_archive_format       cjc_%t_%s_%r.arc

log_archive_max_processes      4

standby_file_management        AUTO

remote_login_passwordfile      EXCLUSIVE

db_name            cjc

db_unique_name       cjc1


16 rows selected.

從庫:


NAME       VALUE

------------------------------ ----------------------------------------------------------------------

db_file_name_convert       cjc1, cjc2

log_file_name_convert       cjc1, cjc2

log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2

log_archive_dest_2       SERVICE=cjc1 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_

       NAME=cjc1



log_archive_dest_3

log_archive_dest_state_1       ENABLE

log_archive_dest_state_2       ENABLE

log_archive_dest_state_3       enable

fal_server           cjc1

log_archive_config       DG_CONFIG=(cjc2,cjc1)

log_archive_format       cjc_%t_%s_%r.arc

log_archive_max_processes      4

standby_file_management        AUTO

remote_login_passwordfile      EXCLUSIVE

db_name            cjc

db_unique_name       cjc2


16 rows selected.

3.檢查DG程式狀態


col dest_name for a30
col error for a20
set lin 200 pages 100
col applied_scn for 9999999999999
select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;


主庫:


 DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    18    0    10 DISABLE

 2VALID    19       9713171   30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

從庫:


  DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    18    0    10 DISABLE

 2VALID     0    0    1   30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

4.檢查程式狀態


select INST_ID,process,status,thread#,sequence#,block#,blocks from gv$managed_standby order by INST_ID ;


主庫:


   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS

---------- --------- ------------ ---------- ---------- ---------- ----------

 1 ARCH      CLOSING   1     19      20480  831

 1 ARCH      CLOSING   1     17  1   75

 1 LNS     WRITING   1     20        168    1

 1 ARCH      CLOSING   1     18  1   22

 1 ARCH      CLOSING   1     14  1 2404

從庫:


   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS

---------- --------- ------------ ---------- ---------- ---------- ----------

 1 ARCH      CONNECTED   0      0  0    0

 1 ARCH      CONNECTED   0      0  0    0

 1 ARCH      CLOSING   1     18  1   22

 1 ARCH      CLOSING   1     19      20480  831

 1 MRP0      APPLYING_LOG   1     20        152     102400

 1 RFS     IDLE   1     20        152    1

 1 RFS     IDLE   0      0  0    0

 1 RFS     IDLE   0      0  0    0

 1 RFS     IDLE   0      0  0    0


9 rows selected.

5.檢查從庫應用狀態


set lines 200
col dest_name for a30
select DEST_ID,DEST_NAME,RECOVERY_MODE from gv$archive_dest_status where RECOVERY_MODE <>'IDLE';


  DEST_ID DEST_NAME  RECOVERY_MODE

---------- ------------------------------ -----------------------

 1 LOG_ARCHIVE_DEST_1  MANAGED REAL TIME APPLY

6.gap檢查


主庫:


SELECT LOG_ARCHIVED-LOG_APPLIED+1 LOGGAP FROM (SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES' AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#)
FROM V$ARCHIVED_LOG)), (SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V$ARCHIVED_LOG)) ;


    LOGGAP

----------

 1

7.主庫建立測試資料,檢查資料是否正常同步


SQL> col name for a50

SQL> select name from v$dbfile;

NAME

--------------------------------------------------

/oradata/cjc/users01.dbf

/oradata/cjc/undotbs01.dbf

/oradata/cjc/sysaux01.dbf

/oradata/cjc/system01.dbf


SQL> select * from cjc.t1;

ID

----------

 1

8.主切備


sqlplus / as sysdba
SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE;
DBROLE
----------------
PRIMARY
alter system switch logfile;
alter system flush SHARED_POOL;
ALTER SYSTEM SET log_archive_trace=8191 sid='*';


執行命令後,原主庫自動關閉例項


alter database commit to switchover to physical standby with session shutdown;


對應切換日誌如下:


Tue Sep 06 14:15:08 2022
alter database commit to switchover to physical standby with session shutdown
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 21460] (cjc1)
krss_find_arc: Selecting ARC2 to receive message as last resort
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Waiting for all FAL entries to be archived...
All FAL entries have been archived.
Waiting for potential Physical Standby switchover target to become synchronized...
Tue Sep 06 14:15:08 2022
OCISessionBegin with PasswordVerifier succeeded
Client pid [10698] attached to RFS pid [21661] at remote instance number [1] at dest 'cjc2'
Active, synchronized Physical Standby switchover target has been identified
Tue Sep 06 14:15:10 2022
Process (ospid 10671) is suspended due to switchover to physical standby operation.
Switchover End-Of-Redo Log thread 1 sequence 21 has been fixed
Switchover: Primary highest seen SCN set to 0x0.0xf62e8
ARCH: Noswitch archival of thread 1, sequence 21
ARCH: End-Of-Redo Branch archival of thread 1 sequence 21
ARCH: Evaluating archive   log 3 thread 1 sequence 21
ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARCH: Beginning to archive thread 1 sequence 21 (988339-Infinity) (cjc1)
ARCH: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (thread 1 sequence 21) (cjc1)
ARCH: Transmitting activation ID 0xdfebe7b0
OCISessionBegin with PasswordVerifier succeeded
Client pid [21460] attached to RFS pid [21665] at remote instance number [1] at dest 'cjc2'
ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2
ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2
ARCH: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (thread 1 sequence 21) (cjc1)
ARCH: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (cjc1)
ARCH: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (cjc1)
Committing creation of archivelog '/arch/cjc_1_21_1114613364.arc'
Archived Log entry 25 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 1:
Archived Log entry 26 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 2:
ARCH: Completed archiving thread 1 sequence 21 (988339-1008360) (cjc1)
ARCH: Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received alls redo
Final check for a synchronized target standby. Check will be made once.
ARCH: Transmitting activation ID 0xdfebe7b0
LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target
Active, synchronized target has been identified
Target has also received all redo
-----------------------------------------------------------
|          Target Standby Status                           |
| LOG_ARCHIVE_DEST_1  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_2  : HAS RECEIVED ALL DATA          |
| LOG_ARCHIVE_DEST_3  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_4  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_5  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_6  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_7  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_8  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_9  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_10  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_11  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_12  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_13  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_14  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_15  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_16  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_17  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_18  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_19  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_20  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_21  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_22  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_23  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_24  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_25  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_26  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_27  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_28  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_29  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_30  : NOT ACTIVE                     |
| LOG_ARCHIVE_DEST_31  : NOT ACTIVE                     |
------------------------------------------------------------
Backup controlfile written to trace file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_ora_21460.trc
Clearing standby activation ID 3756779440 (0xdfebe7b0)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Archivelog for thread 1 sequence 21 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete - Database shutdown required
USER (ospid: 21460): terminating the instance
Instance terminated by USER, pid = 21460
Completed: alter database commit to switchover to physical standby with session shutdown
Shutting down instance (abort)
License high water mark = 6
Tue Sep 06 14:15:11 2022
Instance shutdown complete


啟動例項


startup mount


9.備切主


su - oracle

sqlplus / as sysdba

show parameter instance_name

SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE;

DBROLE

----------------

PHYSICAL STANDBY

ALTER SYSTEM SET log_archive_trace=8191 sid='*';

執行命令後,原備庫自動啟動到mount


alter database commit to switchover to primary with session shutdown;

切換日誌如下:


Tue Sep 06 14:18:05 2022
alter database commit to switchover to primary with session shutdown
ALTER DATABASE SWITCHOVER TO PRIMARY (cjc2)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Tue Sep 06 14:18:06 2022
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc:
ORA-16037: user requested cancel of managed recovery operation
Tue Sep 06 14:18:07 2022
Errors in file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_mrp0_20358.trc:
ORA-10877: error signaled in parallel recovery slave 
MRP0: Background Media Recovery process shutdown (cjc2)
Role Change: Canceled MRP
All dispatchers and shared servers shutdown
CLOSE: killing server sessions.
CLOSE: all sessions shutdown successfully.
Tue Sep 06 14:18:08 2022
SMON: disabling cache recovery
Backup controlfile written to trace file /oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_ora_20276.trc
SwitchOver after complete recovery through change 1008360
Online log /oradata/cjc/redo01.log: Thread 1 Group 1 was previously cleared
Online log /oradata/cjc/redo02.log: Thread 1 Group 2 was previously cleared
Online log /oradata/cjc/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 1008358
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary with session shutdown
Tue Sep 06 14:18:29 2022
ARC2: Becoming the 'no SRL' ARCH
Tue Sep 06 14:18:48 2022
idle dispatcher 'D000' terminated, pid = (17, 1)


重啟例項


shutdown immediate;

startup;

9.切換後檢查


set lin 200 pages 100
set lin 200 pages 100
col FLASHBACK_ON for a10
col current_scn for 99999999999999
col open_mode for a20
col SWITCHOVER_STATUS for a20
col PROTECTION_MODE for a20
select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1008736 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE RESOLVABLE GAP

ALTER SYSTEM SET log_archive_trace=0 sid='*';

10.新備庫啟動mrp


sqlplus / as sysdba
alter database open;
ALTER SYSTEM SET log_archive_trace=0 sid='*';
###recover managed standby database using current logfile disconnect from session;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;


從庫


  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1009025 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLYNOT ALLOWED

     WITH APPLY

主庫:


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- ---------- --------------------

1008800 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE TO STANDBY

主庫:


SQL> insert into cjc.t1 values(2);

SQL> insert into cjc.t1 values(3);

SQL> commit;

從庫:


SQL> select * from cjc.t1;

ID

----------

 1

 2

 3

回切:


主切備


alter database commit to switchover to physical standby with session shutdown; 

startup mount

備切主


alter database commit to switchover to primary with session shutdown;

shutdown immediate;

startup;

備庫啟動MRP


alter database open;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

圖片


二:透過dg broker進行switchover


手動進行DG switchover,步驟有些麻煩,是否有更簡單的方式呢,可以試試dg broker。


Oracle DataGuard Broker分為Client Side和Server Side。


Client Side可以透過EM和DGMGRL兩種工具對服務端進行管理和維護。


Server side會有一個配置檔案和一個後臺程式叫Data Guard Broker monitor process(DMON)。


DMON:它是一個用來管理Broker的後臺程式,這個程式負責本地資料庫與standby資料庫的DMON程式進行通訊,當主庫上接收到一個請求的時候,它會協調其他資料庫上的DMON程式處理相應的請求,比如switchover。


同時會更新本地系統中的配置檔案,並與standby資料庫上的DMON程式進行通訊,更新Standby上的配置檔案。


1.啟用dg broker


主庫、從庫:


配置DG_BROKER_START引數


show parameter dg_broker_start;

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

dg_broker_start      boolean FALSE

啟用dg_broker_start,啟用後oracle會自動啟動一個dmon程式


alter system set dg_broker_start = true;

對應日誌:


Tue Sep 06 15:04:40 2022

ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH;

Tue Sep 06 15:04:40 2022

DMON started with pid=31, OS id=3206 

Starting Data Guard Broker (DMON)

Tue Sep 06 15:04:48 2022

INSV started with pid=32, OS id=3215 

2.調整監聽檔案


在監聽檔案中加入DGMGRL靜態監聽


主庫


SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

  )

從庫  


SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc2) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc2) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc2_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc2) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.138)(PORT = 1521))

    )

  )

監聽狀態


lsnrctl reload

lsnrctl status

主庫:


Services Summary...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

從庫:


Services Summary...

Service "cjc2" has 2 instance(s).

  Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc2", status READY, has 1 handler(s) for this service...

Service "cjc2_DGMGRL" has 1 instance(s).

  Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc2", status READY, has 1 handler(s) for this service...

The command completed successfully

3.配置broker


[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle

DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.

Connected.

DGMGRL>

顯示配置


DGMGRL> show configuration

ORA-16532: Data Guard broker configuration does not exist


Configuration details cannot be determined by DGMGRL

新增配置


DGMGRL> create configuration 'cjcdgbroker' as primary database is 'cjc1' connect identifier is cjc1;

Configuration "cjcdgbroker" created with primary database "cjc1"

顯示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database


Fast-Start Failover: DISABLED


Configuration Status:

DISABLED

新增備庫配置


DGMGRL> add database 'cjc2' as connect identifier is 'cjc2' maintained as physical;

Database "cjc2" added

顯示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

DISABLED

啟用配置


DGMGRL> enable configuration

Enabled.

顯示配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

4.測試switchover


DGMGRL> switchover to cjc2

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc2" on database "cjc2"

Connecting to instance "cjc2"...

Connected.

New primary database "cjc2" is opening...

Operation requires startup of instance "cjc1" on database "cjc1"

Starting instance "cjc1"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc2"

檢查配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

SQL檢查


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

新備:


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1052426 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLY WITH APPLY NOT ALLOWED

新主


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1052498 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE   SESSIONS ACTIVE

5.回切


DGMGRL> switchover to cjc1;

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc1" on database "cjc1"

Connecting to instance "cjc1"...

Connected.

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

檢查配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

思考:配置了dg broker以後,是否還可以透過之前的SQL語句進行dg切換呢?


此時,透過SQL命令仍然可以進行切換


主切備


alter database commit to switchover to physical standby with session shutdown; 

startup mount

備切主


alter database commit to switchover to primary with session shutdown;

shutdown immediate;

startup;

備庫啟動MRP


alter database open;

###recover managed standby database using current logfile disconnect from session;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

可以使用,既然已經啟動了dg broker,建議透過broker進行切換。


圖片


三:透過dg broker進行自動failover


啟動自動故障轉移FSFO


Fast-Start Failove


FSFO允許代理在主庫故障的情況下自動故障轉移到先前選擇的備庫,無需手動執行任何步驟,以便快速可靠地恢復業務。


FSFO只能在代理配置中使用,並且只能透過DGMGRL或OEM進行配置。


FSFO支援在最高可用性與最高效能模式下使用。


最大可用性模式保證不會丟失任何資料,最高效能模式可保證丟失的資料量不超過FastStartFailoverLagLimit屬性指定的資料量(單位為秒)。


思考:何時會進行FSFO?


配置完成後,FSFO將在以下情況下起作用:


1.主庫與 Observer和目標備庫 失聯時間均超過FastStartFailoverThreshold屬性配置閾值(單位為秒)

2.單例項資料庫中主例項崩潰

3.RAC中所有主例項崩潰

4.shutdown abort主庫

5.應用程式透過呼叫DBMS_DG.INITIATE_FS_FAILOVER函式啟動FSFO

6.Oracle 錯誤:可以指定啟動 FSFO 故障切換的 ORA 錯誤列表(預設為空)

Broker 可配置為遇到以下任一條件時啟動FSFO


1.Datafile Offline    資料檔案由於寫錯誤離線Yes

2.Corrupted Dictionary關鍵資料庫物件的字典損壞,該狀態可在資料庫open時被檢測到Yes

3.Corrupted Controlfile控制檔案由於寫入錯誤永久損壞Yes

4.Inaccessible Logfile由於IO錯誤,LGWR無法寫入日誌組的任何成員No

5.Stuck Archiver        由於裝置已滿或不可用,歸檔程式無法歸檔redo log

設定保護模式與LogXptMode


LOGXPTMODE屬性在最大可用性模式下應為SYNC,在最大效能模式下應為ASYNC,主備庫LOGXPTMODE設定必須相同


###DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;

DGMGRL> edit database cjc1 set property 'LogXptMode'='ASYNC'; 

Property "LogXptMode" updated

DGMGRL> edit database cjc2 set property 'LogXptMode'='ASYNC';

Property "LogXptMode" updated

設定FSFO閾值


如果主庫與 Observer和目標備庫 失聯時間均超過FastStartFailoverThreshold屬性配置閾值(單位為秒),將啟動FSFO。


換句話說,FastStartFailoverThreshold表示,Observer和目標備庫 在檢測到主庫不可用後、希望等多少秒才啟動FSFO。再此期間,它們會嘗試重連主庫。


預設值為30秒,最小6秒。


DGMGRL> edit configuration set property faststartfailoverthreshold=20;

Property "faststartfailoverthreshold" updated

設定最大可接受延遲時間


FastStartFailoverLagLimit表示,自動故障轉移允許的最大資料丟失量(單位為秒),僅在最高效能保護模式時才可使用。


FastStartFailoverLagLimit作為DG可接受的最大延遲時間,備庫的apply lag只有在此限制之內,才允許FSFO。


如果無法維持已配置的資料丟失保證,主庫上的redo生成將停止。


為了避免長時間停頓,Observer或者目標備庫可能會在第一次記錄到無法發生FSFO之後,允許主庫繼續生成redo。


此時若主庫故障,將無法發生FSFO。


預設值為30秒,最小值為10秒。


DGMGRL> edit configuration set property faststartfailoverlaglimit=60;

設定自動恢復資料庫屬性


如果FastStartFailoverAutoReinstate屬性設定為TRUE,在原主庫故障修復後,會自動嘗試將其恢復為新主庫的備庫。


DGMGRL> EDIT CONFIGURATION SET PROPERTY FASTSTARTFAILOVERAUTOREINSTATE=TRUE;

Property "faststartfailoverautoreinstate" updated

啟動Observer


建立指令碼


[oracle@cjc-db-02 dg]$ pwd

/home/oracle/dg

[oracle@cjc-db-02 dg]$ mkdir observer

[oracle@cjc-db-02 dg]$ cd observer/

vi observer.sh

nohup dgmgrl sys/oracle@cjc2 "start observer file=FSFO.dat">>fsfo.log 2>&1 &

執行指令碼


[oracle@cjc-db-02 observer]$ sh observer.sh

啟動FSFO,報錯ORA-16651


DGMGRL> enable fast_start failover;

Error: ORA-16651: requirements not met for enabling fast-start failover

檢查報錯資訊,發現主從庫必須開啟閃回


[oracle@cjc-db-02 observer]$ oerr ora 16651
16651, 0000, "requirements not met for enabling fast-start failover"
// *Cause:  The attempt to enable fast-start failover could not be completed
//          because one or more requirements were not met:
//          - The Data Guard configuration must be in either MaxAvailability
//            or MaxPerformance protection mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to SYNC if the configuration protection mode is set to
//            MaxAvailability mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to ASYNC if the configuration protection mode is set to
//            MaxPerformance mode.
//          - The primary database and the fast-start failover target standby
//            database must both have flashback enabled.
//          - No valid target standby database was specified in the primary
//            database FastStartFailoverTarget property prior to the attempt
//            to enable fast-start failover, and more than one standby
//            database exists in the Data Guard configuration.
// *Action: Retry the command after correcting the issue:
//          - Set the Data Guard configuration to either MaxAvailability
//            or MaxPerformance protection mode.
//          - Ensure that the LogXptMode property for both the primary
//            database and the fast-start failover target standby database
//            are set to SYNC if the configuration protection mode is set to
//            MaxAvailability.
//          - Ensure that the LogXptMode property for both the primary
//            database and the fast-start failover target standby database
//            are set to ASYNC if the configuration protection mode is set to
//            MaxPerformance.
//          - Ensure that both the primary database and the fast-start failover
//            target standby database have flashback enabled.
//          - Set the primary database FastStartFailoverTarget property to
//            the DB_UNIQUE_NAME value of the desired target standby database
//            and the desired target standby database FastStartFailoverTarget
//            property to the DB_UNIQUE_NAME value of the primary database.


啟動閃回


alter database flashback on;

再次開啟FSFO


DGMGRL> enable fast_start failover;

Enabled.

對應日誌:


Tue Sep 06 16:31:43 2022

Fast-Start Failover (FSFO) has been enabled between:

  Primary = "cjc1"

  Standby = "cjc2"

Tue Sep 06 16:31:43 2022

FSFP started with pid=38, OS id=8869 

OCISessionBegin with PasswordVerifier succeeded

檢視配置,Fast-Start Failover狀態變成ENABLED了。


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

測試自動切換


主庫


SQL> shutdown abort    

ORACLE instance shut down.

檢視fsfo日誌


[oracle@cjc-db-02 observer]$ tail -10f fsfo.log

16:38:41.35  Tuesday, September 06, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

16:38:48.09  Tuesday, September 06, 2022

完成了自動切換


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

已完成自動切換


  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1159699 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

啟動原主庫


SQL> startup

ORACLE instance started.


Total System Global Area 1135747072 bytes

Fixed Size    2252544 bytes

Variable Size  754974976 bytes

Database Buffers  369098752 bytes

Redo Buffers    9420800 bytes

Database mounted.

ORA-16649: possible failover to another database prevents this database from being opened

檢查錯誤


[oracle@cjc-db-01 ~]$ oerr ora 16449

16449, 00000, "incomplete redo thread enable operation"

// *Cause:  The switchover operation could not continue because it failed to

//          disable a thread that was left in an incomplete thread enable

//          state.

// *Action: Check alert log for more details.

啟動資料庫時報錯ORA-16649,不需要處理,後臺會自動執行閃回資料庫,自動open例項。


SQL> set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1160947 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

檢視配置


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

原主庫自動變為備庫


告警日誌


FLASHBACK DATABASE TO SCN 1160354
Flashback Restore Start
Flashback Restore Complete
Flashback Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 2 slaves
Flashback Media Recovery Log /arch/cjc_1_2_1114706326.arc
Flashback Media Recovery Log /arch/cjc_1_3_1114706326.arc
Tue Sep 06 16:53:25 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
  Mem# 0: /oradata/cjc/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0: /oradata/cjc/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0
  Mem# 0: /oradata/cjc/redo03.log
Incomplete Recovery applied until change 1160355 time 09/06/2022 16:50:59
Flashback Media Recovery Complete
Completed: FLASHBACK DATABASE TO SCN 1160354
alter database convert to physical standby
ALTER DATABASE CONVERT TO PHYSICAL STANDBY (cjc2)
Flush standby redo logfile failed:1649
Clearing standby activation ID 3756857555 (0xdfed18d3)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Clearing online redo logfile 1 /oradata/cjc/redo01.log
Clearing online log 1 of thread 1 sequence number 4
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /oradata/cjc/redo02.log
Clearing online log 2 of thread 1 sequence number 5
Tue Sep 06 16:53:28 2022
RFS[2]: Assigned to RFS process 10506
RFS[2]: Opened log for thread 1 sequence 6 dbid -538205520 branch 1114706326
Clearing online redo logfile 2 complete
Archived Log entry 87 added for thread 1 sequence 6 rlc 1114706326 ID 0xdfed18d3 dest 2:
Clearing online redo logfile 3 /oradata/cjc/redo03.log
Clearing online log 3 of thread 1 sequence number 6
Tue Sep 06 16:53:31 2022
RFS[3]: Assigned to RFS process 10503
RFS[3]: Opened log for thread 1 sequence 1 dbid -538205520 branch 1114707085
RFS[2]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085
Tue Sep 06 16:53:31 2022
Clearing online redo logfile 3 complete
A new recovery destination branch has been registered
RFS[3]: New Archival REDO Branch(resetlogs_id): 1114707085  Prior: 1114706326
RFS[3]: Archival Activation ID: 0xdfeddf20 Current: 0x0
RFS[3]: Effect of primary database OPEN RESETLOGS
RFS[3]: Incarnation entry added for Branch(resetlogs_id): 1114707085 (cjc2)
Tue Sep 06 16:53:31 2022
Setting recovery target incarnation to 4
Completed: alter database convert to physical standby
Archived Log entry 88 added for thread 1 sequence 1 rlc 1114707085 ID 0xdfeddf20 dest 2:
Killing 2 processes with pids 10503,10506 (all RFS) in order to disallow current and future RFS connections. Requested by OS process 10491
Archived Log entry 89 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2:
Tue Sep 06 16:53:33 2022
RFS[4]: Assigned to RFS process 10515
RFS[4]: Opened log for thread 1 sequence 3 dbid -538205520 branch 1114707085
Tue Sep 06 16:53:33 2022
Shutting down instance (immediate)
Shutting down instance: further logons disabled
Stopping background process MMNL
Archived Log entry 89 added for thread 1 sequence 3 rlc 1114707085 ID 0xdfeddf20 dest 2:
Stopping background process MMON
License high water mark = 5
All dispatchers and shared servers shutdown
alter database CLOSE NORMAL
ORA-1109 signalled during: alter database CLOSE NORMAL...
alter database DISMOUNT
Shutting down archive processes
Archiving is disabled
Completed: alter database DISMOUNT
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Shutting down Data Guard Broker processes
Tue Sep 06 16:53:39 2022
Completed: Data Guard Broker shutdown
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Tue Sep 06 16:53:41 2022
Stopping background process VKTM
Archiving is disabled
Tue Sep 06 16:53:43 2022
Instance shutdown complete
Tue Sep 06 16:53:44 2022
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 2
Number of processor cores in the system is 2
Number of processor sockets in the system is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
    NUMA status: non-NUMA system
    cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
    Grp 0: 
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
ORACLE_HOME = /oracle/app/oracle/product/11.2/db
System name:Linux
Node name:cjc-db-02
Release:5.4.17-2102.201.3.el7uek.x86_64
Version:#2 SMP Fri Apr 23 09:05:55 PDT 2021
Machine:x86_64
Using parameter settings in server-side spfile /oracle/app/oracle/product/11.2/db/dbs/spfilecjc2.ora
System parameters with non-default values:
  processes                = 150
  event                    = "28401 trace name context forever,level 1"
  event                    = "10949 trace name context forever,level 1"
  memory_target            = 1088M
  control_files            = "/oradata/cjc/control01.ctl"
  control_files            = "/oracle/app/oracle/fast_recovery_area/cjc/control02.ctl"
  db_file_name_convert     = "cjc1"
  db_file_name_convert     = "cjc2"
  log_file_name_convert    = "cjc1"
  log_file_name_convert    = "cjc2"
  control_file_record_keep_time= 31
  db_block_size            = 8192
  compatible               = "11.2.0.4.0"
  log_archive_dest_1       = "LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2"
  log_archive_dest_2       = "service="cjc1""
  log_archive_dest_2       = "LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="cjc1" net_timeout=30"
  log_archive_dest_2       = "valid_for=(all_logfiles,primary_role)"
  log_archive_dest_state_1 = "ENABLE"
  log_archive_dest_state_2 = "ENABLE"
  log_archive_min_succeed_dest= 1
  fal_server               = "cjc1"
  log_archive_trace        = 0
  log_archive_config       = "DG_CONFIG=(cjc2,cjc1)"
  log_archive_format       = "cjc_%t_%s_%r.arc"
  log_archive_max_processes= 4
  archive_lag_target       = 0
  _use_adaptive_log_file_sync= "FALSE"
  db_files                 = 2048
  db_recovery_file_dest    = "/home/oracle/dg/recover"
  db_recovery_file_dest_size= 10G
  standby_file_management  = "AUTO"
  _cleanup_rollback_entries= 10000
  undo_tablespace          = "UNDOTBS1"
  _partition_large_extents = "FALSE"
  remote_login_passwordfile= "EXCLUSIVE"
  audit_sys_operations     = FALSE
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=cjcXDB)"
  parallel_execution_message_size= 32768
  _PX_use_large_pool       = TRUE
  result_cache_max_size    = 0
  audit_file_dest          = "/oracle/app/oracle/admin/cjc/adump"
  audit_trail              = "NONE"
  cell_offload_processing  = FALSE
  db_name                  = "cjc"
  db_unique_name           = "cjc2"
  open_cursors             = 300
  _optimizer_null_aware_antijoin= FALSE
  _b_tree_bitmap_plans     = FALSE
  _optimizer_extended_cursor_sharing= "NONE"
  _optimizer_extended_cursor_sharing_rel= "NONE"
  _optimizer_adaptive_cursor_sharing= FALSE
  deferred_segment_creation= FALSE
  _optimizer_use_feedback  = FALSE
  dg_broker_start          = TRUE
  diagnostic_dest          = "/oracle/app/oracle"
  max_dump_file_size       = "1024M"
Tue Sep 06 16:53:45 2022
PMON started with pid=2, OS id=10533 
Tue Sep 06 16:53:45 2022
PSP0 started with pid=3, OS id=10535 
Tue Sep 06 16:53:46 2022
VKTM started with pid=4, OS id=10538 at elevated priority
VKTM running at (1)millisec precision with DBRM quantum (100)ms
Tue Sep 06 16:53:46 2022
GEN0 started with pid=5, OS id=10543 
Tue Sep 06 16:53:46 2022
DIAG started with pid=6, OS id=10545 
Tue Sep 06 16:53:46 2022
DBRM started with pid=7, OS id=10547 
Tue Sep 06 16:53:46 2022
DIA0 started with pid=8, OS id=10549 
Tue Sep 06 16:53:46 2022
MMAN started with pid=9, OS id=10551 
Tue Sep 06 16:53:46 2022
DBW0 started with pid=10, OS id=10553 
Tue Sep 06 16:53:46 2022
LGWR started with pid=11, OS id=10555 
Tue Sep 06 16:53:46 2022
CKPT started with pid=12, OS id=10557 
Tue Sep 06 16:53:46 2022
SMON started with pid=13, OS id=10559 
Tue Sep 06 16:53:46 2022
RECO started with pid=14, OS id=10561 
Tue Sep 06 16:53:46 2022
MMON started with pid=15, OS id=10563 
Tue Sep 06 16:53:46 2022
MMNL started with pid=16, OS id=10565 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
Tue Sep 06 16:53:46 2022
DMON started with pid=19, OS id=10571 
ORACLE_BASE from environment = /oracle/app/oracle
Tue Sep 06 16:53:46 2022
alter database  mount
ARCH: STARTING ARCH PROCESSES
Tue Sep 06 16:53:51 2022
ARC0 started with pid=21, OS id=10582 
ARC0: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Tue Sep 06 16:53:52 2022
ARC1 started with pid=22, OS id=10585 
Tue Sep 06 16:53:52 2022
ARC2 started with pid=23, OS id=10587 
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC2: Thread not mounted
Tue Sep 06 16:53:52 2022
ARC3 started with pid=24, OS id=10589 
ARC1: Becoming the heartbeat ARCH
ARC1: Thread not mounted
Successful mount of redo thread 1, with mount id 3756855130
Allocated 8388608 bytes in shared pool for flashback generation buffer
Starting background process RVWR
Tue Sep 06 16:53:52 2022
RVWR started with pid=25, OS id=10591 
Physical Standby Database mounted.
Lost write protection disabled
ARC1: Becoming the active heartbeat ARCH
Completed: alter database  mount
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Starting Data Guard Broker (DMON)
Tue Sep 06 16:53:55 2022
INSV started with pid=26, OS id=10597 
Tue Sep 06 16:54:01 2022
NSV0 started with pid=27, OS id=10607 
Tue Sep 06 16:54:04 2022
Using STANDBY_ARCHIVE_DEST parameter default value as /arch
Tue Sep 06 16:54:04 2022
RSM0 started with pid=29, OS id=10615 
Tue Sep 06 16:54:04 2022
RFS[1]: Assigned to RFS process 10617
RFS[1]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085
Archived Log entry 90 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2:
RFS[1]: Opened log for thread 1 sequence 4 dbid -538205520 branch 1114707085
Archived Log entry 91 added for thread 1 sequence 4 rlc 1114707085 ID 0xdfeddf20 dest 2:
RFS[1]: Selected log 4 for thread 1 sequence 5 dbid -538205520 branch 1114707085
Tue Sep 06 16:54:06 2022
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process 10621
RFS[2]: Selected log 5 for thread 1 sequence 6 dbid -538205520 branch 1114707085
Tue Sep 06 16:54:06 2022
Archived Log entry 92 added for thread 1 sequence 5 ID 0xdfeddf20 dest 1:
Data Guard: Failover target was a Real Time Query standby; attempting to open this standby after reinstatement ...
ALTER DATABASE OPEN READ ONLY
Data Guard Broker initializing...
Data Guard Broker initialization complete
Tue Sep 06 16:54:07 2022
SMON: enabling cache recovery
Dictionary check beginning
Dictionary check complete
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Physical standby database opened for read only access.
Completed: ALTER DATABASE OPEN READ ONLY
Tue Sep 06 16:54:08 2022
db_recovery_file_dest_size of 10240 MB is 0.49% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='cjc2';
ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc2';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='cjc1','cjc2' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='cjc1','cjc2' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='cjc1' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (cjc2)
Tue Sep 06 16:54:09 2022
MRP0 started with pid=33, OS id=10629 
MRP0: Background Managed Standby Recovery process started (cjc2)
 started logmerger process
Tue Sep 06 16:54:15 2022
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 1160356
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /arch/cjc_1_6_1114706326.arc
Identified End-Of-Redo (failover) for thread 1 sequence 6 at SCN 0x0.11b4a4
Tue Sep 06 16:54:15 2022
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Resetting standby activation ID 3756857555 (0xdfed18d3)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Media Recovery Log /arch/cjc_1_1_1114707085.arc
Archived Log entry 93 added for thread 1 sequence 6 ID 0xdfeddf20 dest 1:
Tue Sep 06 16:54:16 2022
Primary database is in MAXIMUM PERFORMANCE mode
Media Recovery Log /arch/cjc_1_2_1114707085.arc
RFS[3]: Assigned to RFS process 10651
RFS[3]: Selected log 4 for thread 1 sequence 7 dbid -538205520 branch 1114707085
Media Recovery Log /arch/cjc_1_3_1114707085.arc
Media Recovery Log /arch/cjc_1_4_1114707085.arc
Media Recovery Log /arch/cjc_1_5_1114707085.arc
Media Recovery Log /arch/cjc_1_6_1114707085.arc
Media Recovery Waiting for thread 1 sequence 7 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 7 Reading mem 0
  Mem# 0: /oradata/cjc/standby_redo04.log


思考:


主庫shutdown immediate會自動切換嗎?


主庫:


SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

沒有自動切換


過10min後啟動資料庫


startup


檢查當前配置資訊


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

SQL>

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1164197 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   TO STANDBY

思考:


啟動FSFO後還可以透過之前的SQL命令進行主備切換嗎?


主切備


alter database commit to switchover to physical standby with session shutdown; 

startup mount

備切主,報錯ORA-16109


alter database commit to switchover to primary with session shutdown;

ORA-16109: failed to apply log data from previous primary

[oracle@cjc-db-02 ~]$ oerr ora 16109

16109, 00000, "failed to apply log data from previous primary"

// *Cause:  Log data from previous primary could not be completely applied.

// *Action: Check DBA_LOGSTDBY_EVENTS for failures and take corrective

//          action.  Then, reissue command

檢查fsfo日誌


[oracle@cjc-db-02 observer]$ tail -10f fsfo.log

17:41:12.95  Tuesday, September 06, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

17:41:22.02  Tuesday, September 06, 2022

備庫自動切成主庫了


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;


 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1185108 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

備庫需要手動啟動MRP


alter database open;

###recover managed standby database using current logfile disconnect from session;

###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

檢查當前配置,顯示需要重建備庫


[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle

DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production


Copyright (c) 2000, 2009, Oracle. All rights reserved.


Welcome to DGMGRL, type "help" for information.

Connected.

DGMGRL> show configuration;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL

嘗試回切


DGMGRL> switchover to cjc1;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL



DGMGRL> show configuration;


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

備庫提示


DGMGRL> show configuration;

ORA-16795: the standby database needs to be re-created


Configuration details cannot be determined by DGMGRL

檢查cjc1

DGMGRL> show database cjc1;


Database - cjc1


  Role:            PHYSICAL STANDBY

  Intended State:  APPLY-ON

  Transport Lag:   (unknown)

  Apply Lag:       (unknown)

  Apply Rate:      (unknown)

  Real Time Query: OFF

  Instance(s):

    cjc1


Database Status:

ORA-16661: the standby database needs to be reinstated


檢查cjc2

DGMGRL> show database cjc2;


Database - cjc2


  Role:            PRIMARY

  Intended State:  TRANSPORT-ON

  Instance(s):

    cjc2


  Database Warning(s):

    ORA-16829: fast-start failover configuration is lagging


Database Status:

WARNING

檢視引數


備庫


DGMGRL> show database verbose cjc1;
Database - cjc1
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Apply Rate:      (unknown)
  Real Time Query: OFF
  Instance(s):
    cjc1
  Properties:
    DGConnectIdentifier             = 'cjc1'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'async'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'cjc2, cjc1'
    LogFileNameConvert              = 'cjc2, cjc1'
    FastStartFailoverTarget         = 'cjc2'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    SidName                         = 'cjc1'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc1_DGMGRL)(INSTANCE_NAME=cjc1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = '/arch'
    AlternateLocation               = ''
    LogArchiveTrace                 = '8191'
    LogArchiveFormat                = 'cjc_%t_%s_%r.arc'
    TopWaitEvents                   = '(monitor)'
Database Status:
ORA-16661: the standby database needs to be reinstated


主庫


DGMGRL> show database verbose cjc2;
Database - cjc2
  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    cjc2
  Database Warning(s):
    ORA-16829: fast-start failover configuration is lagging
  Properties:
    DGConnectIdentifier             = 'cjc2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'async'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'cjc1, cjc2'
    LogFileNameConvert              = 'cjc1, cjc2'
    FastStartFailoverTarget         = 'cjc1'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    SidName                         = 'cjc2'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc2_DGMGRL)(INSTANCE_NAME=cjc2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = '/arch'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = 'cjc_%t_%s_%r.arc'
    TopWaitEvents                   = '(monitor)'
Database Status:
WARNING
主從不同步了,歸檔收不到



 主庫

 DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID    12    0    10 DISABLE

 2DEFERRED     0    0    1       30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

 

從庫

SQL> col dest_name for a30

col error for a20

set lin 200 pages 100

col applied_scn for 9999999999999

select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;SQL> SQL> SQL> SQL> 


   DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES

---------- -------------------- --------- ------------ -------------- --------------- ----------- -------

 1VALID     0    0    10 DISABLE

 2VALID     0    0    1       30 DISABLE

 3INACTIVE     0    0    10 DISABLE

 4INACTIVE     0    0    10 DISABLE

主庫 


SQL> show parameter log_archive_dest_state_2


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

log_archive_dest_state_2     string RESET

重新啟用


SQL> alter system set log_archive_dest_state_2=defer;

System altered.


SQL> alter system set log_archive_dest_state_2=enable;

System altered.

歸檔可以正常接收,主庫新增資料可以正常同步到從庫。


但是dgmgr顯示還是不對


ORA-16829: fast-start failover configuration is lagging


ORA-16661: the standby database needs to be reinstated


重建備庫


DGMGRL> reinstate database cjc1;  

Reinstating database "cjc1", please wait...

Reinstatement of database "cjc1" succeeded

檢視備庫對應告警日誌


Wed Sep 07 14:04:45 2022
NSV1 started with pid=38, OS id=3216 
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:04:49 2022
RSM0 started with pid=39, OS id=3223 
Killing 4 processes with pids 2641,2710,2652,2654 (all RFS) in order to reinstate the database after a failover. Requested by OS process 3223
Data Guard: Stopping apply to check viability of standby
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Wed Sep 07 14:04:54 2022
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 1186859
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc:
ORA-16037: user requested cancel of managed recovery operation
Wed Sep 07 14:04:54 2022
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_mrp0_2719.trc:
ORA-10877: error signaled in parallel recovery slave 
MRP0: Background Media Recovery process shutdown (cjc1)
Managed Standby Recovery Canceled (cjc1)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
krsk_ler_purge_scn: Purged kccle 4 (next SCN 65535:-1)
ALTER SYSTEM SET log_archive_trace=8191 SCOPE=BOTH SID='cjc1';
ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc1';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='cjc2','cjc1' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='cjc2','cjc1' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='cjc2' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (cjc1)
Wed Sep 07 14:04:56 2022
MRP0 started with pid=22, OS id=3233 
MRP0: Background Managed Standby Recovery process started (cjc1)
 started logmerger process
Wed Sep 07 14:05:02 2022
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
krss_find_arc: Selecting ARC2 to receive message as last resort
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 17
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:02 2022
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Wed Sep 07 14:05:03 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[6]: Assigned to RFS process 3248
RFS[6]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 2512
Primary database is in MAXIMUM PERFORMANCE mode
RFS[6]: Begin archive primary thread 1 sequence 18 (cjc1)
Primary database is in MAXIMUM PERFORMANCE mode
RFS[6]: Successfully opened standby log 4: '/oradata/cjc/standby_redo04.log'
RFS[6]: Selected log 4 for thread 1 sequence 18 dbid -538205520 branch 1114707085
Wed Sep 07 14:05:12 2022
Fetching gap sequence in thread 1, gap sequence 17-17
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:23 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:33 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:43 2022
FAL[client]: Trying FAL server: cjc2
OCISessionBegin with PasswordVerifier succeeded
Wed Sep 07 14:05:44 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[7]: Assigned to RFS process 3288
RFS[7]: Identified database type as 'physical standby': Client is ARCH pid 2506
Wed Sep 07 14:05:44 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[8]: Assigned to RFS process 3290
RFS[8]: Identified database type as 'physical standby': Client is ARCH pid 2503
RFS[8]: Begin archive primary thread 1 sequence 17 (cjc1)
RFS[8]: Successfully opened standby log 5: '/oradata/cjc/standby_redo05.log'
RFS[8]: Selected log 5 for thread 1 sequence 17 dbid -538205520 branch 1114707085
RFS[8]: Completed archive log 0 thread 1 sequence 17 (cjc1)
Wed Sep 07 14:05:44 2022
ARC3: Evaluating archive   log 5 thread 1 sequence 17
ARC3: Beginning to archive thread 1 sequence 17 (1186140-1186874) (cjc1)
ARC3: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (thread 1 sequence 17) (cjc1)
ARC3: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (cjc1)
Committing creation of archivelog '/arch/cjc_1_17_1114707085.arc'
Archived Log entry 127 added for thread 1 sequence 17 ID 0xdfed0f5a dest 1:
ARC3: Completed archiving thread 1 sequence 17 (0-0) (cjc1)
Wed Sep 07 14:05:45 2022
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[9]: Assigned to RFS process 3294
RFS[9]: Identified database type as 'physical standby': Client is ARCH pid 2510
Media Recovery Log /arch/cjc_1_17_1114707085.arc
RFS[9]: Begin archive primary thread 1 sequence 17 (cjc1)
Errors in file /oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_rfs_3294.trc:
ORA-16401: archive log rejected by Remote File Server (RFS)
Media Recovery Waiting for thread 1 sequence 18 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 18 Reading mem 0
  Mem# 0: /oradata/cjc/standby_redo04.log


再次檢查,恢復正常


DGMGRL> show configuration;


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS


回切

DGMGRL> switchover to cjc1;

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1

總結:


在使用了DGMGRL來管理Dataguard時,不要再使用SQLPLUS命令列來管理了,所有的操作做好在DGMGRL下進行修改配置,否則會造成引數衝突,狀態異常。


圖片


四:透過keepalived進行vip自動切換


既然DG可以自動切換了,如何保障主庫出現異常時,應用中斷時間最短呢,或者儘量避免人為干預?


這需要VIP或一個域名,應用連線資料庫連線的是VIP或域名,當主庫異常時,透過 FSFO 進行自動主備切換,VIP地址透過keepalived自動切換,切換完成後,應用仍然可以透過VIP進行訪問資料庫。


1.安裝keepalived


[root@cjc-db-02 ~]# cd /soft/

[root@cjc-db-02 soft]# tar -zxvf keepalived-2.0.15.tar.gz 

[root@cjc-db-02 soft]# cd keepalived-2.0.15/

[root@cjc-db-02 keepalived-2.0.15]# ./configure  --prefix=/usr/local/keepalived

報錯


configure: error: 

  !!! OpenSSL is not properly installed on your system. !!!

  !!! Can not include OpenSSL headers files.            !!!

解決


[root@cjc-db-01 keepalived-2.0.15]# yum install openssl openssl-devel

編譯


make && make install

echo $?

將命令複製到/usr/sbin裡:


cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

配置keepalived,VIP地址為172.16.6.150


節點1  172.16.6.137


vi /etc/keepalived/keepalived.conf

! Configuration File for keepalived



vrrp_script chk_dg_stats {  

    script "/etc/keepalived/check_dataguard.sh" 

    interval 2

    weight -5

    fall 2  

    rise 1  

}

       

vrrp_instance VI_1 {

    state MASTER    

    interface enp0s3 

    mcast_src_ip 172.16.6.137

    virtual_router_id 100 

    priority 100

    inopreempt

    advert_int 1         

    authentication {   

        auth_type PASS 

        auth_pass 888888   

    }

    virtual_ipaddress {    

        172.16.6.150

    }

      

    track_script {               

       chk_dg_stats             

    }

}

節點2 172.16.6.138


[root@test05 ~]# cat /etc/keepalived/keepalived.conf

! Configuration File for keepalived



vrrp_script chk_dg_stats {  

    script "/etc/keepalived/check_dataguard.sh" 

    interval 2

    weight -5

    fall 2  

    rise 1  

}

       

vrrp_instance VI_1 {

    state BACKUP

    interface enp0s3 

    mcast_src_ip 172.16.6.138

    virtual_router_id 100 

    priority 100 

    inopreempt

    advert_int 1         

    authentication {   

        auth_type PASS 

        auth_pass 888888   

    }

    virtual_ipaddress {    

        172.16.6.150

    }

      

    track_script {               

       chk_dg_stats             

    }

}

檢查指令碼


vi /etc/keepalived/check_dataguard.sh 

#!/bin/bash

dbstats=`ps -ef | grep ora_smon | grep -v grep | wc -l`

dgstats=`ps -ef | grep ora_mrp | grep -v grep | wc -l`


if [ "${dbstats}" -eq 0  ]; then

systemctl stop keepalived.service

#elif [[ "${dbstats}" -gt 0 ]] && [[ "${dgstats}" -gt 0 ]]; then

#systemctl stop keepalived.service

fi 

新增執行許可權


chmod a+x /etc/keepalived/check_dataguard.sh

啟動keepalived


[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

[root@cjc-db-01 keepalived]# systemctl status keepalived.service 

?.keepalived.service - LVS and VRRP High Availability Monitor

   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)

   Active: active (running) since Wed 2022-09-07 15:28:01 CST; 6s ago

  Process: 16728 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)

 Main PID: 16730 (keepalived)

    Tasks: 2

   CGroup: /system.slice/keepalived.service

           ?..16730 /usr/local/keepalived/sbin/keepalived -D

           ?..16731 /usr/local/keepalived/sbin/keepalived -D


Sep 07 15:28:01 cjc-db-01 Keepalived_vrrp[16731]: VRRP_Script(chk_dg_stats) succeeded

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Receive advertisement timeout

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Entering MASTER STATE

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) setting VIPs.

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Sending/queueing gratuitous ARPs on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

檢查vip


ip a

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

測試VIP漂移


[root@cjc-db-01 keepalived]# systemctl stop keepalived.service 

[root@cjc-db-02 keepalived]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host 

       valid_lft forever preferred_lft forever

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

禁用自動切換


DGMGRL> disable fast_start failover;

Disabled.

DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

配置監聽


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc1


SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 15:33:18 2022


Copyright (c) 1982, 2013, Oracle.  All rights reserved.


ERROR:

ORA-12541: TNS:no listener

[oracle@cjc-db-01 admin]$ cat listener.ora

SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))

    )

  )

配置服務


SQL> show parameter name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

cell_offloadgroup_name     string

db_file_name_convert     string cjc2, cjc1

db_name      string cjc

db_unique_name     string cjc1

global_names     boolean FALSE

instance_name     string cjc1

lock_name_space      string

log_file_name_convert     string cjc2, cjc1

processor_group_name     string

service_names     string cjc1


SQL> alter system set service_names='cjc1,cjc' scope=both;


System altered.


SQL> show parameter service_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

service_names     string cjc1,cjc


從庫

alter system set service_names='cjc2,cjc' scope=both;

重啟監聽


lsnrctl reload

lsnrctl stop

lsnrctl start

檢視監聽狀態


[oracle@cjc-db-01 admin]$ lsnrctl status


LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 15:45:15


Copyright (c) 1991, 2013, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production

Start Date                07-SEP-2022 15:44:08

Uptime                    0 days 0 hr. 1 min. 7 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora

Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521)))

Services Summary...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

連線測試


SQL> conn system/oracle@172.16.6.137:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc1

Connected.

監聽檔案


[oracle@cjc-db-01 admin]$ cat listener.ora

SID_LIST_LISTENER = 

  (SID_LIST = 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc1_DGMGRL) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc1) 

    ) 

    (SID_DESC = 

      (GLOBAL_DBNAME = cjc) 

      (ORACLE_HOME = /oracle/app/oracle/product/11.2/db) 

      (SID_NAME = cjc) 

    ) 

  )

  

LISTENER =

  (DESCRIPTION_LIST =

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))

    )

    (DESCRIPTION =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))

    )

  )

  

  

測試

SQL> conn system/oracle@172.16.6.137:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.137:1521/cjc

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc1

Connected.

SQL> conn system/oracle@172.16.6.150:1521/cjc

Connected.

SQL> conn system/oracle@172.16.6.138:1521/cjc2

Connected.

SQL> conn system/oracle@172.16.6.138:1521/cjc

Connected.


監聽狀態

SQL> ho lsnrctl status


LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 16:05:22


Copyright (c) 1991, 2013, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production

Start Date                07-SEP-2022 15:44:08

Uptime                    0 days 0 hr. 21 min. 13 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora

Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521)))

Services Summary...

Service "cjc" has 2 instance(s).

  Instance "cjc", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1" has 2 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

Service "cjc1_DGMGRL" has 1 instance(s).

  Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...

Service "cjcXDB" has 1 instance(s).

  Instance "cjc1", status READY, has 1 handler(s) for this service...

The command completed successfully

測試VIP切換


停主庫


SQL> shutdown immediate

檢視VIP已經飄到從庫


ip a

: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

連線測試


SQL> conn system/oracle@172.16.6.150:1521/cjc

Connected.


sqlplus system/oracle@172.16.6.150:1521/cjc

set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1219061 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY NOT ALLOWED

只是VIP飄了,資料庫角色沒有變


需要執行手動切換 或 配置自動切換後才能進行使用


透過dg broker進行手動切換


DGMGRL> switchover to cjc2;

Performing switchover NOW, please wait...

Operation requires a connection to instance "cjc2" on database "cjc2"

Connecting to instance "cjc2"...

Connected.

New primary database "cjc2" is opening...

Operation requires startup of instance "cjc1" on database "cjc1"

Starting instance "cjc1"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc2"

檢查,切換成功


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:28:14 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show parameter name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

cell_offloadgroup_name     string

db_file_name_convert     string cjc1, cjc2

db_name      string cjc

db_unique_name     string cjc2

global_names     boolean FALSE

instance_name     string cjc2

lock_name_space      string

log_file_name_convert     string cjc1, cjc2

processor_group_name     string

service_names     string cjc2,cjc

透過VIP連線已經是cjc2了


CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1240051 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

回切


回切前需要先啟動從庫的keepalived服務


[root@cjc-db-01 keepalived]# systemctl start keepalived.service

開始回切


DGMGRL> switchover to cjc1

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

透過VIP連線資料庫


SQL> show parameter instance_name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1


sqlplus system/oracle@172.16.6.150:1521/cjc


    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1260596 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

測試 FSFO+keepalived自動切換


啟動自動切換


DGMGRL> enable fast_start failover;

DGMGRL> show configuration


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

主庫


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

主庫執行非正常關庫操作


SQL> shutdown abort

ORACLE instance shut down.

檢視fsfo.log日誌,成功切換到cjc2


16:38:20.23  Wednesday, September 07, 2022

Initiating Fast-Start Failover to database "cjc2"...

Performing failover NOW, please wait...

Failover succeeded, new primary is "cjc2"

16:38:24.18  Wednesday, September 07, 2022

vip 已經切換到138上


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

主庫也是切換到138 cjc2上


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

      Warning: ORA-16829: fast-start failover configuration is lagging


    cjc1 - (*) Physical standby database (disabled)

      ORA-16661: the standby database needs to be reinstated


Fast-Start Failover: ENABLED


Configuration Status:

WARNING

恢復原主庫137 cjc1


SQL> startup

ORACLE instance started.


Total System Global Area 1135747072 bytes

Fixed Size    2252544 bytes

Variable Size  754974976 bytes

Database Buffers  369098752 bytes

Redo Buffers    9420800 bytes

Database mounted.

ORA-16649: possible failover to another database prevents this database from

being opened

報錯可以忽略,後臺自動執行閃回資料庫,重新open資料庫


可以看到已經加會到備庫


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

SUCCESS

普通關閉資料庫,理論上主備角色是不會自動切換的,VIP會發生漂移


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:44:40 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options



SQL> show parameter instance_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc2

啟動keepalived


[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

主庫關閉例項


檢視IP


enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff

    inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3

       valid_lft forever preferred_lft forever

    inet 172.16.6.150/32 scope global enp0s3

       valid_lft forever preferred_lft forever

    inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed 

       valid_lft forever preferred_lft forever

    inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute 

       valid_lft forever preferred_lft forever

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

主從角色並沒有自動切換


DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - (*) Physical standby database


Fast-Start Failover: ENABLED


Configuration Status:

ORA-01034: ORACLE not available

ORA-16625: cannot reach database "cjc2"

DGM-17017: unable to determine configuration status

但是VIP進行切換了


[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc

SQL> show parameter instance_name

NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1


set lin 200 pages 100

set lin 200 pages 100

col FLASHBACK_ON for a10

col current_scn for 99999999999999

col open_mode for a20

col SWITCHOVER_STATUS for a20

col PROTECTION_MODE for a20

select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1261913 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

手動切換


啟動主庫例項


startup

啟動主庫keepalived


systemctl start keepalived.service

重啟從庫keepalived,目的是將VIP切換到主庫


[root@cjc-db-01 keepalived]# systemctl stop keepalived.service 

[root@cjc-db-01 keepalived]# systemctl start keepalived.service 

關閉自動切換


DGMGRL> disable fast_start failover;

Disabled.

DGMGRL>  show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc2 - Primary database

    cjc1 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCES

回切


DGMGRL> switchover to cjc1

Performing switchover NOW, please wait...

New primary database "cjc1" is opening...

Operation requires startup of instance "cjc2" on database "cjc2"

Starting instance "cjc2"...

ORACLE instance started.

Database mounted.

Database opened.

Switchover succeeded, new primary is "cjc1"

DGMGRL> show configuration


Configuration - cjcdgbroker


  Protection Mode: MaxPerformance

  Databases:

    cjc1 - Primary database

    cjc2 - Physical standby database


Fast-Start Failover: DISABLED


Configuration Status:

SUCCESS

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc


SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 17:29:02 2022


Copyright (c) 1982, 2013, Oracle.  All rights reserved.



Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options



SQL> show parameter instance_name


NAME     TYPE VALUE

------------------------------------ ----------- ------------------------------

instance_name     string cjc1



    CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS

--------------- -------------------- ---------------- --- ---------- -------------------- --------------------

1304918 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

圖片


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29785807/viewspace-2914268/,如需轉載,請註明出處,否則將追究法律責任。

相關文章