Archive Log模式下Redo Log、Check Point和Switch Log

realkid4發表於2013-08-03

 

Oracle體系中,Online Redo LogArchived Redo LogCheck PointSwitch Log動作之間是存在著千絲萬縷的關係。作為我們非理論研究者而言,過多的強調技術細節意義不是很大。但是在實際中,進行診斷和問題解決的時候,對原理的理解還是有一定的作用的。

 

本篇從兩個場景入手,研究一下Redo Log切換過程中,check point和相關狀態情況。

 

1、實驗環境介紹

 

本篇實驗在Oracle 11R2上進行測試。

 

 

SQL> select * from v$version;

 

BANNER

--------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production

PL/SQL Release 11.2.0.1.0 - Production

CORE        11.2.0.1.0         Production

 

 

當前為歸檔archive模式,三個日誌組。

 

 

SQL> select log_mode from v$database;

 

LOG_MODE

------------

ARCHIVELOG

 

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        139 YES      INACTIVE               3466294

         2        140 NO       CURRENT                3466703

         3        138 YES      INACTIVE               3464619

 

 

我們透過alter log來監控系統行為。我們可以使用log_checkpoints_to_alert來控制檢查點發出的時候,預設是不寫入到alert_log中。

 

 

SQL> show parameter log_checkpoints;

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

log_checkpoints_to_alert             boolean     FALSE

 

SQL> alter system set log_checkpoints_to_alert=true;

System altered

 

SQL> show parameter log_checkpoints;

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

log_checkpoints_to_alert             boolean     TRUE

 

 

2、三種常見的Checkpoint行為

 

我們經常能夠看到的Checkpoint一共有三種。分別為:全域性檢查點(Global  Checkpoint)、增量檢查點(Incremental Checkpoint)和日誌切換檢查點(Switch Log Checkpoint)。

 

當我們發出alter system checkpoint或者啟動關閉的時候,會有主動檢查點動作。

 

 

SQL> alter system checkpoint;

System altered

 

Sun Aug 04 03:52:53 2013

Beginning global checkpoint up to RBA [0x8c.23a.10], SCN: 3467262

Completed checkpoint up to RBA [0x8c.23a.10], SCN: 3467262

 

 

增量檢查點主要是針對Global Checkpoint所帶來的效能問題而引入的一種最佳化機制。檢查點Check Point引入的目的,是為了保證進行例項恢復(一旦出現Instance Crash)的時候,消耗的時間可以被接受。存在一個Check PointRedo Log中,相當於是“宣告”了之前的Redo Log Entry對應的Dirty Blocks都已經寫入到了資料檔案中。

 

增量檢查點是Oracle執行階段中自動進行的。

 

 

Sun Aug 04 04:13:13 2013

Incremental checkpoint up to RBA [0x99.17.0], current log tail at RBA [0x99.169.0]

 

 

兩種檢查點(Global CheckpointIncremental Checkpoint)執行都是一次性的,也就是一次性完成,不是一個持續過程。Switch Log Checkpoint在進行Online Redo Log進行切換的時候開始,結束之後一段時間才能結束。

 

 

Beginning log switch checkpoint up to RBA [0x8d.2.10], SCN: 3467294

Thread 1 advanced to log sequence 141 (LGWR switch)

  Current log# 3 seq# 141 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 141 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Sun Aug 04 03:53:56 2013

Archived Log entry 232 added for thread 1 sequence 140 ID 0xe96e7a48 dest 1:

Sun Aug 04 03:59:08 2013

Completed checkpoint up to RBA [0x8d.2.10], SCN: 3467294

 

 

從開始到結束,中間持續了若干分鐘,而且這分鐘裡面,資料庫的事務操作沒有結束。這就是第三類Check Point

 

下面我們進入正題,研究一下Redo Log切換過程。

 

3、非覆蓋過程Redo Log切換行為

 

首先我們先看看Archive Log模式下的時候,日誌非覆蓋的情形。

 

當前日誌組online2,為current狀態。其他日誌狀態為Inactive狀態。

 

 

 

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        139 YES      INACTIVE               3466294

         2        140 NO       CURRENT                3466703

         3        138 YES      INACTIVE               3464619

 

 

常見的Online Redo Log Group狀態是四個:ActiveInactiveCurrentUnusable。當一個日誌組被新加入到系統中,沒有被寫入過的時候,狀態為UnusableCurrent狀態表示當前日誌組。Active是說日誌組已經不是當前日誌組,但是Redo Log Entry對應的Dirty Block還沒有完全寫入到資料檔案中。Inactive表示Dirty Block已經完全寫入。

 

此時,我們切換日誌。

 

 

SQL> alter system switch logfile;

System altered

 

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        139 YES      INACTIVE               3466294

         2        140 YES      ACTIVE                 3466703

         3        141 NO       CURRENT                3467294

 

 

剛剛的Current日誌狀態變化為Active。表示還沒有完全寫入,此時Alert Log中顯示出剛剛的變化。

 

 

Beginning log switch checkpoint up to RBA [0x8d.2.10], SCN: 3467294

Thread 1 advanced to log sequence 141 (LGWR switch)

  Current log# 3 seq# 141 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 141 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Sun Aug 04 03:53:56 2013

Archived Log entry 232 added for thread 1 sequence 140 ID 0xe96e7a48 dest 1:

 

 

此時,立即生成的log記錄中,有開始log switch checkpoint開始的記錄,之後切換日誌。同時,有Archived Log的資訊。

 

在剛剛的資訊v$log中,我們也的確看到了group 2日誌組Archived Log狀態是Yes,而狀態是Active這就告訴我們:在Active狀態時候,Oracle Archived是可以進行歸檔的。注意:此時沒有發生切換日誌覆蓋動作。

 

注意,此時時間是353。過一會之後,才結束log switch checkpoint

 

 

[oracle@bspdev ~]$ date

Sun Aug  4 03:54:15 CST 2013

 

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') from dual;

 

TO_CHAR(SYSDATE,'YYYY-MM-DDHH2

------------------------------

2013-08-04 03:55:18

 

--延後資訊顯示在日誌上

Sun Aug 04 03:53:56 2013

Archived Log entry 232 added for thread 1 sequence 140 ID 0xe96e7a48 dest 1:

Sun Aug 04 03:59:08 2013

Completed checkpoint up to RBA [0x8d.2.10], SCN: 3467294

 

 

基本上,接近6分鐘才結束這個過程。日誌組group 2狀態也變為Inactive

 

 

 

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        139 YES      INACTIVE               3466294

         2        140 YES      INACTIVE               3466703

         3        141 NO       CURRENT                3467294

 

 

那麼如果我們由於生成redo log entry的量過大、速度過快,形成切換迴圈到下一個redo log的時候,歸檔或者active尚未完成,是什麼現象呢?我們繼續分析。

 

 

4覆蓋過程Redo Log切換行為

 

Redo Log切換和歸檔,是Oracle運維和效能調優的一個方面。Redo Log設定不合理,可能會給系統帶來一定的效能問題。

 

比較常見的問題根源是:日誌設定相對於業務量而言,體積較小,日誌切換速度高。同時,低優先順序DBWR或者ARCH來不及將日誌寫入或者歸檔。從而引起阻塞等待。

 

我們當前狀態如下:

 

SQL> set timing on;

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        142 YES      INACTIVE               3467843

         2        143 YES      ACTIVE                 3467851

         3        144 NO       CURRENT                3467864

 

Executed in 0.015 seconds

 

 

我們為了模擬情況,使用指令碼連續進行switch log,形成日誌來不及歸檔或者寫入髒塊的情形。

 

 

SQL> alter system switch logfile;

System altered

Executed in 0.015 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 2.262 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 2.964 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 0.234 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 1.139 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 0.015 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 0.234 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 1.731 seconds

 

SQL> alter system switch logfile;

System altered

Executed in 1.341 seconds

 

 

連續的switch log動作,我們從執行時間上,可以看到極度的不平衡。時間最短可以有0.015s,而最長時間達到3s。明顯這個過程中,由於archivedb block寫入,引起一系列的等待或者阻塞。

 

對應最後的日誌資訊如下:

 

 

SQL> select group#, sequence#, archived, status, first_change# from v$log;

 

    GROUP#  SEQUENCE# ARCHIVED STATUS           FIRST_CHANGE#

---------- ---------- -------- ---------------- -------------

         1        151 YES      INACTIVE               3467978

         2        152 YES      INACTIVE               3467981

         3        153 NO       CURRENT                3467984

 

Executed in 0.015 seconds

 

 

Sequence144變換到153。這個階段alert log資訊如下:

 

 

Sun Aug 04 04:03:14 2013

Beginning log switch checkpoint up to RBA [0x90.2.10], SCN: 3467864

Thread 1 advanced to log sequence 144 (LGWR switch)

  Current log# 3 seq# 144 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 144 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Sun Aug 04 04:03:14 2013

Archived Log entry 235 added for thread 1 sequence 143 ID 0xe96e7a48 dest 1:

Sun Aug 04 04:06:39 2013

Beginning log switch checkpoint up to RBA [0x91.2.10], SCN: 3467958

Thread 1 advanced to log sequence 145 (LGWR switch)

  Current log# 1 seq# 145 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 145 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Sun Aug 04 04:06:39 2013

Archived Log entry 236 added for thread 1 sequence 144 ID 0xe96e7a48 dest 1:

Thread 1 cannot allocate new log, sequence 146

Checkpoint not complete

  Current log# 1 seq# 145 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 145 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Sun Aug 04 04:06:42 2013

Completed checkpoint up to RBA [0x90.2.10], SCN: 3467864

Beginning log switch checkpoint up to RBA [0x92.2.10], SCN: 3467962

Thread 1 advanced to log sequence 146 (LGWR switch)

  Current log# 2 seq# 146 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 146 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Sun Aug 04 04:06:42 2013

Archived Log entry 237 added for thread 1 sequence 145 ID 0xe96e7a48 dest 1:

Thread 1 cannot allocate new log, sequence 147

Checkpoint not complete

  Current log# 2 seq# 146 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 146 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Completed checkpoint up to RBA [0x92.2.10], SCN: 3467962

Completed checkpoint up to RBA [0x91.2.10], SCN: 3467958

Beginning log switch checkpoint up to RBA [0x93.2.10], SCN: 3467966

Thread 1 advanced to log sequence 147 (LGWR switch)

  Current log# 3 seq# 147 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 147 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Sun Aug 04 04:06:45 2013

Archived Log entry 238 added for thread 1 sequence 146 ID 0xe96e7a48 dest 1:

Beginning log switch checkpoint up to RBA [0x94.2.10], SCN: 3467969

Thread 1 advanced to log sequence 148 (LGWR switch)

  Current log# 1 seq# 148 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 148 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Completed checkpoint up to RBA [0x94.2.10], SCN: 3467969

Beginning log switch checkpoint up to RBA [0x95.2.10], SCN: 3467972

Thread 1 advanced to log sequence 149 (LGWR switch)

  Current log# 2 seq# 149 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 149 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Archived Log entry 239 added for thread 1 sequence 147 ID 0xe96e7a48 dest 1:

Archived Log entry 240 added for thread 1 sequence 148 ID 0xe96e7a48 dest 1:

Beginning log switch checkpoint up to RBA [0x96.2.10], SCN: 3467975

Thread 1 advanced to log sequence 150 (LGWR switch)

  Current log# 3 seq# 150 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 150 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Completed checkpoint up to RBA [0x96.2.10], SCN: 3467975

Beginning log switch checkpoint up to RBA [0x97.2.10], SCN: 3467978

Thread 1 advanced to log sequence 151 (LGWR switch)

  Current log# 1 seq# 151 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 151 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

ORACLE Instance wilson - Can not allocate log, archival required

Thread 1 cannot allocate new log, sequence 152

All online logs needed archiving

  Current log# 1 seq# 151 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 151 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Archived Log entry 241 added for thread 1 sequence 149 ID 0xe96e7a48 dest 1:

Beginning log switch checkpoint up to RBA [0x98.2.10], SCN: 3467981

Thread 1 advanced to log sequence 152 (LGWR switch)

  Current log# 2 seq# 152 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 152 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Archived Log entry 242 added for thread 1 sequence 150 ID 0xe96e7a48 dest 1:

Archived Log entry 243 added for thread 1 sequence 151 ID 0xe96e7a48 dest 1:

Thread 1 cannot allocate new log, sequence 153

Checkpoint not complete

  Current log# 2 seq# 152 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 152 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Completed checkpoint up to RBA [0x97.2.10], SCN: 3467978

Completed checkpoint up to RBA [0x95.2.10], SCN: 3467972

Completed checkpoint up to RBA [0x93.2.10], SCN: 3467966

Sun Aug 04 04:06:50 2013

Beginning log switch checkpoint up to RBA [0x99.2.10], SCN: 3467984

Thread 1 advanced to log sequence 153 (LGWR switch)

  Current log# 3 seq# 153 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 153 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Completed checkpoint up to RBA [0x99.2.10], SCN: 3467984

Archived Log entry 244 added for thread 1 sequence 152 ID 0xe96e7a48 dest 1:

Sun Aug 04 04:06:53 2013

Completed checkpoint up to RBA [0x98.2.10], SCN: 3467981

 

 

從日誌裡面,我們看到幾個現象:

 

首先,switch log check point是可以進行巢狀過程的。一個log switch check point沒有結束,可能有一個新的check point開始。

 

 

Sun Aug 04 04:03:14 2013

Beginning log switch checkpoint up to RBA [0x90.2.10], SCN: 3467864 第一個check point動作

Thread 1 advanced to log sequence 144 (LGWR switch)

  Current log# 3 seq# 144 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Current log# 3 seq# 144 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Sun Aug 04 04:03:14 2013

Archived Log entry 235 added for thread 1 sequence 143 ID 0xe96e7a48 dest 1:

Sun Aug 04 04:06:39 2013

Beginning log switch checkpoint up to RBA [0x91.2.10], SCN: 3467958第二個check point動作開始

Thread 1 advanced to log sequence 145 (LGWR switch)

  Current log# 1 seq# 145 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 145 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Sun Aug 04 04:06:39 2013

Archived Log entry 236 added for thread 1 sequence 144 ID 0xe96e7a48 dest 1:

Thread 1 cannot allocate new log, sequence 146

Checkpoint not complete

  Current log# 1 seq# 145 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 145 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Sun Aug 04 04:06:42 2013

Completed checkpoint up to RBA [0x90.2.10], SCN: 3467864 第一個check point結束

Beginning log switch checkpoint up to RBA [0x92.2.10], SCN: 3467962

Thread 1 advanced to log sequence 146 (LGWR switch)

  Current log# 2 seq# 146 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 146 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Sun Aug 04 04:06:42 2013

Archived Log entry 237 added for thread 1 sequence 145 ID 0xe96e7a48 dest 1:

Thread 1 cannot allocate new log, sequence 147

Checkpoint not complete

  Current log# 2 seq# 146 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Current log# 2 seq# 146 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Completed checkpoint up to RBA [0x92.2.10], SCN: 3467962 第三個check point結束

Completed checkpoint up to RBA [0x91.2.10], SCN: 3467958 第二個check point結束

 

 

第二個是當切換很頻繁的時候,日誌check point會有不能推動等待情況。Alert log中存在check point not complete的時候,對應還常出現“Thread 1 cannot allocate new log”。

 

如果切換過程中要求覆蓋當前日誌沒有歸檔,Oracle是有等待情況。一旦出現這樣的情況,就說明需要進行歸檔過程的調整。

 

 

beginning log switch checkpoint up to RBA [0x97.2.10], SCN: 3467978

Thread 1 advanced to log sequence 151 (LGWR switch)

  Current log# 1 seq# 151 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Current log# 1 seq# 151 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

ORACLE Instance wilson - Can not allocate log, archival required

Thread 1 cannot allocate new log, sequence 152

All online logs needed archiving

 

 

我們發現:如果一個日誌要被覆蓋,要保證日誌已經歸檔,並且狀態已經寫入。

 

5、結論

 

Check PointSwitch LogRedo LogOracle中比較複雜的問題。網路上很多人經常討論檢查點之類的細節問題,筆者認為:不同版本的oracle在這些方法上有差異,針對效能調優過程,過於細節的內容也沒有太大的意義,沒必要過於糾結。

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/17203031/viewspace-767731/,如需轉載,請註明出處,否則將追究法律責任。

相關文章