Overview of Database Checkpoints

lovestanford發表於2014-07-03
-----什麼是資料庫檢查點 ,檢查點的概念
A checkpoint corresponds to a data structure that defines a SCN in the redo thread of a database. Checkpoints are recorded in the control file and each datafile header,
and are a crucial element of recovery.
     簡單來說,檢查點是一種資料結構,每個檢查點對應著 資料庫的redo thread中的一個SCN。檢查點資訊記錄在控制檔案和每個資料檔案頭部中。

When a checkpoint occurs, Oracle must update the headers of all datafiles to record the details of the checkpoint. This is done by the CKPT process. The CKPT process does not write blocks to disk, DBWn (DB Writer Process) always performs that task.
      當檢查點發生的時候,CKPT程式會更新所有資料檔案頭部和控制檔案,記錄該檢查點的詳細資訊,同時通知DBWn程式將所有髒資料從快取中到資料檔案。

A Checkpoint is a crucial mechanism in consistent database shutdowns, instance recovery, and Oracle Database operation generally.
對於consistent資料庫關閉、例項恢復和資料庫正常執行來說,檢查點是一個很關鍵的機制。

The term checkpoint has the following related meanings:
檢查點具有以下相關含義:

    1:A data structure that indicates the checkpoint position, which is the SCN in the redo stream where instance recovery must begin. The checkpoint position is determined by the oldest dirty buffer in the database buffer cache. The checkpoint position acts as a pointer to the redo stream and is stored in the control file and in each data file header.
       包含的資料結構指明瞭檢查點位置,檢查點位置對應著一個記錄在redo stream中的SCN,例項恢復必須從此SCN開始。(可以理解為例項恢復是從最近的檢查點對應的SCN開始的)
       檢查點位置由快取中最老的髒快取決定的,相當於一個指向redo stream 中的指標,檢查點相關資訊記錄在控制檔案盒每個資料庫檔案頭部中。
   2:The writing of modified database buffers in the database buffer cache to disk.
       將快取中的資料寫入磁碟

The checkpoint process (CKPT) updates the control file and data file headers with checkpoint information and signals DBWn to write blocks to disk.
Checkpoint information includes the checkpoint position, SCN, location in online redo log to begin recovery, and so on.
       CKPT程式負責更新控制檔案和資料檔案頭部中的檢查點資訊,通知DBWn進行寫操作,檢查點資訊包括檢查點位置,SCN,線上重做日誌中開始恢復的位置等。
  

-----檢查點的作用,什麼時候發生檢查點
Purpose of Checkpoints
Oracle Database uses checkpoints to achieve the following goals:
Oracle資料庫使用檢查點來達到以下幾個目的:

    A: Reduce the time required for recovery in case of an instance or media failure
       減少執行例項恢復或者介質恢復需要的時間
    B: Ensure that dirty buffers in the buffer cache are written to disk regularly
       確保髒快取能有規律的寫入硬碟
    C: Ensure that all committed data is written to disk during a consistent shutdown
        確保所有已提交的資料在資料庫一致關閉時能寫入硬碟。

When does a checkpoint happen
     什麼時候發生檢查點? 以下幾種情況下(包括但不限於)會發生檢查點
   A:   At each switch of the redo log files.   每次日誌切換時候
   B:   When the delay for LOG_CHECKPOINT_TIMEOUT is reached.
             LOG_CHECKPOINT_TIMEOUT 規定的時間達到
   C:   When the size in bytes corresponding to (LOG_CHECKPOINT_INTERVAL* size of IO OS blocks) is written on the current redo log file.
              當LOG_CHECKPOINT_INTERVAL* size of IO OS blocks 這麼多位元組重做日誌寫入  當前重做日誌檔案  
   D:   When ALTER SYSTEM SWITCH LOGFILE command is issued.
   E:   When ALTER SYSTEM CHECKPOINT command is issued.
             當上述命令執行時候發生檢查點

-----有哪些型別的檢查點

The checkpoint process (CKPT) is responsible for writing checkpoints to the data file headers and control file. Checkpoints occur in a variety of situations. For example, Oracle Database uses the following types of checkpoints:

    Thread checkpoints  ------執行緒檢查點,也就是資料庫檢查點
    也可以這樣理解:這是資料庫級別上的檢查點,針對整個資料庫  

    The database writes to disk all buffers modified by redo in a specific thread before a certain target. The set of thread checkpoints on all instances in a database is a database checkpoint. Thread checkpoints occur in the following situations:
        A:     Consistent database shutdown
        B:     ALTER SYSTEM CHECKPOINT statement
        C:     Online redo log switch
        D:     ALTER DATABASE BEGIN BACKUP statement
   
  Tablespace and data file checkpoints

      表空間和資料檔案級別上的檢查點 ,這種檢查點發生在資料庫的部分表空間和資料檔案級別上;
The database writes to disk all buffers modified by redo before a specific target. A tablespace checkpoint is a set of data file checkpoints,
one for each data file in the tablespace. These checkpoints occur in a variety of situations,
including making a tablespace read-only or taking it offline normal, shrinking a data file, or executing ALTER TABLESPACE BEGIN BACKUP.
   
Incremental checkpoints
    An incremental checkpoint is a type of thread checkpoint partly intended to avoid writing large numbers of blocks at online redo log switches.、
   增量檢查點是一種執行緒檢查點,主要是避免當線日誌發生切換時候,出現大量的資料寫操作。
DBWn checks at least every three seconds to determine whether it has work to do.
  DBWn每三秒鐘醒來檢視是否有工作要做,有資料需要寫入硬碟
   When DBWn writes dirty buffers, it advances the checkpoint position, causing CKPT to write the checkpoint position to the control file, but not to the data file headers.
 當DBWn在寫髒快取時候,它同時將檢查點位置前移,促使CKPT將檢查點位置寫入控制檔案,但是不寫入資料檔案頭部。

Other types of checkpoints include instance and media recovery checkpoints and checkpoints when schema objects are dropped or truncated.


------檢查點相關的引數
Checkpoint Parameters
LOG_CHECKPOINT_INTERVAL

LOG_CHECKPOINT_INTERVAL specifies the frequency of checkpoints in terms of the number of redo log file blocks that can exist between an incremental checkpoint and the last block written to the redo log. This number refers to physical operating system blocks, not database blocks.
Specifying a value of 0 (zero) for LOG_CHECKPOINT_INTERVAL has the same effect as setting the parameter to infinity and causes the parameter to be ignored. Only nonzero values of this parameter are considered meaningful.
Default Value : 0

LOG_CHECKPOINT_TIMEOUT
LOG_CHECKPOINT_TIMEOUT specifies (in seconds) the amount of time that has passed since the incremental checkpoint at the position where the last write to the redo log (sometimes called the tail of the log) occurred. This parameter also signifies that no buffer will remain dirty (in the cache) for more than integer seconds.
Specifying a value of 0 for the timeout disables time-based checkpoints. Hence, setting the value to 0 is not recommended unless FAST_START_MTTR_TARGET is set.
Default value : 1800 sec

FAST_START_MTTR_TARGET 
FAST_START_MTTR_TARGET enables you to specify the number of seconds the database takes to perform crash recovery of a single instance. When specified, FAST_START_MTTR_TARGET is overridden by LOG_CHECKPOINT_INTERVAL.
Note:
You must disable or remove the FAST_START_IO_TARGET, LOG_CHECKPOINT_INTERVAL, and LOG_CHECKPOINT_TIMEOUT initialization parameters when using FAST_START_MTTR_TARGET. Setting these parameters interferes with the mechanisms used to manage cache recovery time to meet FAST_START_MTTR_TARGET.
 FAST_START_MTTR_TARGET 非零則 將會啟動自動checkpointing

FAST_START_IO_TARGET in 9i is replaced by FAST_START_MTTR_TARGET parameter from 10g.
Default Value : 0

LOG_CHECKPOINTS_TO_ALERT
LOG_CHECKPOINTS_TO_ALERT lets you log your checkpoints to the alert file. Doing so is useful for determining whether checkpoints are occurring at the desired frequency.

True則將checkpoint相關資訊記錄在alert_$ORACLE_SID.log中

Default Value - False


------自動檢查點
Automatic Checkpointing
    Oracle Database 10g supports automatic checkpoint tuning which takes advantage of periods of low I/O usage to advance checkpoints and therefore improve availability. Automatic checkpoint tuning is in effect if the FAST_START_MTTR_TARGET database initialization parameter is set to a nonzero value.
FAST_START_MTTR_TARGET 非零則 將會啟動自動checkpointing
Observe the following recommendations to take advantage of automatic checkpoint tuning.
    If it is necessary to control the time to recover from an instance or node failure, then set FAST_START_MTTR_TARGET to the desired MTTR in seconds. If targeting a specific MTTR is unnecessary, then set FAST_START_MTTR_TARGET to a nonzero value to enable automatic checkpoint tuning. 
    Fast-start checkpointing can be disabled by setting FAST_START_MTTR_TARGET=0. Disable fast-start checkpointing only when system I/O capacity is insufficient with fast-start checkpointing enabled and achieving a target MTTR is not important. 
    Enabling fast-start checkpointing increases the average number of writes per transaction that DBWn issues for a given workload (when compared with disabling fast-start checkpointing). However, if the system is not already near or at its maximum I/O capacity, then fast-start checkpointing has a negligible impact on performance. 
    The percentage of additional DBWn writes with very aggressive fast-start checkpointing depends on many factors, including the workload, I/O speed and capacity, CPU speed and capacity, and the performance of previous recoveries.
    If FAST_START_MTTR_TARGET is set to a low value, then fast-start checkpointing is more aggressive, and the average number of writes per transaction that DBWn issues is higher in order to keep the thread checkpoint sufficiently advanced to meet the requested MTTR.
    Conversely, if FAST_START_MTTR_TARGET is set to a high value, or if automatic checkpoint tuning is in effect (that is, FAST_START_MTTR_TARGET is set to a nonzero value), then fast-start checkpointing in less aggressive, and the average number of writes per transaction that DBWn issues is lower.
    Fast-start checkpointing can be explicitly disabled by setting FAST_START_MTTR_TARGET=0. Disabling fast-start checkpointing leads to the fewest average number of writes per transaction for DBWn for a specific workload and configuration, but also results in the highest MTTR. 


-----檢查點相關等待時間
Checkpoint related Wait Events
Checkpoint Completed

A session is waiting for checkpoint to complete. This could happen for example during a close database or a local checkpoint. A Local Checkpoint is one that is initiated by the user (for example, performed by ALTER SYSTEM CHECKPOINT LOCAL statements).

Wait Time: 5 seconds
Parameters: None

Log File Switch (checkpoint incomplete)

Waiting for a log switch because the session cannot wrap into the next log. Wrapping cannot be performed because the checkpoint for that log has not completed.
Wait Time: 1 second
Parameters: None

----檢查點與例項恢復
Checkpoints and Instance Recovery

    Instance recovery uses checkpoints to determine which changes must be applied to the data files.
The checkpoint position guarantees that every committed change with an SCN lower than the checkpoint SCN is saved to the data files.
During instance recovery, the database must apply the changes that occur between the checkpoint position and the end of the redo thread.
Some changes may already have been written to the data files. However, only changes with SCNs lower than the checkpoint position are guaranteed to be on disk.
   例項恢復使用檢查點來決定恢復從哪個CHANGE/SCN開始,每個檢查點都對應一個時間戳,例項恢復從例項崩潰前最後一個checkpoint對應的SCN  SCN_1開始,SCN小於SCN_1而且已經
提交的事務所做的修改都已經寫入資料檔案。

------------檢查點效能
Checkpoint Performance
     
      Frequent checkpoints will enable faster recovery, but can cause performance degradation. A Checkpoint might be a costly operation when the number of files are huge since it has to freeze the datafile headers during the process. There is a performance trade-off regarding frequency of checkpoints.
      頻繁發生檢查點會是例項恢復或者介質恢復更快,但是也會導致效能下降。當檔案數量多的時候,checkpoint可能是一個昂貴的操作,因為它必須在此過程中必須凍結資料檔案頭部。

More frequent checkpoints enable faster database recovery after a crash. This is why some customer sites which have a very low tolerance for unscheduled system downtime will often choose this option. However, the performance degradation of frequent checkpoints may not justify this philosophy in many cases. Let's assume the database is up and running 95% of the time, and unavailable 5% of the time from frequent instance crashes or hardware failures requiring database recovery.  For most customer sites, it makes more sense to tune for the 95% case rather than the rare 5% downtime.

    Checkpoint occurs at every log switch. Hence frequent log switches will start the checkpoints and may degrade the performance. If a previous checkpoint is already in progress, the checkpoint forced by the log switch will override the current checkpoint. This necessitates well-sized redo logs to avoid unnecessary checkpoints as a result of frequent log switches. A good rule of thumb is to switch logs at most every twenty minutes. Having your log files too small can increase checkpoint activity and reduce performance. Oracle recommends the user to set all online log files to be the same size, and have at least two log groups per thread.
    Set the value of FAST_START_MTTR_TARGET to 3600. This enables Fast-Start checkpointing and the Fast-Start Fault Recovery feature, but minimizes its effect on run-time performance while avoiding the need for performance tuning of FAST_START_MTTR_TARGET.


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/21754115/viewspace-1208768/,如需轉載,請註明出處,否則將追究法律責任。

相關文章