1. Buffer Cache Principles

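The Buffer Cache is a key part of the Oracle SGA; ordinary data access and modification go through it. When a process needs to access data, it first checks whether the data is already in memory. If the data is in a buffer, the process decides from the buffer's state whether it can be accessed directly or whether a consistent-read copy has to be constructed. If the data is not in a buffer, the process must find enough free space in the Buffer Cache to load it; if there is not enough free space, DBWR is triggered to write out dirty data and free up buffers.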

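Within the Buffer Cache, Oracle manages memory through several linked lists. The best known are the LRU List and the LRUW List (often called the Write List or Dirty List). The lists hold pointers to the actual buffers and related information.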

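Starting with Oracle 8, Oracle introduced the Checkpoint Queue and the File Queue in order to implement incremental checkpoints. Starting with Oracle 8i, with the introduction of asynchronous DBWn, the more precise concept covering these lists and queues is the working set (WS). Each WS contains several lists with different functions, and every list is protected by the cache buffers lru chain latch; the number of LRU latches is controlled by the hidden parameter _db_block_lru_latches. When multiple DBWR processes are used (the DB_WRITER_PROCESSES parameter configures this), the database has multiple working sets. Likewise, when multiple buffer pools are used, each pool has its own independent working sets. The WS structure can be seen in a buffer cache dump, which can be produced with: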

alter session set events 'immediate trace name buffers level 4';

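The LRU List maintains the buffers in memory and is managed with an LRU algorithm (the details differ between versions). When the database initializes, all buffers are hashed onto the LRU List. When data has to be read from a data file, the process first looks for a free buffer on the LRU List and then reads the data into the Buffer Cache. Once the data is modified, the buffer's state becomes dirty and it can be moved to the LRUW List, which holds the buffers that are candidates for DBWR to write out to the data files. A buffer is on either the LRU List or the LRUW List, never on both at once.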
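The list movement described above can be sketched as a toy model. This is only an illustration of the "one buffer, one list" rule, not Oracle's implementation: the class name `WorkingSet`, the three states, and the choice to treat written buffers as immediately free are all assumptions made for the sketch.

```python
from collections import deque


class WorkingSet:
    """Toy model (hypothetical): every buffer sits on exactly one of two lists."""

    def __init__(self, n_buffers):
        # At startup all buffers are free and linked on the LRU list.
        self.lru = deque(range(n_buffers))  # buffer ids, cold end on the left
        self.lruw = deque()                 # dirty buffers waiting for DBWR
        self.state = {b: "free" for b in range(n_buffers)}

    def get_free_buffer(self):
        """Scan the LRU list from the cold end and claim the first free buffer."""
        buf = next((b for b in self.lru if self.state[b] == "free"), None)
        if buf is None:
            return None                     # a real process would now post DBWR
        self.lru.remove(buf)
        self.lru.append(buf)                # move to the hot end
        self.state[buf] = "clean"           # now holds a block read from disk
        return buf

    def modify(self, buf):
        """Dirtying a buffer moves it from the LRU list to the LRUW list."""
        if self.state[buf] == "dirty":
            return
        self.lru.remove(buf)
        self.lruw.append(buf)
        self.state[buf] = "dirty"

    def dbwr_write(self):
        """Write out every LRUW buffer and relink it on the LRU list as free."""
        written = list(self.lruw)
        while self.lruw:
            buf = self.lruw.popleft()
            self.state[buf] = "free"        # simplification: reusable at once
            self.lru.appendleft(buf)
        return written
```

In a real instance every one of these list operations would be done while holding the cache buffers lru chain latch; the sketch leaves locking out to keep the list movement visible.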

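The Checkpoint Queue records data blocks in the order in which they were modified and associates each block with an RBA (redo byte address), so that during an incremental checkpoint the database can write blocks out in the order they were modified. When a checkpoint triggers DBWR, DBWR writes from the Checkpoint Queue; when other conditions trigger DBWR, it writes from the Dirty List. Memory for the Checkpoint Queue is allocated from the Shared Pool, and the queue is protected by its own latch, the active checkpoint queue latch.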

SYS@orcl2 > select latch#,name from v$latch where name like 'active checkpoint%';

                                  LATCH# NAME
---------------------------------------- ----------------------------------------
                                     325 active checkpoint queue latch

SYS@orcl2 > select * from v$sgastat where name like 'Checkpoint%';

POOL           NAME                                            BYTES                                   CON_ID
-------------- ---------------------------------------- ------------ ----------------------------------------
shared pool    Checkpoint queue                                  576                                        1
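The ordering behaviour of the Checkpoint Queue can be sketched as follows. This is a toy model under stated assumptions, not Oracle's data structure: the class and method names are hypothetical, RBAs are modelled as increasing integers, and the sketch assumes a block that is already queued keeps the position and RBA of its first change.

```python
from collections import OrderedDict


class CheckpointQueue:
    """Toy checkpoint queue (hypothetical): dirty blocks in first-change order."""

    def __init__(self):
        # block address -> RBA of the change that first dirtied the block;
        # insertion order is the modification order.
        self._queue = OrderedDict()

    def record_change(self, dba, rba):
        # Assumption: a block already on the queue keeps its original entry.
        self._queue.setdefault(dba, rba)

    def incremental_checkpoint(self, target_rba):
        """Write blocks from the head of the queue up to target_rba, in order."""
        written = []
        while self._queue:
            dba, rba = next(iter(self._queue.items()))
            if rba > target_rba:
                break
            self._queue.popitem(last=False)  # remove the head of the queue
            written.append(dba)
        return written

    def oldest_dirty_rba(self):
        """RBA of the oldest block still dirty, or None if the queue is empty."""
        return next(iter(self._queue.values()), None)
```

Because the queue is ordered, the writer never has to search it: it only ever looks at the head, which is what makes writing "in modification order" cheap.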

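Starting with Oracle 8i, the LRU List and the LRUW List each gained an auxiliary list (AUXILIARY List) to make management more efficient. With auxiliary lists in place, buffers start out on the LRU auxiliary list (AUXILIARY RPL_LST) when the database initializes and move to the LRU main list (MAIN RPL_LST) once they are used. A user process looking for a free buffer can therefore start from the LRU-AUX List, while DBWR looking for dirty buffers can start from the LRU-Main List, which improves efficiency.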

2. latch: cache buffers lru chain

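When a user process needs to read data into the Buffer Cache, or when buffers are managed according to the LRU algorithm, the LRU List inevitably has to be scanned to find a usable buffer or to change a buffer's state. The Buffer Cache is shared memory that many processes access concurrently, so a process must acquire a latch to lock the memory structure while it searches. (A latch is a serialization lock mechanism that Oracle uses to protect shared memory structures.)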

Official description of the cache buffers lru chain latch:

The cache buffer lru chain latch is acquired in order to introduce a new block into the buffer cache and when writing a buffer back to disk, specifically when trying to scan the LRU (least recently used) chain containing all the dirty blocks in the buffer cache.

Solutions

Consider implementing multiple buffer pools to reduce contention on this latch.

Increase the number of LRU latches with the parameter DB_BLOCK_LRU_LATCHES. Generally the default value works.

Reduce data blocks visited by a query and thereby reduce LRU latch requests in the buffer pool by tuning the SQL.

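Possible causes:

A process that wants to inspect or modify the LRU and LRUW lists must hold the cache buffers lru chain latch for the duration.

If contention arises while doing so, the session waits on the latch: cache buffers lru chain event.

In summary, the following two situations require the cache buffers lru chain latch:

1. A process that wants to read a block not yet loaded into memory searches the LRU list for a free buffer, and needs the cache buffers lru chain latch while doing so.

2. DBWR, in order to write dirty buffers to the data files, searches the LRUW list and moves the written buffers back to the LRU list, and also has to acquire the cache buffers lru chain latch for this.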

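DBWR writes dirty buffers to the data files in the following situations:

- A process reading data blocks into the Buffer Cache has scanned a certain proportion of the LRU list without finding a free buffer, so it signals DBWR to write dirty buffers.

- An Oracle process requests that the dirty buffers of the affected objects be written, for work such as Parallel Query, tablespace backup, or Truncate/Drop.

- A checkpoint is performed, either periodically or for administrative reasons.

- Oracle performs checkpoints periodically to meet the recovery time specified through FAST_START_MTTR_TARGET (or LOG_CHECKPOINT_TIMEOUT).

- An administrator issues a checkpoint command or a log file switch occurs. Either causes a thread checkpoint, which prompts DBWR to write all dirty blocks. However, a thread checkpoint has low priority on the checkpoint queue: DBWR keeps writing dirty blocks in its normal sequence, driven by incremental checkpoints and its other triggers, and the log switch checkpoint completes only once the writes reach the checkpoint position. This is why, with log_checkpoints_to_alert set to true, a checkpoint can be seen to begin at a log switch but complete only several minutes later.

- DBWR checks at least once every three seconds whether there are dirty blocks to write.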

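The most important cause of cache buffers lru chain latch contention is an excessive number of requests for free buffers, and inefficient SQL is the most typical source of such requests. If several sessions run inefficient SQL at the same time, they contend for the cache buffers lru chain latch both while searching for free buffers and while dirty buffers are being written. Contention is especially likely when several sessions scan different tables or indexes concurrently: because each session loads different blocks into memory, requests for free buffers increase, and so does the chance of contention on a working set. It gets worse when data is modified so often that there are many dirty buffers, because DBWR then searches the LRUW list frequently on behalf of checkpoints.

Another important characteristic of cache buffers lru chain latch contention is that it is accompanied by physical I/O. If inefficient index scans are the problem, db file sequential read waits appear together with the lru chain latch contention; if unnecessary full table scans are the problem, db file scattered read waits appear with it. In practice, cache buffers chains latch contention and cache buffers lru chain latch contention often occur together, because complex applications combine these patterns.

A data buffer that is too small or a checkpoint interval that is too short also increases cache buffers lru chain latch contention. However, data buffers in today's databases are rarely too small, and the checkpoint interval is usually left at its default, so the cause of cache buffers lru chain latch contention usually turns out to be inefficient SQL.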

3. latch: cache buffers chains

Official description of the cache buffers chains latch:

This latch is acquired whenever a block in the buffer cache is accessed (pinned). Reducing contention for the cache buffer chains latch will usually require reducing logical I/O rates by tuning and minimizing the I/O requirements of the SQL involved. High I/O rates could be a sign of a hot block (meaning a block highly accessed).

P1 = Latch address

P2 = Latch number

P3 = Tries

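Interpretation: when a data block is read into the SGA, its buffer header is placed on the hash chain under a bucket (hash buckets are grouped by relative data block address and class number). A process that wants to access or modify a block on a hash chain first applies the hash function to find the bucket and then scans the linked list in that bucket; to protect the memory structure while doing so, it has to acquire the cache buffers chains latch.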
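The lookup path just described (hash to a bucket, then walk the chain under a latch) can be sketched as below. This is a toy model, not Oracle's hash function or memory layout: `BufferHashTable`, the use of Python's built-in `hash`, and the `latch_gets` counter standing in for latch acquisitions are all assumptions for illustration.

```python
class BufferHashTable:
    """Toy hash table (hypothetical) of buffer headers keyed by (dba, class)."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]  # one chain per bucket
        self.latch_gets = 0  # stands in for cache buffers chains latch gets

    def _chain(self, dba, block_class):
        # Assumption: any deterministic hash of (dba, class) serves the sketch.
        return self.buckets[hash((dba, block_class)) % len(self.buckets)]

    def insert(self, dba, block_class, buffer):
        self.latch_gets += 1  # the chain is changed under the latch
        self._chain(dba, block_class).append(((dba, block_class), buffer))

    def lookup(self, dba, block_class):
        self.latch_gets += 1  # the chain is scanned under the latch
        for key, buffer in self._chain(dba, block_class):
            if key == (dba, block_class):
                return buffer
        return None           # not cached: the block must be read from disk
```

Every lookup costs one latch get in this model, which is why a statement doing many logical reads, or many sessions hitting one chain, shows up as pressure on this latch.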

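Cause 1: inefficient SQL (mainly visible as excessive logical reads). The cache buffers chains latch is closely tied to logical reads, so watch for statements in v$sql with a high BUFFER_GETS/EXECUTIONS ratio. Every logical read requires a latch get and some CPU work, so such SQL is also CPU-hungry. From 9i the cache buffers chains latch can be held in shared mode, so this wait event generally shows up with DML on hot blocks and frequent commits.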

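Cause 2: hot blocks (blocks accessed too frequently). Starting with Oracle 9i, the cache buffers chains latch can be shared for read-only access, so sessions that only read can read together. Exclusive operations, such as modifications, consistent reads, and writes to disk by DBWR (whether prompted by a checkpoint after a modification or by DBWR's own buffer scan), hold the cache buffers chains latch exclusively and make other sessions wait when they read the same buffer.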

Anything that affects how the cache buffers chains latch is held can cause cache buffers chains latch waits. In general, cache buffers chains latch waits and latch: cache buffers lru chain waits often appear together.

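If an insert/delete/update statement is written inefficiently, it reads too many buffers or reads a large number of data blocks from disk into the buffer cache, which very easily leads to hot blocks or to contention on the cache buffers chains or cache buffers lru chain latch.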
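For the first cause above, the ranking by BUFFER_GETS/EXECUTIONS can be illustrated offline. The sketch assumes rows have already been fetched from v$sql into dictionaries; the function name and the sample rows (including the `sql_id` values) are made up for the example.

```python
def gets_per_execution(rows):
    """Rank statements by logical reads per execution, highest first."""
    ranked = []
    for row in rows:
        if row["executions"] == 0:
            continue  # never executed: the ratio is undefined, so skip it
        ratio = row["buffer_gets"] / row["executions"]
        ranked.append((row["sql_id"], ratio))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical sample rows, not real v$sql output.
sample = [
    {"sql_id": "stmt_a", "buffer_gets": 1_000_000, "executions": 10},
    {"sql_id": "stmt_b", "buffer_gets": 500_000, "executions": 50_000},
    {"sql_id": "stmt_c", "buffer_gets": 42, "executions": 0},
]
top = gets_per_execution(sample)
```

Ranking by the ratio rather than by total BUFFER_GETS separates a statement that is expensive each time it runs from one that is merely run often.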

4. Hot Block Contention

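Before 10g, all of these waits were called buffer busy waits. The event was gradually split into the many hot-block wait events listed below; the gc events are the hot-block events between RAC nodes.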

Wait until a buffer becomes available.

There are four reasons that a session cannot pin a buffer in the buffer cache, and a separate wait event exists for each reason:

- "buffer busy waits": A session cannot pin the buffer in the buffer cache because another session has the buffer pinned.

- "read by other session": A session cannot pin the buffer in the buffer cache because another session is reading the buffer from disk.

- "gc buffer busy acquire": A session cannot pin the buffer in the buffer cache because another session is reading the buffer from the cache of another instance.

- "gc buffer busy release": A session cannot pin the buffer in the buffer cache because another session on another instance is taking the buffer from this cache into its own cache so it can pin it.

Prior to release 10.1, all four reasons were covered by "buffer busy waits." In release 10.1, the "gc buffer busy" wait event covered both the "gc buffer busy acquire" and "gc buffer busy release" wait events.

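- gc cr block busy

When a block has been modified on a remote instance and the local instance queries it, the remote instance finds an uncommitted transaction on the block. It first performs a log flush, writing the redo for the block's changes to the redo log, then applies undo to the current block to construct a CR block and sends that to the requesting instance. This wait event means the cr block request could not be completed immediately because the cr block was busy. It typically appears on the local instance when the remote instance has modified data and not committed promptly.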

- gc current request

The local instance requests a data block from a remote instance in current mode.

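- gc current grant busy

The local instance asks the block's master node for access to a data block in current mode, but the grant is still in progress or the required lock resource cannot be obtained, so the access cannot be granted.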

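- gc cr block 2-way

When a block is requested and obtained after two or three network hops, a gc [current/cr] [2/3]-way event is recorded. With 3-way, the master and the holder should be different instances; with 2-way, the master and the holder should be the same instance. This is the best case: the block arrives right after the request, with no busy condition and no waiting during the request. Events of this kind indicate that a block was shipped over the network, which generates traffic, whereas the network traffic of grant 2-way should be comparatively small.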

- gc cr block busy

When a request needs a block in CR mode, it sends a request to the master instance. The requestor eventually gets the block via cache fusion transfer. However, sometimes the block transfer is delayed, either because the block was being used by a session on another instance or because the holding instance could not write the corresponding redo records to the online logfile immediately.

One can use the session level dynamic performance views v$session and v$session_event to find the programs or sessions causing the most waits on this event:

SQL> select a.sid, a.time_waited, b.program, b.module from v$session_event a, v$session b where a.sid=b.sid and a.event='gc cr block busy' order by a.time_waited;

- gc current block busy wait event

When a request needs a block in current mode, it sends a request to the master instance. The requestor eventually gets the block via cache fusion transfer. However, sometimes the block transfer is delayed, either because the block was being used by a session on another instance or because the holding instance could not write the corresponding redo records to the online logfile immediately.

One can use the session level dynamic performance views v$session and v$session_event to find the programs or sessions causing the most waits on this event:

SQL> select a.sid, a.time_waited, b.program, b.module from v$session_event a, v$session b where a.sid=b.sid and a.event='gc current block busy' order by a.time_waited;

- gc current block busy

This is the contention wait for a global cache current block in RAC. Its duration has three components:

Time to process current block request in the cache = (pin time + flush time + send time)

gc current block flush time

The current block flush time is part of the service (or processing) time for a current block. The pending redo needs to be flushed to the log file by LGWR before LMS sends it. The operation is asynchronous in that LMS queues the request, posts LGWR, and continues processing. The LMS would check its log flush queue for completions and then send the block, or go to sleep and be posted by LGWR. The redo log write time and redo log sync time can influence the overall service time significantly.

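Flush time exists because of Oracle's instance recovery mechanism: after a current block is modified (modify/update) on the local instance, the redo for that current block must be written to the logfile (LGWR must complete the write before returning) before the LMS process can ship the block to other nodes.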

gc buffer busy acquire/release is often a by-product of gc current block busy. When several processes in the same instance access the same data block concurrently, the first process goes into a gc current block busy wait, and the processes that follow it on the buffer waiter list fall into gc buffer busy acquire/release waits (A user on the same instance has started a remote operation on the same resource and the request has not completed yet or the block was requested by another node and the block has not been released by the local instance when the new local access was made). There is a queuing effect here: if gc current block busy is slow, the gc buffer busy acquire/release waits queued behind it are slower still:

Pin time = (time to read the block into cache) + (time to modify/process the buffer)

Busy time = (average pin time) * (number of interested users waiting ahead of me)
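The formulas above can be worked through with a few numbers. The functions below simply restate them; the millisecond values in the usage lines are invented inputs chosen to show the queuing effect, not measurements.

```python
def pin_time(read_ms, process_ms):
    """Pin time = time to read the block into cache + time to process it."""
    return read_ms + process_ms


def service_time(pin_ms, flush_ms, send_ms):
    """Time to process a current block request = pin + flush + send."""
    return pin_ms + flush_ms + send_ms


def busy_time(avg_pin_ms, waiters_ahead):
    """Busy time = average pin time * number of waiters queued ahead."""
    return avg_pin_ms * waiters_ahead


# Invented example values, in milliseconds.
pin = pin_time(read_ms=0.5, process_ms=1.5)
total = service_time(pin_ms=pin, flush_ms=3.0, send_ms=0.5)
queued = busy_time(avg_pin_ms=pin, waiters_ahead=4)
```

With these inputs a single request is dominated by the flush component, while a session with four waiters ahead of it spends four pin times just queuing, which is the amplification the text describes.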

Flush time is not limited to current blocks (see Avg global cache current block flush time (ms) in AWR); CR blocks have a flush time as well (Avg global cache cr block flush time (ms)).

The CR server's redo log flush can be disabled by setting _cr_server_log_flush to false (LMS are/is waiting for LGWR to flush the pending redo during CR fabrication. Without going too much in to details, you can turn off the behaviour by setting _cr_server_log_flush to false.). However, the parameter has no effect on the flush time of current blocks, and using it is strongly discouraged.

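CR block: consistent read, a block constructed for a consistent read. When a block has been modified but the change is not yet committed, other sessions that read the same block must not see the uncommitted data, so a CR block is built from the xcur (exclusive current) block with undo applied, which keeps the data that is read consistent. A CR block is only valid for a point in time. When a query is issued, Oracle first checks the ITL; as soon as it finds an uncommitted transaction on a block, it starts constructing the CR block from that transaction's UBA, regardless of which rows the session is retrieving. After rolling back the uncommitted transactions, it compares the query SCN with the commit SCNs of the committed transactions in the ITL slots: if the query SCN is greater than the commit SCN in the ITL slot, CR construction is finished; otherwise it keeps rolling back.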
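The rollback loop described above can be sketched as a toy model. It is not Oracle's block format: the `Change` record, the newest-first list standing in for the undo chain, and string values standing in for block contents are all hypothetical. The sketch also assumes that a change committed at exactly the query SCN is visible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Change:
    """One change to a block (hypothetical record, not an ITL entry)."""
    before_image: str           # block contents before this change (the undo)
    commit_scn: Optional[int]   # None means the transaction is uncommitted


def build_cr(current_value: str, changes: List[Change], query_scn: int) -> str:
    """Roll the current block back until it is consistent as of query_scn.

    `changes` is ordered newest first, like following the undo chain.
    """
    value = current_value
    for change in changes:
        committed = change.commit_scn is not None
        if committed and change.commit_scn <= query_scn:
            break                      # visible to the query: stop rolling back
        value = change.before_image    # apply undo for this change
    return value


# Hypothetical history: v0 -> v1 (SCN 90) -> v2 (SCN 120) -> v3 (uncommitted)
history = [Change("v2", None), Change("v1", 120), Change("v0", 90)]
```

A query at SCN 100 has to undo both the uncommitted change and the one committed at SCN 120, while a query at SCN 150 only undoes the uncommitted change, so two sessions can need different CR copies of the same block.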

References:

《深入解析Oracle》蓋國強

Oracle Concepts

Oracle Reference

Some of the explanations above come from the following blog. Many thanks; much of the material there is very easy to follow.
https://blog.csdn.net/tianlesoftware/article/details/7777511
