How to handle "Private Strand Flush Not Complete" messages in the alert log

Record every error. Published on 2021-06-15

Fri Oct 17 19:59:51 2014

Thread 1 cannot allocate new log, sequence 4722

Private strand flush not complete

  Current log# 1 seq# 4721 mem# 0: /oradata/sgomp5/redo01.log

Thread 1 advanced to log sequence 4722 (LGWR switch)

  Current log# 2 seq# 4722 mem# 0: /oradata/sgomp5/redo02.log


I found an article about this issue in the MOS community:


Historically, every user session wrote its changes to the redo log buffer, and changes in the redo log buffer were flushed to the redo logs on disk by LGWR. As the number of users increased, contention for the redo allocation and redo copy latches on the public redo buffer increased.

So, starting from 10g, Oracle came up with the concept of private redo (x$kcrfstrand) and in-memory undo (x$ktifp). Every session has a private redo area that it writes to; then a (small) batch of changes is written to the public redo buffer, and finally the public redo log buffer is flushed to the redo log files on disk. This mechanism reduces the gets/sleeps on the redo copy and redo allocation latches on the public redo buffer and hence makes the architecture more scalable.
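
These structures can be observed directly. A minimal sketch, run as SYS (the x$ views are visible only to SYS; x$kcrfstrand holds both public and private redo strands, and whether every row maps to an active strand varies by version, so treat this as a rough observation tool):

SQL> SELECT COUNT(*) AS redo_strands FROM x$kcrfstrand;   -- public + private redo strands
SQL> SELECT COUNT(*) AS imu_pools FROM x$ktifp;           -- in-memory undo pools, one per private strand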


It is also worth noting that Oracle falls back to the old redo mechanism if a transaction is too big (with lots of changes) and its changes cannot fit into the private redo buffers.


When the database switches logs, all private strands must be flushed to the current log before it can continue. This message indicates that at the time of the attempted switch, not all redo had yet been fully written to the log. It is somewhat similar to "checkpoint not complete", except that it only involves the redo being written to the log. The log cannot be switched until all of that redo has been written.
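
The state of the switch can be watched from v$log; a quick check (standard view, no assumptions beyond a running instance):

SQL> SELECT group#, sequence#, status FROM v$log;   -- CURRENT is being written; ACTIVE groups still hold redo needed before reuse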


Private strands were introduced in 10gR2 to handle the redo latches (redo allocation latches). They are a mechanism that lets processes use multiple allocation latches to write redo into the redo buffer more efficiently, and they are related to the log_parallelism parameter that appeared in 9i. The strand concept was introduced to ensure that the instance's redo generation rate is optimal, and that the number of strands can be adjusted dynamically to compensate when some form of redo contention appears. The number of strands initially allocated depends on the number of CPUs, with a minimum of two strands, one of which is used for active redo generation.
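
The strand count can be inferred from the child latches of the redo allocation latch, since each strand has its own child latch. A minimal check (on 10g and later the count includes the private strands as well):

SQL> SELECT COUNT(*) AS strand_latches FROM v$latch_children WHERE name = 'redo allocation';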




For a large OLTP system, redo generation is very high, so these strands are activated when foreground processes encounter redo contention. Shared strands always coexist with multiple private strands. Oracle 10g's redo (and undo) mechanism changed significantly in order to reduce contention. Redo is no longer recorded in real time; instead, it is first recorded in a private area and flushed to the redo log buffer on commit. With this new mechanism, once a user process has acquired a private strand, its redo is no longer kept in the PGA, so the redo copy latch step is no longer needed.




If a new transaction cannot acquire the redo allocation latch of a private strand, it falls back to the old redo buffer mechanism and requests to write into a shared strand. Under this new mechanism, when redo is flushed to the log file, LGWR must write out the contents of both the shared strands and the private strands. When a redo flush occurs, the redo allocation latches of all public strands must be acquired, the redo copy latches of all public strands must be checked, and all private strands containing active transactions must be held.
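
Contention on these latches is visible in v$latch; steadily climbing misses and sleeps on 'redo allocation' or 'redo copy' suggest the strand mechanism is under pressure:

SQL> SELECT name, gets, misses, sleeps FROM v$latch WHERE name IN ('redo allocation', 'redo copy');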


In fact, this message can simply be ignored, unless there is a significant time gap between the "cannot allocate new log" message and the "advanced to log sequence" message.

If you want to avoid "Private strand flush not complete" events in alert.log, you can increase the db_writer_processes parameter: DBWn triggers LGWR to write redo to the log file, so having several DBWn processes writing together can speed up flushing the redo buffer to the redo log file.



The parameter can be changed with the following command:

SQL> alter system set db_writer_processes=4 scope=spfile;  -- this is a static parameter; the database must be restarted for it to take effect


Note that the number of DBWR processes should be comparable to the number of logical CPUs. In addition, when Oracle finds that one DB_WRITER_PROCESS cannot keep up with the work, it will also increase the number automatically, provided that the maximum allowed value has already been set in the initialization parameters.


Some notes on the DB_WRITER_PROCESSES and DBWR_IO_SLAVES parameters:


DB_WRITER_PROCESSES replaces the Oracle7 parameter DB_WRITERS and specifies the initial number of database writer processes for an instance. If you use DBWR_IO_SLAVES, only one database writer process will be used, regardless of the setting for DB_WRITER_PROCESSES.


The DB_WRITER_PROCESSES parameter replaces the DB_WRITERS parameter from Oracle 7 and specifies the number of DBWR processes for a database instance. When the DBWR_IO_SLAVES parameter is also configured (its default is 0), only one DBWn process is used, and the rest are ignored.
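
This interaction can be verified on a live instance: with dbwr_io_slaves set to a non-zero value, only DBW0 shows up among the running background processes. A minimal check (the paddr <> '00' filter keeps only processes that are actually running):

SQL> show parameter dbwr_io_slaves
SQL> show parameter db_writer_processes
SQL> SELECT name FROM v$bgprocess WHERE name LIKE 'DBW%' AND paddr <> '00';   -- running DBWn processes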


DBWR_IO_SLAVES

If it is not practical to use multiple DBWR processes, then Oracle provides a facility whereby the I/O load can be distributed over multiple slave processes. The DBWR process is the only process that scans the buffer cache LRU list for blocks to be written out. However, the I/O for those blocks is performed by the I/O slaves. The number of I/O slaves is determined by the parameter DBWR_IO_SLAVES.


When a single DBWR process is used, Oracle provides multiple I/O slave processes to simulate asynchronous I/O and take over work that would otherwise be done entirely by DBWR (writing the blocks on the LRU list to the data files). The number of slaves is specified by the DBWR_IO_SLAVES parameter.


DBWR_IO_SLAVES is intended for scenarios where you cannot use multiple DB_WRITER_PROCESSES (for example, where you have a single CPU). I/O slaves are also useful when asynchronous I/O is not available, because the multiple I/O slaves simulate nonblocking, asynchronous requests by freeing DBWR to continue identifying blocks in the cache to be written. Asynchronous I/O at the operating system level, if you have it, is generally preferred.


The DBWR_IO_SLAVES parameter is mainly for scenarios where multiple DB_WRITER_PROCESSES cannot be used (for example, a single CPU, where configuring multiple DBWR processes brings no benefit). I/O slaves are also useful when asynchronous I/O is not available, because they simulate asynchronous requests and free DBWR to keep identifying blocks in the cache to be written. If the operating system provides asynchronous I/O, using it is generally preferred.


DBWR I/O slaves are allocated immediately following database open when the first I/O request is made. The DBWR continues to perform all of the DBWR-related work, apart from performing I/O. I/O slaves simply perform the I/O on behalf of DBWR. The writing of the batch is parallelized between the I/O slaves.


The DBWR I/O slaves are allocated when the first I/O request is made after the database opens. The DBWR process continues to perform all DBWR-related work except the I/O itself, which is handed off to the I/O slaves; the batch writes are parallelized across the I/O slaves.


Choosing Between Multiple DBWR Processes and I/O Slaves

Configuring multiple DBWR processes benefits performance when a single DBWR process is unable to keep up with the required workload. However, before configuring multiple DBWR processes, check whether asynchronous I/O is available and configured on the system. If the system supports asynchronous I/O but it is not currently used, then enable asynchronous I/O to see if this alleviates the problem. If the system does not support asynchronous I/O, or if asynchronous I/O is already configured and there is still a DBWR bottleneck, then configure multiple DBWR processes.


On choosing between multiple DBWR processes and I/O slaves:

Configuring multiple DBWR processes helps when a single DBWR process cannot keep up with the write workload. Before doing so, however, check whether asynchronous I/O is available and configured on the OS: if it is supported but not enabled, enable it first. If the system does not support asynchronous I/O, or asynchronous I/O is already configured and there is still a DBWR bottleneck, then configure multiple DBWR processes.
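
A quick way to check the asynchronous I/O configuration before adding DBWR processes (both parameters are standard; typical values are platform-dependent):

SQL> show parameter disk_asynch_io         -- TRUE: Oracle requests asynchronous I/O from the OS
SQL> show parameter filesystemio_options   -- SETALL or ASYNCH enables async I/O on file system files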


Using multiple DBWRs parallelizes the gathering and writing of buffers. Therefore, multiple DBWn processes should deliver more throughput than one DBWR process with the same number of I/O slaves. For this reason, the use of I/O slaves has been deprecated in favor of multiple DBWR processes. I/O slaves should only be used if multiple DBWR processes cannot be configured.


Enabling multiple DBWR processes parallelizes both the gathering and the writing of dirty buffers to the data files, and multiple DBWn processes deliver more throughput than one DBWR process with the same number of I/O slaves. Therefore, once multiple DBWR processes are enabled, DBWR_IO_SLAVES should no longer be configured; if it was non-zero, set it back to 0.
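
Following that advice, the switch from I/O slaves to multiple writer processes would look like this; both parameters are static, so a restart is required, and the value 4 is only an example:

SQL> alter system set dbwr_io_slaves=0 scope=spfile;        -- stop using I/O slaves
SQL> alter system set db_writer_processes=4 scope=spfile;   -- keep this no larger than the CPU count
SQL> -- restart the database for both changes to take effect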


Summary:


DBWR_IO_SLAVES is mainly used to simulate an asynchronous environment; on an OS that does not support asynchronous operations, it can improve I/O throughput.

Multiple DBWR processes can fetch dirty blocks from the data buffer and write them to disk in parallel. With a single DBWR plus multiple I/O slaves, by contrast, only the one DBWR fetches from the data buffer, while the I/O slaves write in parallel. If the system supports AIO (disk_asynch_io=true), there is usually no need to configure multiple DBWRs or I/O slaves.


With multiple CPUs, DB_WRITER_PROCESSES is recommended, because in that case there is no need to simulate asynchronous mode; just make sure the number of processes does not exceed the number of CPUs. With only one CPU, DBWR_IO_SLAVES is recommended to simulate asynchronous mode and improve database performance.

————————————————

Copyright notice: this is an original article by CSDN blogger "aaron8219", released under the CC 4.0 BY-SA license. Please include a link to the original article and this notice when reposting.

Original link:

https://blog.csdn.net/aaron8219/article/details/40398797


Be careful when increasing db_writer_processes.

---- Official guidance:

Default value: 1 or CPU_COUNT / 8, whichever is greater
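
The documented default can be computed from the instance's own cpu_count. A minimal sketch of that formula (whether Oracle truncates or rounds the division here is an assumption):

SQL> SELECT GREATEST(1, TRUNC(TO_NUMBER(value) / 8)) AS default_db_writer_processes FROM v$parameter WHERE name = 'cpu_count';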

 

[oracle@orcl ~]$ vmstat 5                ---- sample current CPU statistics, once every 5 seconds

SQL> show parameter cpu_count;                    ---- check the current cpu_count

SQL> show parameter db_writer_processes;                            ---- check the current number of database writer processes



From the "ITPUB blog". Link: http://blog.itpub.net/31547066/viewspace-2776681/. Please credit the source when reposting; otherwise legal liability may be pursued.
