log file sync等待事件總結

羽化殘虹發表於2015-09-06

事件

何時發生日誌寫入：

1.commit或者rollback

2.每3秒

3.log buffer 1/3滿或者已經有1M的redo資料。

更精確的解釋：_LOG_IO_SIZE 大小預設是LOG_BUFFER的1/3,當log buffer中redo資料達到_LOG_IO_SIZE 大小時，發生日誌寫入。

4.DBWR寫之前

_log_io_size隱含引數：

LOG_BUFFER（bytes）寫入的數量超過_LOG_IO_SIZE會觸發lgwr寫日誌的條件,預設值為LOG BUFFER的1/3或1M。

但是這個說法通過查詢並不能驗證，隱含引數儘量不要修改。

col name for a25

col VALUE for a20

col DESCRIB for a50

SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ

FROM SYS.x$ksppi x, SYS.x$ksppcv y

WHERE x.inst_id = USERENV ('Instance')

AND y.inst_id = USERENV ('Instance')

AND x.indx = y.indx

AND x.ksppinm LIKE '_log_io_size';

NAME VALUE DESCRIB

------------------------- -------------------- --------------------------------------------------

_log_io_size 0 automatically initiate log write if this many redo

blocks in buffer

log file sync發生的過程：

此等待事件使用者發出提交或回滾宣告後,等待提交完成的事件,提交命令會去做日誌同步,也就是寫日誌快取到日誌檔案，在提交命令未完成前,使用者將會看見此等待事件.

注意,它專指因提交,回滾而造成的寫快取到日誌檔案的等待.當發生此等待事件時,有時也會伴隨log file parallel write.因為此等待事件將會寫日誌快取，如果日誌的I/O系統較為緩慢的話,

這必將造成log file parallel write 等待.當發生log file sync等待後,判斷是否由於緩慢的日誌I/O造成的,可以檢視兩個等待事件的等待時間,如果比較接近,就證明日誌I/O比較緩慢或重做日誌過多,這時,造成log file sync的原因是因為log file parallel write,可以參考解決log file parallel write的方法解決問題,

**如果log file sync的等待時間很高,而log file parallel write的等待時間並不高,這意味著log file sync的原因並不是緩慢的日誌I/O,而是應用程式過多的提交造成。

當log file sync的等待時間和 log file parallel write等待時間基本相同，說明是IO問題造成的log file sync等待事件。

-----

更好理解的解釋：

回顧一下單機資料庫中的'log file sync' 等待事件，當user session 提交（commit）時，user session會通知LGWR程式將redo buffer中的資訊寫入到redo log file，當LGWR程式完成寫操作後，LGWR再post（通知）user session 寫操作已經完成，user session 接收到LGWR的通知後提交操作才完成。因此user session 在沒有收到LGWR post（通知）之前一致處於等待狀態，具體的等待事件為'log file sync'。

-----

引起log file sync的原因：

1.頻繁提交或者rollback,檢查應用是否有過多的短小的事務，如果有，可以使用批處理來緩解。

2.OS的IO緩慢：解決辦法是將日誌檔案放裸裝置上或繫結在RAID 0或RAID 1+0中，而不是繫結在RAID 5中。

3.過大的日誌緩衝區（log_buffer ）

過大的log_buffer,允許LGWR變得懶惰，因為log buffer中的資料量無法達不到_LOG_IO_SIZE，導致更多的重做條目堆積在日誌緩衝區中。

當事務提交或者3s醒來時，LGWR才會把所有資料都寫入到redo log file中。

由於資料很多，LGWR要用更多時間等待redo寫完畢。

這種情況，可以調小引數_LOG_IO_SIZE引數，其預設值是LOG_BUFFER的1/3或1MB，取兩者之中較小的值。

換句話說，你可以具有較大的日誌緩衝區，但較小的_LOG_IO_SIZE將增加後臺寫入次數，從而減少log file sync的等待時間。

4.CPU負載高。詳見下面的描述。

5.RAC私有網路效能差，導致LMS同步commit SCN慢。

如何診斷log file sync：

1.AWR：發生log file sync時，先做個snapshot，然後做AWR，AWR時間選擇在10-30分鐘。

已發生的log file sync，那麼通過AWR依然可以分析，也要保持在10-30分鐘。

2.Lgwr trace file（10.2.0.4開始），大於500ms會寫入

trace檔案中如果有Warning: log write time 1000ms, size 2KB，很有可能IO慢。

3.分析CPU資源使用情況的工具，CPU過於繁忙，lgwr無法及時獲取CPU排程，出現log file sync。

vmstat，關注r是否大於CPU核數，大於說明cpu繁忙。

OSW:OSWatcher,同上。

4.Alert：確認log file 15到20分鐘切換一次

5.Script to Collect Log File Sync Diagnostic Information (lfsdiag.sql) [Document 1064487.1]

解決辦法：

1.如果確實是因為頻繁提交造成的log file sync,那麼減少commit。

2.如果確實是因為io引起的，那麼解決辦法是將日誌檔案放裸裝置上或繫結在RAID 1+0中，而不是放在在RAID 5中（切記，redo log file一定不要放在SSD上！！！）。

3.確保CPU資源充足。CPU資源不足，LGWR通知user session後，user session無法及時獲得CPU排程，不能正常工作。

4.是否有些表可以使用nologging，會減少redo產生量

5.檢查redo log file足夠大，確保redo log file每15到20分鐘切換一次。

更深入分析log file sync：

如果上面的分析沒有解決log file sync等待事件，那麼需要做下面的分析。

The log file sync wait may be broken down into the following components:

log file sync 能拆解為一下步驟：

1. Wakeup LGWR if idle 1.喚醒LGWR程式

2. LGWR gathers the redo to be written and issue the I/O 2.LGWR程式收集redo，然後發給I/O

3. Time for the log write I/O to complete 3.等待log寫入I/O完成

4. LGWR I/O post processing 4.LGWR I/O post processing

5. LGWR posting the foreground/user session that the write has completed 5.LGWR通知前臺/使用者回話，redo寫入完成

6. Foreground/user session wakeup 6.前臺/使用者會話喚醒

Steps 2 and 3 are accumulated in the "redo write time" statistic. (i.e. as found under STATISICS section of Statspack and AWR)

步驟2和3消耗的時間在AWR中的"redo write time"中有所體現。(AWR中 Instance Activity Stats )

Step 3 is the "log file parallel write" wait event. (Document:34583.1 "log file parallel write" Reference Note)

步驟3產生"log file parallel write"等待事件。

另外：如果是最大保護模式的DATAGUARD(SYNC傳輸),這一步驟還包含網路寫、RFS/redo寫入到備庫的standby log file sync的時間。

Steps 5 and 6 may become very significant as the system load increases. This is because even after the foreground has been posted it may take a some time for the OS to schedule it to run. May require monitoring from O/S level.

在系統負載高時（尤其是CPU高的情況，看vmstat r值），步驟5和6會變得非常明顯。因為，前臺收到LGWR寫入完成的通知後，作業系統需要消耗一些時間排程Foreground/user session程式喚醒（也就是CPU排程）。需要系統級別監控。

幾個技術指標：

log file sync 等待時間小於20ms算正常

log file parallel write 等待時間小於20ms算正常

log file parallel wirte 和log file sync等待時間很接近，說明就是IO問題，因為大部分時間都花在了log寫入到磁碟上。

相關指令碼：

--等待時間平均等待時間

		
			
			
				select EVENT,TOTAL_WAITS,TOTAL_TIMEOUTS,TIME_WAITED,AVERAGE_WAIT
				

					 from   v$system_event 
					

						 where  event in ('log file sync','log file parallel write'); 
						

							
							

								select value from v$parameter where name = 'log_buffer';

---------------新特性：log file sync 兩種方式--------------

Adaptive Log File Sync

Adaptive Log File sync was introduced in 11.2. The parameter controlling this feature, _use_adaptive_log_file_sync, is set to false by default in 11.2.0.1 and 11.2.0.2.

_use_adaptive_log_file_sync引數在11gR2提出。11.2.0.1和11.2.0.2兩個版本該引數預設是false。

從11.2.0.3開始，這個引數預設值是true，也就是開始啟用“自適應日誌同步機制”。

11.2.0.1和11.2.0.2也可以開啟改引數

ALTER SYSTEM SET "_use_adaptive_log_file_sync"= scope=;

開啟改引數後，日誌同步機制會在2種方式中切換。

該引數決定了，foreground/user session 和LGWR程式通過什麼方式獲知commit操作已完成（也就是redo寫log file完成）。

Post/wait, traditional method for posting completion of writes to redo log

傳統方式，在11.2.0.3之前，user session等待LGWR通知redo寫入到log file完畢，被動方式。

優點：post/wait方式，user session幾乎能立即發現redo已刷到磁碟。

Polling, a new method where the foreground process checks if the LGWR has completed the write.

新方式，主動監測LGWR是否完成寫入，主動方式。這種方式比Post/wait方式響應速度慢，但是可以節約CPU資源。

優點：當commit完成後，LGWR會把commit完成的訊息通知給很多user session，這個過程消耗大量CPU。

Polling方式採用朱勇監測LGWR釋放寫入redo完成，所以釋放了LGWR佔用的CPU資源。

系統負載高（CPU繁忙）採用Polling方式更好。

系統負載低（CPU清閒）採用post/wait方式更好，它能夠提供比polling方式更好的響應時間。

ORACLE根據內部統計資訊決定採用何種方式。post/wait和polling方式互相切換能引起過熱，為了確保安全，切換不要太頻繁。

LGWR的trace檔案記錄了switch記錄，關鍵字是 "Log file sync switching to ...":

Switch to polling:

										
											
											
												*** 2015-01-21 08:19:04.077
												

													kcrfw_update_adaptive_sync_mode: post->poll long#=2 sync#=5 sync=62 poll=1056 rw=454 ack=0 min_sleep=1056
													

														
														

															*** 2015-01-21 08:19:04.077
															

																Log file sync switching to polling
																

																	Current scheduling delay is 1 usec
																	

																		Current approximate redo synch write rate is 1 per sec
																		

																			kcrfw_update_adaptive_sync_mode: poll->post current_sched_delay=0 switch_sched_delay=1 current_sync_count_delta=1 switch_sync_count_delta=5

Switch to post/wait:

																					
																						
																						
																							*** 2015-01-21 08:46:09.428
																							

																								Log file sync switching to post/wait
																								

																									Current approximate redo synch write rate is 0 per sec
																									

																										
																										

																											*** 2015-01-21 08:47:46.473
																											

																												kcrfw_update_adaptive_sync_mode: post->poll long#=2 sync#=11 sync=228 poll=1442 rw=721 ack=0 min_sleep=1056

相關指令碼：

查詢當前log file sync 方式是post-wait還是poll

																														
																															
																															
																																SQL> select name,value from v$sysstat where name in ('redo sync poll writes','redo synch polls');
																																

																																	
																																	

																																		NAME                                                                  VALUE
																																		

																																			---------------------------------------------------------------- ----------
																																			

																																				redo synch polls                                                  325355850

每小時採用poll log file sync方式的次數

																																							
																																								
																																								
																																									col begin_interval_time format a25
																																									

																																										col instance_number format 99 heading INST
																																										

																																											col stat_name format a25
																																											

																																												select snap.BEGIN_INTERVAL_TIME,hist.instance_number , hist.stat_name,hist.redo_synch_polls
																																												

																																													from ( select snap_id,instance_number,stat_name,value -lag(value,1,null) over ( order by snap_id,instance_number,stat_name) redo_synch_polls
																																													

																																														        from dba_hist_sysstat
																																														

																																															        where stat_name='redo synch polls'
																																															

																																																        and dbid=(select dbid from v$database)
																																																

																																																	        and instance_number = nvl('&instance_number',1)) hist,
																																																	

																																																		        dba_hist_snapshot snap
																																																		

																																																			where redo_synch_polls >0
																																																			

																																																				and hist.snap_id=snap.snap_id
																																																				

																																																					and hist.instance_number=snap.instance_number
																																																					

																																																						order by 1,2
																																																						

																																																							/
																																																							

																																																								
																																																								

																																																									BEGIN_INTERVAL_TIME       INST STAT_NAME                 REDO_SYNCH_POLLS
																																																									

																																																										------------------------- ---- ------------------------- ----------------
																																																										

																																																											06-JAN-15 07.00.02.884 AM    2 redo synch polls                       734
																																																											

																																																												06-JAN-15 08.00.08.425 AM    2 redo synch polls                     23767
																																																												

																																																													06-JAN-15 09.00.13.770 AM    2 redo synch polls                     39827
																																																													

																																																														06-JAN-15 10.00.19.233 AM    2 redo synch polls                     48479
																																																														

																																																															06-JAN-15 11.00.24.431 AM    2 redo synch polls                     41541
																																																															

																																																																06-JAN-15 12.00.29.670 PM    2 redo synch polls                     47566
																																																																

																																																																	06-JAN-15 01.00.35.029 PM    2 redo synch polls                     32169
																																																																	

																																																																		06-JAN-15 02.00.04.159 PM    2 redo synch polls                     37405
																																																																		

																																																																			06-JAN-15 02.59.04.536 PM    2 redo synch polls                     41469
																																																																			

																																																																				06-JAN-15 04.00.08.556 PM    2 redo synch polls                     38683
																																																																				

																																																																					06-JAN-15 05.00.12.523 PM    2 redo synch polls                     51618
																																																																					

																																																																						06-JAN-15 06.00.16.584 PM    2 redo synch polls                     52511
																																																																						

																																																																							06-JAN-15 07.00.03.352 PM    2 redo synch polls                     42229
																																																																							

																																																																								06-JAN-15 08.00.08.663 PM    2 redo synch polls                     35229
																																																																								

																																																																									06-JAN-15 09.00.13.882 PM    2 redo synch polls                     18499

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/26175573/viewspace-1792379/，如需轉載，請註明出處，否則將追究法律責任。

log file sync等待事件
2018-11-06
事件
【等待事件】log file sync
2021-01-01
事件
log file sync" 等待事件
2010-03-08
事件
"log file sync"等待事件-2
2013-10-18
事件
"log file sync"等待事件-1
2013-10-17
事件
等待事件 log file sync 小解
2012-11-08
事件
log file sync等待事件處理思路
2022-10-11
事件
【TUNE_ORACLE】等待事件之日誌等待“log file sync”
2021-07-16
Oracle事件
【WAIT】 log file sync等待事件說明
2021-12-09
AI事件
【RAC】RAC 效能分析 - 'log file sync' 等待事件
2017-06-19
事件
關於log file sync等待事件的描述
2016-05-03
事件
Oracle之等待事件log file sync + log file parallel write (awr優化)
2017-09-10
Oracle事件Parallel優化
log file sync(日誌檔案同步) 與 Log file parallel write 等待事件
2011-06-09
Parallel事件
RAC 資料庫中的'log file sync' 等待事件
2015-01-07
資料庫事件
log file sync等待事件的成因及優化方法
2015-12-12
事件優化
Oracle 11g 遇到log file sync嚴重等待事件
2017-10-20
Oracle事件
_use_adaptive_log_file_sync設定降低log_file_sync等待
2016-12-15
APT
oracle等待事件3構造一個Direct Path write等待事件和構造一個Log File Sync等待事件
2013-01-01
Oracle事件
LOG FILE SWITCH等待事件
2011-08-24
事件
[20161228]奇怪log file sync等待事件.txt
2016-12-28
事件
log file switch相關等待事件
2011-10-21
事件
redo的等待log file sync和log file parallel write和redo size設定
2011-11-10
Parallel
[20201204]關於等待事件Log File Sync.txt
2020-12-04
事件
log file switch (checkpoint incomplete)等待事件
2016-08-01
事件
【等待事件】log file switch (checkpoint incomplete)
2013-04-09
事件
log file sync 和 log file parallel write
2013-08-12
Parallel
【TUNE_ORACLE】等待事件之日誌等待“log file parallel write”
2021-07-28
Oracle事件Parallel
一個os thread startup、log file sync等待的故障回顧
2018-10-31
thread
Oracle資料庫由dataguard備庫引起的log file sync等待
2020-11-26
Oracle資料庫
Oracle db file parallel write 和 log file parallel write 等待事件說明
2012-05-08
OracleParallel事件
_use_adaptive_log_file_sync
2017-05-03
APT
最近遇到log file sync等待時間，測試了一下非同步提交
2016-01-18
非同步
db file scattered read等待事件
2018-11-08
事件
db file sequential read等待事件
2018-11-07
事件
【等待事件】db file sequential read
2021-01-12
事件
【等待事件】db file scattered read
2021-01-12
事件
等待事件：Disk file operations I/O
2014-06-25
事件
等待事件--db file scattered reads
2013-08-01
事件

log file sync等待事件總結

相關文章