gc buffer busy的優化

myownstars發表於2011-12-02
 
通常情況下是因為等待其他節點傳送塊,如果出現太多次,則意味著叢集之間有contention
http://www.ardentperf.com/2007/09/12/gc-buffer-busy-waits-in-rac-finding-hot-blocks/
 
過去幾個小時系統比較慢,通過查詢dba_hist_active_sess_history瞭解一下經歷的等待事件
首先要確定這一時間段對應的snapshot
select min(begin_interval_time) min, max(end_interval_time) max
from dba_hist_snapshot
where snap_id between 12831 and 12838;
 
MIN                            MAX
------------------------------ ------------------------------

12-SEP-07 09.00.17.451 AM      12-SEP-07 05.00.03.683

檢視過去一段時間的等待事件型別彙總

select wait_class_id, wait_class, count(*) cnt

from dba_hist_active_sess_history

where snap_id between 12831 and 12838

group by wait_class_id, wait_class

order by 3;

WAIT_CLASS_ID WAIT_CLASS                            CNT
------------- ------------------------------ ----------
   3290255840 Configuration                         169
   2000153315 Network                               934
   4108307767 System I/O                           7199
   3386400367 Commit                               7809
   4217450380 Application                         12248
   3875070507 Concurrency                         14754
   1893977003 Other                               35499
   3871361733 Cluster                            104810

   1740759767 User I/O                           121999

可以更細化為具體的等待事件

select event_id, event, count(*) cnt from dba_hist_active_sess_history
where snap_id between 12831 and 12838 and wait_class_id=3871361733
group by event_id, event
order by 3;
1457266432 gc current split                                229
2685450749 gc current grant 2-way                          290
 957917679 gc current block lost                           579
 737661873 gc cr block 2-way                               699
2277737081 gc current grant busy                           991
3570184881 gc current block 3-way                         1190
3151901526 gc cr block lost                               1951
 111015833 gc current block 2-way                         2078
3046984244 gc cr block 3-way                              2107
 661121159 gc cr multi block request                      4092
3201690383 gc cr grant 2-way                              4129
1520064534 gc cr block busy                               4576
2701629120 gc current block busy                         14379

1478861578 gc buffer busy                                67275

--在這段時間內累計有67275session等待gc buffer busy,然後檢視出是哪些sql引起的

select sql_id, count(*) cnt from dba_hist_active_sess_history
where snap_id between 12831 and 12838
and event_id in (1478861578)
group by sql_id
having count(*)>1000
order by 2;
 
SQL_ID               CNT
------------- ----------
22ggtj4z9ak3a       1574
gsqhbt5a6d4uv       1744
cyt90uk11a22c       2240
39dtqqpr7ygcw       4251

8v3b2m405atgy      42292

執行次數最多的sql達到4萬多次,通過dba_hist_sqltext檢視出對應的sql

insert into bigtable(id, version, client, cl_business_id, cl_order_id, desc。。。。。

該表是個分割槽表,且其上有很多trigger和大量的索引,需要明確該sql等待的物件

select current_obj#, count(*) cnt from dba_hist_active_sess_history
where snap_id between 12831 and 12838
and event_id=1478861578 and sql_id='8v3b2m405atgy'
group by current_obj#
order by 2;
CURRENT_OBJ#        CNT
------------ ----------
           0        511
     3122795        617
     3064433        880
     3208619       3913
     3208620       5411
     3208618      22215

後三個是次數比較多的

select object_id, owner, object_name, subobject_name, object_type from dba_objects
where object_id in (3208618, 3208619, 3208620);
 
 OBJECT_ID OWNER      OBJECT_NAME                    SUBOBJECT_NAME                 OBJECT_TYPE
---------- ---------- ------------------------------ ------------------------------ -------------------
   3208618 JSCHDER    BIGTABLE_LOG                   P_2007_09                      TABLE PARTITION
   3208619 JSCHDER    BIGTABL_LG_X_ID                P_2007_09                      INDEX PARTITION

   3208620 JSCHDER    BIGTABL_LG_X_CHANGE_DATE       P_2007_09                      INDEX PARTITION

等待次數最多的卻是bigtable_log,這是因為表上有一個trigger,每對bigtable更新一次,就要往該表insert7次;

我們可以進一步確認那些塊競爭最激烈

select current_file#, current_block#, count(*) cnt
from dba_hist_active_sess_history
where snap_id between 12831 and 12838
and event_id=1478861578 and sql_id='8v3b2m405atgy'
and current_obj# in (3208618, 3208619, 3208620)
group by current_file#, current_block#
having count(*)>50
order by 3;
 
CURRENT_FILE# CURRENT_BLOCK#        CNT
------------- -------------- ----------
         1330         238073         51
         1542          22645         55
         1487         237914         56
         1330         238724         61
         1330         244129         76

         1487         233206        120

似乎並沒有過熱的塊,因為有超過4萬個session訪問這些object,但等待次數最多的塊只有120次;

檢查一下這些塊是否為段頭塊,

select segment_name, header_file, header_block
from dba_segments where wner='JHEIDER' and partition_name='P_2007_09'
and segment_name in ('PLACEMENTS_LOG','PLCMNTS_LG_X_ID',
  'PLCMNTS_LG_X_CHANGE_DATE');
 
SEGMENT_NAME                   HEADER_FILE HEADER_BLOCK
------------------------------ ----------- ------------
BIGTABL_LG_X_CHANGE_DATE              1207       204809
BIGTABL_LG_X_ID                       1207       196617

BIGTABLE_LOG                          1209        16393

可惜這些只是普通的資料塊。

任何時候,只要表資料塊在insert時候有嚴重的併發問題,首先要想到的就是space management;

檢查該表所屬表空間的管理方式

select distinct tablespace_name from dba_tab_partitions
where table_name='BIGTABLE_LOG';
 
TABLESPACE_NAME
------------------------------
BIGTABLE_LOG_DATA
 
select extent_management, allocation_type, segment_space_management
from dba_tablespaces where tablespace_name='BIGTABLE_LOG_DATA';
 
EXTENT_MAN ALLOCATIO SEGMEN
---------- --------- ------
LOCAL      USER      MANUAL

 

SQL> select distinct freelists, freelist_groups from dba_tab_partitions
  2  where table_name='BIGTABLE_LOG';
 
 FREELISTS FREELIST_GROUPS
---------- ---------------

1                        1

其表空間使用手工段管理且只有一個freelist;

小結:這是一個6節點RAC,最忙的表卻只有一個freelist,難怪會引起嚴重的gc buffer busy

但是因為每個塊都很快的被填滿,所以沒有等待次數過多的塊;

不需要考慮ITL的問題,因為插入的都是空塊,ITL如果不夠會自動分配直到塊滿為止;

該系統是從non-rac升級到RAC的,因此所有的object都是MSSM,解決完這個object其他object的問題則會後續顯現;

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-712622/,如需轉載,請註明出處,否則將追究法律責任。

相關文章