long長事務回滾的模擬與定位

不一樣的天空w發表於2017-12-18
大型事務的回滾 大型事務的回滾產生非常大的代價,不僅鎖定需要的資源,並且消耗的CPU和IO,尤其是IO將極為密集。尤其在KILL大型事務之前檢查事務究竟有多大可能是必要的,同時我們也需要知道回滾已經進行了多少程度。
一、模擬:

1.刪除500多萬的資料。


SQL> conn hr/hr;
Connected.
SQL>  create table  test as select * from dba_objects;

Table created.

SQL> insert into test select * from test;

87055 rows created.

SQL> insert into test select * from test;

174110 rows created.

SQL> insert into test select * from test;

348220 rows created.

SQL> insert into test select * from test;

696440 rows created.

SQL> insert into test select * from test;

1392880 rows created.

SQL> select count(*) from test;

  COUNT(*)
----------
   2785760

SQL> insert into test select * from test;

2785760 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from test;

  COUNT(*)
----------
   5571520

SQL> analyze table test compute statistics;

Table analyzed.

SQL>  
SQL> select sid from v$mystat where rownum=1;

       SID
----------
        30

SQL>

--模擬刪除,不commit
SQL> delete test;
 
5571520 rows deleted.



--另開視窗,查詢kill掉會話

SQL>  select sid,serial#,sql_id,event,blocking_session from v$session where sid=30;

       SID    SERIAL# SQL_ID        EVENT                                                            BLOCKING_SESSION
---------- ---------- ------------- ---------------------------------------------------------------- ----------------
        30        165 7qqwcq9td6akt log buffer space                                                               11

SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';

SQL_TEXT
----------------------------------------------------------------------
 delete test

SQL> alter system kill session '30,165' immediate;

System altered.

--回到原來視窗驗證:
SQL>  select count(*) from test;
 select count(*) from test
        *
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 28346
Session ID: 30 Serial number: 165


二、定位:
檢視回滾進度:
可以透過以下兩個檢視檢視回滾的進度,透過單位時間內恢復的undo block來估算恢復時間:
1.    透過x$ktuxe
alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;


SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;

ADDR               KTUXEUSN   KTUXESLT   KTUXESQN   KTUXESIZ SYSDATE
---------------- ---------- ---------- ---------- ---------- -------------------
00007F170E8AAC20          5         11       7784      84438 2017-12-09 14:19:22

SQL>

可以透過KTUXESLT ,KTUXESQN這兩個欄位,然後用以下指令碼回滾得出大概需要的時間:
set serveroutput on
declare
  l_start number;
  l_end   number;
begin
  select ktuxesiz
    into l_start
    from x$ktuxe
   where KTUXEUSN = 5
     and KTUXESLT = 11;
  dbms_lock.sleep(60);
  select ktuxesiz
    into l_end
    from x$ktuxe
   where KTUXEUSN = 5
     and KTUXESLT = 11;
  dbms_output.put_line('time est Day:' ||
                       round(l_end / (l_start - l_end) / 60 / 24, 2));
end;
/

time est Day:.01

PL/SQL procedure successfully completed.

SQL> SQL>  

2.    透過v$fast_start_trancsations 狀態為recovering表示恢復中;
select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;

SQL> select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;

       USN        SLT        SEQ STATE            UNDOBLOCKSDONE UNDOBLOCKSTOTAL    CPUTIME XID              SYSDATE
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------------- -------------------
         5         11       7784 RECOVERING               123491          179829        120 05000B00681E0000 2017-12-09 14:20:25
         
透過如下檢視觀察回滾是序列還是並行回滾的,如下圖應是並行恢復的,
V$FAST_START_SERVERS provides information about all the recovery slaves performing parallel transaction recovery.

透過xid欄位與v$fast_start_trancsations關聯。
select * from v$fast_start_servers  where xid in (select XID from V$FAST_START_TRANSACTIONS);

SQL> select * from v$fast_start_servers  where xid in (select XID from V$FAST_START_TRANSACTIONS);

STATE       UNDOBLOCKSDONE        PID XID
----------- -------------- ---------- ----------------
RECOVERING          133950         20 05000B00681E0000

3.檢視回滾物件
透過dump undo block方式檢視回滾的物件:

首先透過v$fast_start_trancsations的usn欄位查詢到使用的回滾段。

SQL> select * from v$rollname where usn=5;

       USN NAME
---------- ------------------------------
         5 _SYSSMU5_898567397$
         
dump這個undo block,因為dump的檔案很大,在查詢出物件的object_id後,kill掉這個dump會話。
alter system dump undo block "" XID ;

SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
 
。。。。。。。。。


[oracle@wang trace]$ ls -lrt
total 635992
-rw-r----- 1 oracle oinstall        81 Apr 27  2017 DBdb_ora_9045.trm
-rw-r----- 1 oracle oinstall        59 Apr 27  2017 DBdb_mman_9065.trm
-rw-r----- 1 oracle oinstall        60 Apr 27  2017 DBdb_ora_9084.trm
-rw-r----- 1 oracle oinstall       111 Apr 27  2017 DBdb_ora_9099.trm
.......................................
-rw-r----- 1 oracle oinstall     12306 Dec  9 14:23 DBdb_ora_28113.trm
-rw-r----- 1 oracle oinstall 536798705 Dec  9 14:23 DBdb_ora_28113.trc
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$ grep objn DBdb_ora_28113.trc | head -5
* Rec #0x11  slt: 0x0b  objn: 90373(0x00016105)  objd: 90373  tblspc: 4(0x00000004)
* Rec #0x10  slt: 0x0b  objn: 90373(0x00016105)  objd: 90373  tblspc: 4(0x00000004)
* Rec #0xf  slt: 0x0b  objn: 90373(0x00016105)  objd: 90373  tblspc: 4(0x00000004)
* Rec #0xe  slt: 0x0b  objn: 90373(0x00016105)  objd: 90373  tblspc: 4(0x00000004)
* Rec #0xd  slt: 0x0b  objn: 90373(0x00016105)  objd: 90373  tblspc: 4(0x00000004)
[oracle@wang trace]$
[oracle@wang trace]$

可以查詢到objn為121192,對應的就是dba_objects的object_id,即mosongtao.rollback_test,正是前邊測試的物件。再查詢v$session_longops配合username,last_update_time,target,可以大概定位到執行sql_id。

注意:在查詢到object_id後手工停掉dump undo block 動作

SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
 
^C^C^C^C^C^C
alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation

SQL> SQL> SQL> SQL>               


4.查詢sql
SQL> select SID,TARGET,SQL_ID,START_TIME,LAST_UPDATE_TIME from v$session_longops where target like '%TEST%' order by LAST_UPDATE_TIME desc;

       SID TARGET                                                           SQL_ID        START_TIME   LAST_UPDATE_
---------- ---------------------------------------------------------------- ------------- ------------ ------------
        30 HR.TEST                                                          7qqwcq9td6akt 09-DEC-17    09-DEC-17

SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';

SQL_TEXT
-------------------------------------------------------------------------
 delete test

SQL>








來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/31397003/viewspace-2148845/,如需轉載,請註明出處,否則將追究法律責任。

相關文章