怎樣獲取發生錯誤的會話資訊 -- 轉

ningzi82發表於2010-10-19

某些Oracle錯誤,並不是總是伴隨著產生trace檔案,這些錯誤,對Oracle來說並不是嚴重的錯誤,比如像ORA-01555這樣的錯誤。

我們可以設定一個事件,在發生錯誤時,產生一個Trace檔案,事件通常使用下面的命令格式:

  1. alter system set events ' trace name errorstack level ';
  2. alter session set events ' trace name errorstack level ';
    alter system set events ' trace name errorstack level ';
    alter session set events ' trace name errorstack level ';

然而,使用alter system命令設定事件後,只會對新連線的會話有效。比如下面的測試:

[@more@]
  1. 會話一:
  2. [oracle@xty scripts]$ sqlplus test/test
  3. SQL*Plus: Release 10.2.0.4.0 - Production on Sat Jul 25 23:25:51 2009
  4. Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
  5. Connected to:
  6. Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
  7. With the Partitioning, OLAP, Data Mining and Real Application Testing options
  8. SQL> drop table t1;
  9. Table dropped.
  10. SQL>
  11. SQL> create table t1 ( a number primary key);
  12. Table created.
  13. SQL> insert into t1 values (1);
  14. 1 row created.
  15. SQL> commit;
  16. Commit complete.
  17. 會話二:
  18. alter system set events '1 trace name errorstack level 1';
  19. 會話一:
  20. SQL> insert into t1 values (1);
  21. insert into t1 values (1)
  22. *
  23. ERROR at line 1:
  24. ORA-00001: unique constraint (TEST.SYS_C005801) violated
會話一:
[oracle@xty scripts]$ sqlplus test/test

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Jul 25 23:25:51 2009

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> drop table t1;

Table dropped.

SQL>          
SQL> create table t1 ( a number primary key);

Table created.

SQL> insert into t1 values (1);

1 row created.

SQL> commit;

Commit complete.

會話二:

alter system set events '1 trace name errorstack level 1';

會話一:
SQL> insert into t1 values (1);
insert into t1 values (1)
*
ERROR at line 1:
ORA-00001: unique constraint (TEST.SYS_C005801) violated

此時檢查user_dump_dest目錄,沒有相應的trace檔案產生,如果我們再執行下面的動作:

  1. SQL> conn test/test
  2. Connected.
  3. SQL> insert into t1 values (1);
  4. insert into t1 values (1)
  5. *
  6. ERROR at line 1:
  7. ORA-00001: unique constraint (TEST.SYS_C005801) violated
SQL> conn test/test
Connected.
SQL> insert into t1 values (1);
insert into t1 values (1)
*
ERROR at line 1:
ORA-00001: unique constraint (TEST.SYS_C005801) violated

就可以在user_dump_dest發現產生的trace檔案。

那麼發生錯誤的會話已經連線到資料庫一段時間了,怎麼得到這個會話在的資訊?比如某個資料庫,資料庫中頻繁地報下面的錯誤:

  1. select sysdate create_time from dual
  2. ORA-01555 caused by SQL statement below (Query Duration=0 sec, SCN: 0x09e5.0c3c77b1):
  3. Wed Jul 22 11:17:51 2009
  4. select g.*,m.* from Tb_Model m right outer join (select t.*,v.table_name from.....
select sysdate create_time from dual
ORA-01555 caused by SQL statement below (Query Duration=0 sec, SCN: 0x09e5.0c3c77b1):
Wed Jul 22 11:17:51 2009
select g.*,m.*  from Tb_Model m right outer join (select t.*,v.table_name from.....

這個ORA-01555錯誤是非常怪異的,首先是查詢DUAL表都會報錯,其次,每次報錯都是“Query Duration=0 sec, SCN: 0×09e5.0c3c77b1”,這裡除了BUG,實在想不到其他的理由 。不過為了查明到底是哪個會話和哪個應用,是不是同一個會話引起,我們需要得到這個引起錯誤的會話的資訊。

這裡,我們可以用觸發器。以SYS使用者執行下面的程式碼:

  1. create table t ( msg varchar2(4000));
  2. create or replace trigger xj_error
  3. after servererror on database
  4. declare
  5. l_sql_text ora_name_list_t;
  6. l_n number;
  7. begin
  8. if ( is_servererror(1555) )
  9. then
  10. insert into t values ( 'ora_sysevent = ' || ora_sysevent );
  11. insert into t values ( 'ora_login_user = ' || ora_login_user );
  12. insert into t values ( 'ora_server_error = ' || ora_server_error(1) );
  13. l_n := ora_sql_txt( l_sql_text );
  14. for i in 1 .. l_n
  15. loop
  16. insert into t values ( 'l_sql_text(' || i || ') = ' || l_sql_text(i) );
  17. end loop;
  18. insert into t
  19. select 'sid:' || sid || ' machina:' || machine || ' program:' || program || ' module:' || module from v$session where sid=(select sid from v$mystat where rownum=1);
  20. insert into t values ('IP:' || sys_context('USERENV', 'IP_ADDRESS') || ' HOST:' || sys_context('USERENV', 'HOST'));
  21. end if;
  22. end;
  23. /
create table t ( msg varchar2(4000));

create or replace trigger xj_error
after servererror on database
declare
    l_sql_text ora_name_list_t;
    l_n        number;
begin
    if ( is_servererror(1555) )
    then
        insert into t values ( 'ora_sysevent = ' || ora_sysevent );
        insert into t values ( 'ora_login_user = ' || ora_login_user );
        insert into t values ( 'ora_server_error = ' || ora_server_error(1) );

        l_n := ora_sql_txt( l_sql_text );
        for i in 1 .. l_n
        loop
            insert into t values ( 'l_sql_text(' || i || ') = ' || l_sql_text(i) );
        end loop;

        insert into t
        select 'sid:' || sid || ' machina:' || machine || ' program:' || program  || ' module:' || module from v$session where sid=(select sid from v$mystat where rownum=1);
        insert into t values ('IP:' || sys_context('USERENV', 'IP_ADDRESS') || ' HOST:' || sys_context('USERENV', 'HOST'));
    end if;
end;
/

過一段時間,然後可以從SYS.T表可以得到:

  1. SQL> select * from t;
  2. MSG
  3. --------------------------------------------------------------------
  4. ora_sysevent = SERVERERROR
  5. ora_login_user = XXX
  6. ora_server_error = 1555
  7. l_sql_text(1) = select g.*,m.* from Tb_Model m right outer join (select t.*,v.t
  8. ....
  9. sid:434 machine :temp1 program: module:
  10. IP:xxx.xxx.xxx.xxx HOST: temp1
SQL> select * from t;

MSG
--------------------------------------------------------------------
ora_sysevent = SERVERERROR
ora_login_user = XXX
ora_server_error = 1555
l_sql_text(1) = select g.*,m.*  from Tb_Model m right outer join (select t.*,v.t
....
sid:434 machine :temp1 program: module:
IP:xxx.xxx.xxx.xxx HOST: temp1

透過得到的會話資訊,得知是一個Web應用的連線,可以安全地KILL之。將這個會話KILL掉,錯誤不再出現。

轉自:http://www.laoxiong.net/how_to_get_session_occured_error.html

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/789833/viewspace-1039974/,如需轉載,請註明出處,否則將追究法律責任。

相關文章