搭建測試環境後出現的問題

lsl031發表於2011-08-17

仍然是很久以前的記錄

ORA-03113&ORA-07445;
今天weblogic jdbc突然中斷幾次,導致系統無法登陸,登陸時錯誤提示為:
Error 500--Internal Server Error
org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is org.hibernate.exception.GenericJDBCException: Cannot open connection
 at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:545)
 at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:315)
 at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:234)
 at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:93)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
 at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
 at $Proxy11.findByPrimaryKey(Unknown Source)
 at com.chinainsurance.application.common.CustomAuthenticationProcessingFilter.attemptAuthentication(CustomAuthenticationProcessingFilter.java:85)
 at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:198)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.ui.logout.LogoutFilter.doFilter(LogoutFilter.java:108)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:195)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.util.FilterChainProxy.doFilter(FilterChainProxy.java:148)
 at org.acegisecurity.util.FilterToBeanProxy.doFilter(FilterToBeanProxy.java:90)
 at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:27)
 at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:7053)
 at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
 at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
 at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3902)
 at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2773)
 at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
 at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)
Caused by: org.hibernate.exception.GenericJDBCException: Cannot open connection
 at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:103)
 at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:91)
 at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
 at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:29)
 at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:420)
 at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
 at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:129)
 at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:57)
 at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1290)
 at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:517)
 ... 23 more
Caused by: java.sql.SQLException: Cannot obtain connection: driverURL = jdbc:weblogic:pool:ConnectionPool9999, props = {enableTwoPhaseCommit=false, connectionPoolID=ConnectionPool9999, jdbcTxDataSource=true, dataSourceName=chinainsuranceDataSource}.
Nested Exception: java.lang.RuntimeException: Failed to setAutoCommit to true for pool connection.
 at weblogic.jdbc.wrapper.PoolConnection.init(PoolConnection.java:56)
 at weblogic.jdbc.pool.Driver.allocateConnection(Driver.java:254)
 at weblogic.jdbc.pool.Driver.connect(Driver.java:164)
 at weblogic.jdbc.jts.Driver.getNonTxConnection(Driver.java:546)
 at weblogic.jdbc.jts.Driver.connect(Driver.java:140)
 at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:344)
 at org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:81)
 at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:417)
 at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
 at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:129)
 at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:57)
 at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1290)
 at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:517)
 at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:315)
 at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:234)
 at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:93)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
 at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
 at $Proxy11.findByPrimaryKey(Unknown Source)
 at com.chinainsurance.application.common.CustomAuthenticationProcessingFilter.attemptAuthentication(CustomAuthenticationProcessingFilter.java:85)
 at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:198)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.ui.logout.LogoutFilter.doFilter(LogoutFilter.java:108)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:195)
 at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
 at org.acegisecurity.util.FilterChainProxy.doFilter(FilterChainProxy.java:148)
 at org.acegisecurity.util.FilterToBeanProxy.doFilter(FilterToBeanProxy.java:90)
 at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:27)
 at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:7053)
 at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
 at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
 at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3902)
 at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2773)
 at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
 at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)

 at weblogic.jdbc.jts.Driver.wrapAndThrowSQLException(Driver.java:497)
 at weblogic.jdbc.jts.Driver.getNonTxConnection(Driver.java:550)
 at weblogic.jdbc.jts.Driver.connect(Driver.java:140)
 at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:344)
 at org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:81)
 at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:417)
 ... 28 more
重置JDBC後,可以正常使用,但是過20多分鐘問題依舊。

我同時連線的會話卻一直正常,沒有出現過網路中斷。

心裡這時想的是肯定是應用開發有問題,本來這個測試環境就是上的新版本應用嘛。

出於職業習慣,檢查了一下alert檔案(原來是沒有做日誌監控的,現在包括測試庫,自動10分鐘日誌監控,應該可以避免問題沒有及時發現),發現有如下提示:
Fri Dec  7 12:03:23 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_7208.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec  7 12:21:34 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_7388.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec  7 13:17:40 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_8022.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec  7 14:02:10 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_10751.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []

對照時間,和發生問題的時間一樣,
靠,居然還是我資料庫的問題。
開啟其中一個跟蹤檔案:
內容擷取如下:

/opt/oracle/admin/matest/udump/matest_ora_4292.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /ORACLE/product/10.2/db_1
System name: Linux
Node name: linux-testdb
Release: 2.6.16.21-0.8-bigsmp
Version: #1 SMP Mon Jul 3 18:25:39 UTC 2006
Machine: i686
Instance name: matest
Redo thread mounted by this instance: 1
Oracle process number: 32
Unix process pid: 4292, image:

*** 2007-12-07 10:26:36.956
*** SERVICE NAME:(SYS$USERS) 2007-12-07 10:26:36.955
*** SESSION ID:(126.126) 2007-12-07 10:26:36.955
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x8, PC: [0x901111c, kkdcacr()+3280]
Registers:
%eax: 0x8ca14614 %ebx: 0xb70e601c %ecx: 0x00000000
%edx: 0x00000000 %edi: 0xb70e4608 %esi: 0x8ca1448c
%esp: 0xbff38648 %ebp: 0xbff38754 %eip: 0x0901111c
%efl: 0x00010246
  kkdcacr()+3270 (0x9011112) mov %eax,%edx
  kkdcacr()+3272 (0x9011114) add $12,%esp
  kkdcacr()+3275 (0x9011117) jmp 0x901111c
  kkdcacr()+3277 (0x9011119) mov 0x44(%esi),%edx
> kkdcacr()+3280 (0x901111c) mov 0x8(%edx),%ecx
  kkdcacr()+3283 (0x901111f) cmp 0x8(%edi),%ecx
  kkdcacr()+3286 (0x9011122) jne 0x9011161
  kkdcacr()+3288 (0x9011124) mov 0xc(%edx),%ecx
  kkdcacr()+3291 (0x9011127) cmp 0xc(%edi),%ecx
*** 2007-12-07 10:26:36.959
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Current SQL statement for this session:
select count(*) from GuPolicyCopyMain em,GuPolicyCopyRisk er,GuPolicyCopyEndorHead eh where em.policyNo=er.policyNo and em.endorSeqNo=er.endorSeqNo and er.policyNo=eh.policyNo and er.endorSeqNo=eh.endorSeqNo and em.endorSeqNo!='000'  AND (em.policyNo = 'PFNU03004403')  AND (er.uwYear = '2007')  AND (eh.validInd = '1')  AND (eh.inputDate>=to_date('2007-12-06', 'YYYY-MM-DD HH24:MI:SS'))  AND (eh.underWriteInd = '6')
----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
ksedst()+27          call     ksedst1()            1 ? 1 ?
ksedmp()+557         call     ksedst()             1 ? B72D4496 ? 0 ? 0 ?
                                                   B73E34AC ? B72B0003 ?
ssexhd()+863         call     ksedmp()             3 ? 901111C ? 63646B6B ?
                                                   28726361 ? 32332B29 ? 3038 ?
kkdcacr()+3280       signal   00000000             B ? B72C0C8C ? B72C0D0C ?
kkqvIsPrimaryForeig  call     kkdcacr()            B70EA6E0 ? 8CC50C9C ?
nKey()+79                                          915C9BB8 ? 31E ?
kkqvipr()+579        call     kkqvIsPrimaryForeig  B70EA374 ? B70EA2D8 ?
                              nKey()              
kkqvacu()+178        call     kkqvipr()            B70EA224 ? B70EA2D8 ?
kkqvcrf()+233        call     kkqvacu()            B70EA580 ? B70EA2D8 ?
                                                   BFF388F8 ?
kkqvcerq()+667       call     kkqvcrf()            B70EA580 ? BFF388F8 ?
vopqbc()+2379        call     kkqvcerq()           B70EAAE4 ? 0 ? 0 ? 0 ? 0 ?
                                                   0 ?
vopdrv()+458         call     vopqbc()             B70EAAE4 ? 0 ? 0 ? 1 ? 1 ?
                                                   0 ? 0 ? BFF38AA4 ? 0 ? 1 ?
                                                   0 ?
kkqdrv()+1145        call     vopdrv()             B70EAAE4 ? 3 ? 1 ?
opiSem()+6114        call     kkqdrv()             B70EAAE4 ? 1 ?
opiDeferredSem()+61  call     opiSem()             BFF39400 ? 9382B96C ? 1A2 ?
1                                                  1 ?
opitca()+12475       call     opiDeferredSem()     0 ? 0 ? 26 ? 0 ? BFF3E9BC ?
                                                   C24A63D ?
kksFullTypeCheck()+  call     opitca()             B71179A0 ? 8CC50C9C ?
20                                                
rpiswu2()+1061       call     00000000             BFF3F30C ? BFF3F130 ? 1 ?
                                                   13 ? 915C9D44 ? 915C9D08 ?
kksLoadChild()+1313  call     rpiswu2()            A237DAF0 ? 3D ? 915C9D44 ?
8                                                  13 ? 915C9F0C ? 3D ?
                                                   915C9D70 ? 0 ? 9573680 ? 0 ?
                                                   BFF3F30C ? 0 ?
kkslod()+213         call     kksLoadChild()       CBC2A40 ? 9A5F75E0 ?
                                                   BFF4029C ?
kglobld()+1556       call     00000000             CBC2A40 ? 9A5F75E0 ?
                                                   BFF4029C ?
kglobpn()+2419       call     kglobld()            CBC2A40 ? 915C9BB8 ?
                                                   BFF4029C ? 9FE774F0 ? 0 ?
kglpim()+263         call     kglobpn()            CBC2A40 ? 9FE774F0 ?
                                                   BFF4029C ? 1 ?
kglpin()+1110        call     kglpim()             CBC2A40 ? BFF4029C ?
                                                   9FE774F0 ?
kxsGetRuntimeLock()  call     kglpin()             CBC2A40 ? BFF4029C ?
+410                                               9FE35724 ? 9A5F75E0 ? 3 ? 3 ?
                                                   B71179C8 ?
kksfbc()+9915        call     kxsGetRuntimeLock()  CBC2A40 ? B71179A0 ?
                                                   BFF4029C ? 3 ? 1 ?
kkspsc0()+2191       call     kksfbc()             B71179A0 ? 3 ? 108 ?
                                                   BFF418FC ? 1A3 ? 0 ? 0 ? 0 ?
kksParseCursor()+45  call     kkspsc0()            B7100CE4 ? BFF418FC ? 1A3 ?
3                                                  3 ? 6 ? A4 ?
opiosq0()+2054       call     kksParseCursor()     BFF40AD8 ? CB740C8 ?
                                                   CBC2A40 ? 0 ? B7120274 ? 0 ?
kpooprx()+250        call     opiosq0()            3 ? E ? BFF40D10 ? A4 ?
kpoal8()+867         call     kpooprx()            BFF42EE8 ? BFF418FC ? 1A2 ?
                                                   1 ? 0 ? A4 ?
opiodr()+2347        call     00000000             5E ? 17 ? BFF42EE4 ?
ttcpip()+4227        call     00000000             5E ? 17 ? BFF42EE4 ? 0 ?
                                                   CC37296 ? E ?
opitsk()+1991        call     ttcpip()             CBCA240 ? 5E ? BFF42EE4 ? 0 ?
                                                   BFF423C4 ? BFF43008 ?
opiino()+1387        call     opitsk()             0 ? 0 ?
opiodr()+2347        call     00000000             3C ? 4 ? BFF43AD0 ?
opidrv()+915         call     opiodr()             3C ? 4 ? BFF43AD0 ? 0 ?
sou2o()+113          call     opidrv()             3C ? 4 ? BFF43AD0 ?
opimai_real()+212    call     sou2o()              BFF43AB4 ? 3C ? 4 ?
                                                   BFF43AD0 ?
main()+111           call     opimai_real()        2 ? BFF43B00 ?
__libc_start_main()  call     00000000             2 ? BFF43BC4 ? BFF43BD0 ?
+220                                               B7FB024D ? 0 ? B73E34E8 ?
 
--------------------- Binary Stack Dump ---------------------
 
========== FRAME. [1] (ksedst()+27 -> ksedst1()) ==========
Dump of memory from 0xB72BEE98 to 0xB72BEEA8
B72BEE90                   B72BEEA8 082E3A21          [..+.!:..]
B72BEEA0 00000001 00000001                    [........]       
========== FRAME. [2] (ksedmp()+557 -> ksedst()) ==========
上網去查詢ora-07445的內容,其中有一個建議是如下的:

oracle的BUG,在初始化檔案裡面設定引數
alter system set event = "10262 trace name context forever, level 1024000" scope=spfile;
重啟再看看還會不會產生這種錯誤。


按照這個方法作修改,重啟資料庫,10多分鐘後,問題又來了。

苦於沒有metlink帳號(付了錢的,但是過了半年還沒有拿到帳號),還是仔細看看跟蹤檔案的提示吧。

看了一會兒還是看不出什麼問題,不過每個檔案裡面都有以下的語句
Current SQL statement for this session:
select count(*) from GuPolicyCopyMain em,GuPolicyCopyRisk er,GuPolicyCopyEndorHead eh where em.policyNo=er.policyNo and em.endorSeqNo=er.endorSeqNo and er.policyNo=eh.policyNo and er.endorSeqNo=eh.endorSeqNo and em.endorSeqNo!='000'  AND (em.policyNo = 'PFNU03004403')  AND (er.uwYear = '2007')  AND (eh.validInd = '1')  AND (eh.inputDate>=to_date('2007-12-06', 'YYYY-MM-DD HH24:MI:SS'))  AND (eh.underWriteInd = '6')
引起了注意,好歹看看這句語句吧。執行這句語句,報ORA-03113的錯誤。
問題根源找到!!!!!!!!!!!


單獨查詢任何一個表都是沒有問題的。
懷疑是約束或索引有問題,先禁用掉這些表的約束試試吧。
禁用掉這些表相關的約束後,再執行這條語句,成功執行。

為了找出哪個表有問題,只得一個個表啟用約束,在看是否可以執行,終於在執行啟用GuPolicyCopyRisk的
pk_rf91時,約束一啟用,執行那條語句就有ORA-03113的錯誤。

解決辦法:
刪掉主鍵pk_rf91,
用drop constraint pk_rf91時,提示有外來鍵引用到這個主鍵,不讓刪。
先禁用掉所有的外來鍵吧,結果還是一樣的問題。
於是來一招狠的吧,drop constraint pk_rf91 cascade(刪之前查一下有哪些外來鍵應用這個約束,並且獲得生成他們的語句,這個主鍵可是有11個外來鍵引用呢!)
再重新生成主鍵,外來鍵。

按照此方法執行,到目前為止是沒有異常了。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/288166/viewspace-705058/,如需轉載,請註明出處,否則將追究法律責任。

相關文章