Bug 8572205 : CHILDCRASH, OS ERROR: 0, OTHER: ABNORMAL TERMINATION OF CHILD

Posted by panpong on 2013-02-20
 

Bug Attributes

 

Type: B - Defect
Fixed in Product Version: (none given)
Severity: 2 - Severe Loss of Service
Product Version: 10.2.0.4
Status: 33 - Suspended, Req'd Info not Avail
Platform: 23 - Oracle Solaris on SPARC (64-bit)
Created: 03-Jun-2009
Platform Version: 10
Updated: 16-Jun-2009
Base Bug: N/A
Database Version: 10.2.0.4
Affects Platforms: Generic
Product Source: Oracle
 

Related Products

 

Line: Oracle Database Products
Family: Oracle Database
Area: Oracle Database
Product: 5 - Oracle Database - Enterprise Edition
Hdr: 8572205 10.2.0.4 PCW 10.2.0.4 RACG PRODID-5 PORTID-23

Abstract: CHILDCRASH, OS ERROR: 0, OTHER: ABNORMAL TERMINATION OF CHILD

*** 06/03/09 10:03 am ***
TAR:
----
7537479.993

PROBLEM:
--------
Complete outage in 4 of the 6 instances, caused by the following issue:
=====
   
    2009-06-02 02:26:00.470: [  CRSEVT][911170] CAAMonitorHandler :: 0:Could not join /opt/oracle/product/10.2.0/crs/bin/racgwrap(check)
    category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

    2009-06-02 02:26:00.470: [  CRSEVT][911170] CAAMonitorHandler :: 0:Action Script /opt/oracle/product/10.2.0/crs/bin/racgwrap(check) timed out for ora.eprvd4244.vip! (timeout=60)
    2009-06-02 02:26:00.470: [  CRSAPP][911170] CheckResource error for ora.eprvd4244.vip error code = -2
    2009-06-02 02:35:22.561: [  CRSEVT][911158] CAAMonitorHandler :: 0:Could not join /opt/oracle/product/10.2.0/racdb_04/bin/racgwrap(check)
    category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

    2009-06-02 02:35:22.561: [  CRSEVT][911158] CAAMonitorHandler :: 0:Action Script /opt/oracle/product/10.2.0/racdb_04/bin/racgwrap(check) timed out for ora.se001p.se001p5.inst! (timeout=600)
    2009-06-02 02:35:22.561: [  CRSAPP][911158] CheckResource error for ora.se001p.se001p5.inst error code = -2
    2009-06-02 02:35:23.101: [  CRSEVT][911159] CAAMonitorHandler :: 0:Could not join /opt/oracle/product/10.2.0/racdb_04/bin/racgwrap(check)
    category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child

    2009-06-02 02:35:23.101: [  CRSEVT][911159] CAAMonitorHandler :: 0:Action Script /opt/oracle/product/10.2.0/racdb_04/bin/racgwrap(check) timed out for ora.eprvd4244.LISTENER_OFAC0P_EPRVD4244.lsnr! (timeout=600)
    2009-06-02 02:35:23.101: [  CRSAPP][911159] CheckResource error for ora.eprvd4244.LISTENER_OFAC0P_EPRVD4244.lsnr error code = -2
    2009-06-02 02:35:23.691: [  CRSEVT][911162] CAAMonitorHandler :: 0:Could not join /opt/oracle/product/10.2.0/racdb_04/bin/racgwrap(check)
    category: 1234, operation: scls_process_join, loc: childcrash, OS error: 0, other: Abnormal termination of the child
=======
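
To see at a glance which resources had their check action time out, the "timed out for" entries can be summarized per resource. The following is a minimal sketch, not an Oracle-supplied tool: the function name timed_out_checks is hypothetical, and it assumes each log entry sits on a single line in the actual CRSD log file (the excerpt above may be shown wrapped).

```shell
#!/bin/sh
# timed_out_checks (hypothetical helper): reads CRSD log text on stdin and
# prints one line per resource: "<count> <resource> <timeout-seconds>".
# Assumes each "Action Script ... timed out for <resource>! (timeout=N)"
# entry is on a single line in the real log file.
timed_out_checks() {
    sed -n 's/.*(check) timed out for \([^!]*\)! (timeout=\([0-9]*\)).*/\1 \2/p' |
        sort | uniq -c | awk '{ print $1, $2, $3 }'
}

# Example (the log path is an assumption; adjust to the local CRS home):
#   timed_out_checks < "$CRS_HOME/log/$(hostname)/crsd/crsd.log"
```
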

DIAGNOSTIC ANALYSIS:
--------------------
This issue is already addressed in Bug 6196746, which is fixed in 10.2.0.5
and 11.1.0.7. The workaround is as follows:
    =====
   
    1. Stop CRS on the Node.
   
    2. Make a copy of racgwrap, located under $ORACLE_HOME/bin and
    $CRS_HOME/bin, on the node.
   
    3. Edit the file racgwrap and change the last 3 lines from:
   
    $ORACLE_HOME/bin/racgmain "$@"
    status=$?
    exit $status
   
    to:
   
    # Line added to test fix for Bug 6196746
    exec $ORACLE_HOME/bin/racgmain "$@"
   
    4. Restart CRS and make sure that all the resources are started.
    =====
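
Steps 2 and 3 of the workaround can be sketched as a small script. This is an illustration under stated assumptions, not an Oracle-supplied procedure: the function name patch_racgwrap is hypothetical, and it assumes the wrapper's invocation line reads exactly `$ORACLE_HOME/bin/racgmain "$@"` (the usual shell form for forwarding all arguments). With `exec`, the shell replaces itself with racgmain instead of running it as a child, so the two status lines that followed are no longer needed.

```shell
#!/bin/sh
# patch_racgwrap (hypothetical helper): applies the exec change to one
# racgwrap file, keeping the original as <file>.orig.
# Assumption: the unpatched file contains the exact line held in $call.
patch_racgwrap() {
    file=$1
    call='$ORACLE_HOME/bin/racgmain "$@"'
    # Refuse to touch a file that lacks the exact unpatched invocation line.
    grep -Fx -- "$call" "$file" >/dev/null || return 1
    # Step 2 of the workaround: keep a copy of the original.
    cp -p "$file" "$file.orig" || return 1
    # Step 3: prefix the call with exec and drop the two status lines after it.
    awk -v call="$call" '
        $0 == call { print "# Line added to test fix for Bug 6196746"
                     print "exec " $0; seen = 1; next }
        seen && ($0 == "status=$?" || $0 == "exit $status") { next }
        { print }
    ' "$file.orig" > "$file"
}

# Example, with CRS stopped first (step 1):
#   patch_racgwrap "$ORACLE_HOME/bin/racgwrap"
#   patch_racgwrap "$CRS_HOME/bin/racgwrap"
```

Writing the result back over the existing file keeps its owner and permissions; the function returns non-zero, and changes nothing, when the file is already patched or does not match the expected form.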

WORKAROUND:
-----------
The workaround does NOT work when applied in a rolling fashion. The customer
CANNOT afford a complete outage of the cluster, as it is their vital
business-generating system.

RELATED BUGS:
-------------
Bug 6196746 - HUGE AND GROWING LIST OF RACG CHECK VIP PROCESSES, TIMEOUT

REPRODUCIBILITY:
----------------

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------
    I was on a conference call with the customer, who tried to implement
    this workaround in a rolling fashion; it did not work.
   
    The following note says to stop CRS on ALL nodes:
   
    732086.1 - Many Orphaned Or Hanging "racgmain" processes Running.
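
    A quick way to check a node for the symptom named in that note is to
    count the racgmain processes in the process list. The following is a
    minimal sketch; the function name count_racgmain is hypothetical, and
    what count is "too many" is left to the reader's judgment.

```shell
#!/bin/sh
# count_racgmain (hypothetical helper): counts lines of `ps -ef`-style
# input on stdin that mention racgmain. The [r] in the pattern keeps the
# awk process itself from matching when it appears in the ps output.
count_racgmain() {
    awk '/[r]acgmain/ { n++ } END { print n + 0 }'
}

# Example on a live node:
#   ps -ef | count_racgmain
```
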

24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------

DIAL-IN INFORMATION:
--------------------

IMPACT DATE:
------------

*** 06/03/09 10:11 am *** (CHG: Sta->16)
*** 06/03/09 10:11 am ***
*** 06/03/09 10:11 am ***
*** 06/03/09 10:21 am ***
*** 06/03/09 10:36 am ***
*** 06/03/09 10:38 am ***
*** 06/03/09 10:38 am *** (ADD: Impact/Symptom->DATABASE CRASH )
*** 06/03/09 10:38 am *** (ADD: Impact/Symptom->FEATURE UNUSABLE )
*** 06/03/09 10:38 am ***
*** 06/03/09 11:20 am *** (CHG: Sta->10 Asg->RDBMSREP)
*** 06/03/09 11:20 am ***
*** 06/03/09 11:21 am ***
*** 06/03/09 12:14 pm *** (CHG: Sta->16)
*** 06/03/09 12:14 pm ***
*** 06/03/09 12:45 pm *** (CHG: Sta->10)
*** 06/03/09 12:45 pm ***
*** 06/05/09 04:45 am ***
*** 06/05/09 04:45 am ***
*** 06/16/09 07:38 am *** (CHG: Sta->33)
*** 06/16/09 07:38 am ***

 
Copyright (c) 2013, Oracle. All rights reserved.    

From "ITPUB Blog", link: http://blog.itpub.net/16976507/viewspace-754334/. If you repost this, please credit the source; otherwise legal liability will be pursued.
