Disks of Dismounted DG Are Still Held/Locked by an Oracle Process on 11.2.0.3 (Doc ID 1485163.1)

Posted by rongshiyuan on 2015-02-27

In this Document

Symptoms
Cause
Solution
References


Applies to:

Oracle Database - Enterprise Edition - Version 11.1.0.7 to 11.2.0.3 [Release 11.1 to 11.2]
Information in this document applies to any platform.

Symptoms

You are on 11.2.0.3, or you are on 11.2.0.2 with the fix for the same issue as described in MOS Doc ID 1306574.1.

Even so, the disks of the dismounted diskgroup are still held open by processes.

This prevents you from copying the volume groups (VGs) underlying the dismounted diskgroup.
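
A quick way to confirm the symptom on one of the affected devices is sketched below; the diskgroup name and device path are illustrative only (they match the example used later in this note):

$ sqlplus / as sysasm
SQL> alter diskgroup DATA dismount;
SQL> exit

$ fuser /dev/vgvbsdb1r/rlvol1
(the diskgroup is dismounted, yet an ASM process is still reported as a holder of the device)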

 

In the ASM alert.log you might see the following:

ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 4096
WARNING: Read Failed. group:0 disk:5 AU:0 offset:0 size:4096

and in the OS logs:

May 16 12:59:12 myoda1 kernel: Buffer I/O error on device dm-35, logical block 8

 

Cause

The diskgroup DATA, used for the test, has these disks:

NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/vgvbsdb1r/rlvol1
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/vgvbsdb1r/rlvol2
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/vgvbsdb1r/rlvol3
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/vgvbsdb1r/rlvol4
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/vgvbsdb2r/rlvol1
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/vgvbsdb2r/rlvol2

There is another diskgroup SYSTEM that has the OCR and Votefiles.
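
At this point the mount state of each diskgroup can be confirmed from the local ASM instance; a minimal sketch (DATA is expected to report DISMOUNTED while SYSTEM stays MOUNTED):

SQL> select name, state from v$asm_diskgroup;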

 

The lsof/fuser commands show that the disks are still in use:

$ lsof
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
oracle 11920 oracle 2049u CHR 64,0x130001 0t0 697 /dev/vgvbsdb1r/rlvol1
oracle 11920 oracle 2050u CHR 64,0x130002 0t0 1280 /dev/vgvbsdb1r/rlvol2
oracle 11920 oracle 2051u CHR 64,0x130003 0t0 1609 /dev/vgvbsdb1r/rlvol3
oracle 11920 oracle 2052u CHR 64,0x130004 0t0 1628 /dev/vgvbsdb1r/rlvol4
oracle 11920 oracle 2053u CHR 64,0x140001 0t0 5242 /dev/vgvbsdb2r/rlvol1
oracle 11920 oracle 2054u CHR 64,0x140002 0t0 5246 /dev/vgvbsdb2r/rlvol2

$ fuser /dev/vgvbsdb1r/rlvol1
/dev/vgvbsdb1r/rlvol1: 11920o

 

IMPORTANT: Note that 'asmcmd lsod' does not list these devices as locked. It lists only the disks of the SYSTEM diskgroup, which is mounted.
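
For comparison, this is roughly what the 'asmcmd lsod' check looks like; the -G diskgroup filter is assumed from the 11.2 asmcmd reference:

$ asmcmd lsod -G SYSTEM
(lists the open devices of the mounted SYSTEM diskgroup)

$ asmcmd lsod -G DATA
(lists nothing for the dismounted DATA diskgroup, even though fuser/lsof still report holders)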

 

The process keeping these descriptors open is:

oracle 11920 1 0 Mar 26 ? 0:00 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

In the system state dump the same process can be seen, and it is identified as serving 'oraagent.bin':

PROCESS 23:
----------------------------------------
SO: 0xc00000003fcdc780, type: 2, owner: 0x0000000000000000, flag:
INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0xc00000003fcdc780, name=process, file=ksu.h LINE:12616 ID:, pg=0
(process) Oracle pid:23, ser:1, calls cur/top:
0x0000000000000000/0xc00000003f5e4718
flags : (0x0) -
flags2: (0x0), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 27
last post received-location: ksa2.h LINE:289 ID:ksasnr
last process to post me: c00000003fcd7410 1 6
last post sent: 0 0 26
last post sent-location: ksa2.h LINE:285 ID:ksasnd
last process posted by me: c00000003fcc9b20 1 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0xc00000003fe49480
O/S info: user: oracle, term: UNKNOWN, ospid: 11920
OSD pid info: Unix process pid: 11920, image: oracle@shpdbs15 (TNS V1-V3)
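
For completeness, a system state dump such as the one quoted above can be generated from the ASM instance with oradebug; a minimal sketch (dump level 266 is a common choice, adjust as advised by Oracle Support):

$ sqlplus / as sysasm
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266
SQL> oradebug tracefile_name
(the last command shows the trace file where the dump was written)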

 

This is discussed in

Bug 13869294 : DISMOUNTING DISKGROUP IN ASM BUT DEVICE STILL IN USE BY AN ASM PROCESS

closed as duplicate of

    Bug 14223113 : ASM DISK NOT RELEASED BY CRSD.BIN PROCESS AFTER DROPPING DISK
 

Solution

Bug 14223113 is fixed in 11.2.0.4 and later.

The fix for defect 14223113 cannot be backported to lower versions due to its complexity; therefore, use the workarounds below.

This issue is also known to affect 11.1.0.7, where the fix likewise cannot be backported.

WORKAROUNDS

1) One workaround is to restart the clusterware.

2) Another workaround is stated below:

     2.1] Use the fuser command to find the process that still holds the dismounted disk.

     2.2] Run the following SQL in the local ASM instance to identify the corresponding clusterware daemon process:

-- FG_OSPID : OS PID of the ASM foreground (shadow) process reported by fuser/lsof
-- CL_OSPID : OS PID of the clusterware client process that opened the connection
-- CL_NAME  : program name of that client (e.g. oraagent.bin, crsd.bin)
select p.spid FG_OSPID, s.process CL_OSPID, s.program CL_NAME
from v$session s, v$process p
where p.addr = s.paddr
and p.spid = '&ospid_seen_in_fuser';

     2.3] Take the following corrective action to close the stale FDs. This can be done while the database is running.
    Please make sure to compare the output of "crsctl stat res -t" from before and after running these commands to ensure correctness; a consolidated sketch of these steps is given at the end of this section.


o If the daemon process is crsd.bin, run the following commands as the clusterware user to restart crsd.bin:

$ crsctl stop res ora.crsd -init
$ crsctl start res ora.crsd -init

 

o If the daemon process is oraagent.bin, verify whether the agent is a CRSD agent or an OHASD agent by running the following:

$ grep <CL_OSPID> $ORACLE_HOME/log/`hostname`/agent/*/*/*pid
(CL_OSPID is the clusterware client PID taken from the query output above)

 

o If the agent is a CRSD agent (pid file found in the agent/crsd/* directory), restarting crsd.bin using the above commands should resolve the issue.
o If the agent is an OHASD agent (pid file found in the agent/ohasd/* directory), run "kill -9" against the oraagent.bin process and it will be restarted automatically.

 

IMPORTANT: If the holding process is any other process, such as a background process, then a restart of the instance might be needed.
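
Putting workaround 2 together, the sequence below is a minimal sketch for the oraagent.bin case described in this note. The device path, Grid home location and the PID 11920 are illustrative and must be adapted to your environment.

# 1) Capture the clusterware resource state so it can be compared afterwards
$ crsctl stat res -t > /tmp/crs_res_before.txt

# 2) Find the ASM foreground process that still holds a disk of the dismounted diskgroup
$ fuser /dev/vgvbsdb1r/rlvol1
/dev/vgvbsdb1r/rlvol1: 11920o

# 3) In the local ASM instance, map that ospid to its clusterware client
$ sqlplus / as sysasm
SQL> select p.spid FG_OSPID, s.process CL_OSPID, s.program CL_NAME
     from v$session s, v$process p
     where p.addr = s.paddr
     and p.spid = '11920';

# 4a) If CL_NAME is crsd.bin, or an oraagent.bin whose pid file is under agent/crsd/*,
#     restart crsd.bin as the clusterware user:
$ crsctl stop res ora.crsd -init
$ crsctl start res ora.crsd -init

# 4b) If it is an oraagent.bin whose pid file is under agent/ohasd/*,
#     kill it and let the clusterware respawn it:
$ grep <CL_OSPID> $ORACLE_HOME/log/`hostname`/agent/*/*/*pid
$ kill -9 <CL_OSPID>

# 5) Verify that nothing regressed and that the device has been released
$ crsctl stat res -t > /tmp/crs_res_after.txt
$ diff /tmp/crs_res_before.txt /tmp/crs_res_after.txt
$ fuser /dev/vgvbsdb1r/rlvol1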

References

NOTE:402526.1 - Asm Devices Are Still Held Open After Dismount or Drop
BUG:14223113 - ASM DISK NOT RELEASED BY CRSD.BIN PROCESS AFTER DROPPING DISK
BUG:13869294 - DISMOUNTING DISKGROUP IN ASM BUT DEVICE STILL IN USE BY AN ASM PROCESS
 

Document Details

 
Type: PROBLEM
Status: PUBLISHED
Last Major Update: 27-Aug-2014
Last Update: 24-Oct-2014
 

Related Products

 
Oracle Database - Enterprise Edition
     
 

From the ITPUB blog. Link: http://blog.itpub.net/17252115/viewspace-1442634/. Please credit the source when reposting; otherwise legal liability may be pursued.
