[oracle@rh2 ~]$ oerr ora 10357
10357, 00000, "turn on debug information for direct path"
// *Cause:
// *Action:  turn on debug information for direct path

The cause of this issue was identified as unpublished Bug 9650718

In the bug, it was found that during cleanup from a direct path load, qesmm context was freed
without the pointer to the context being set to zero. Because the pointer was non zero,
there was code that tried to use the context.

To help in diagnosing the issue, the following was used:

This command will enable tracing for direct path operations in the server:
ALTER SYSTEM SET EVENTS `TRACE [DIRPATH_LOAD] DISK=high`;
This command will enable tracing for the I/O component of direct path load:
alter system set events `10357 trace name context forever, level 12`;




klmalp: heap=0x7f9bd14efea8 size=72 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eab98
klmalp: heap=0x7f9bd14efea8 size=224 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eabe0
klmalp: heap=0x7f9bd14efea8 size=336 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eacc0
klmalp: heap=0x7f9bd14efea8 size=80 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eae10
klmalp: heap=0x7f9bd14efea8 size=8168 comment=kdobjrbb mem=0x7f9bd14edbc0
klmalp: heap=0x7f9bd14efea8 size=1150 comment=ktbbhs:kdbh mem=0x7f9bd14eb230
klmalp: heap=0x7f9bd14efea8 size=432 comment=kcbl_structure_instance mem=0x7f9bd14eb6b0
klmalp: heap=0x7f9bd14efea8 size=64 comment=kllcqc:kllcq mem=0x7f9bd14eb860
klmalp: heap=0x7f9bd14efea8 size=1312 comment=kllcqc:kllcqslt mem=0x7f9bd14ecb40
kcblin: lbs=0x7f9bd14eb6b0 flg=104 slt=(nil) cnt=2 sz=262144 st_obj=0x8b0cc8c0 fb=0
kcblin: state objects are: Call=8b0cc8c0, Current Call=8b0cc8c0, Session=8b082f18, Proc=8bae46c8
kdblil2<-klclil1r<-qerltFRop<-qercoRop<-kdstf0000101km<-kdsttgr<-qertbFetch<-qercoFetch<-rwsfcd<-qerltFetch
<-insdlexe<-insExecStmtExecIniEngine<-insexe<-opiexe<-kpoal8<
-opiodr<-ttcpip<-opitsk<-opiino<-opiodr
klmalp: heap=0x7f9bd14efea8 size=48 comment=kdblix:klixllkey mem=0x7f9bd14ed060
klmalp: heap=0x7f9bd14efea8 size=528 comment=kcblinlm mem=0x7f9bd14ed090
kcblnb: lbs=d14eb6b0, slt=0, d=10dd2a9, nwr=0, cnt=0
klmalp: heap=0x7f9bd14efea8 size=262656 comment=kllcqgf:kllsltba mem=0x7f9bd135de00
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2aa, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ab, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ac, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ad, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ae, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2af, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b0, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b1, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b2, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b3, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b4, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b5, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b6, nwr=0, cnt=0
kcblcow: lbs=d14eb6b0, nslots=2, nwr=1
kcblco:lbs=d14eb6b0 slt=d14ecb40 typ=1 afn=4 blk=905897 d=10dd2a9 cnt=17 buf=d135de00 rq=d14ecc18fob=898375c8
kcblco ret: lbs=d14eb6b0 slt=d14ecb40 type=1 afn=4 blk=905897 cnt=17 buf=d135de00 rq=d14ecc18 fob=898375c8, wait=0, more=-783370504, wrc=0
kcblcow: slt=d14ecb40 dba=10dd2a9, sz=32, blks=23, st=1, idx=0
kcblio: lbs=d14eb6b0 slt=d14ecdd0 nslt=2, d=10dd2c0, st=2, fbon=0, flg=104
kcblio: Logging lbs=d14eb6b0 slt=d14ecdd0 nslt=2, d=10dd2c0, st=2
kcblgr: lbs=d14eb6b0 slt=d14ecdd0, rd=10dd2c0, flg=104, blks=32
kcbldio:lbs=d14eb6b0 flg=104 slt=d14ecdd0 typ=1 async=16 afn=4 blk=905920 cnt=20 buf=d13ade00 rq=d14ecea8 fob=0 blks=32 st=2
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e1, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e2, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e3, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e4, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e5, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e6, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e7, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e8, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e9, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ea, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2eb, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ec, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ed, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ee, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ef, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f0, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f1, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f2, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f3, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f4, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f5, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f6, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f7, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f8, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f9, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fa, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fb, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fc, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fd, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fe, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ff, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd300, nwr=1, cnt=23

[oracle@rh2 ~]$ oerr ora 10351
10351, 00000, "size of slots"
// *Cause:
// *Action:  sets the size of slots to use
// *Comment: a slot is a unit of I/O and this factor controls the size
// *Comment: of the IO.

alter session set events `10351 trace name context forever, level 128`;

level 128 -> direct path write max block 128

I generated a new run of the big testcase with event 10357, Patch 4417285
applied, manual workarea_size_policy, sort_area_size=50000000,
db_file_multiblock_read_count=16 and event 10351 with level 128.

I tried it with disk_asynch_io=TRUE and FALSE just to be certain this is not
something related to the async.

In the trace files I see something very peculiar. The slots size is 128 as
expected and I see many writes of 128 blocks but not all of them are and they
look like the they come in clusters. A few 128 writes, then a lot smaller of
different sizes but mainly less than 16 blocks and then another cluster of
big ones and so on.

kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c991, sz=128, blks=1, st=3, idx=15
kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c991, sz=128, blks=1, st=3, idx=15
kcblcow: dba=100c990, sz=128, blks=1, st=3, idx=0
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100ca3a, sz=128, blks=1, st=3, idx=3
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100ca3a, sz=128, blks=1, st=3, idx=3
kcblcow: dba=100ca39, sz=128, blks=1, st=3, idx=4

but
it is possible that there are other factor out of our control that forces
Oracle to stop adding blocks to the slot and write small batches.

In conclusion, in order to have the least ammount of direct operations and
have the maximum possible read/write batches these are the parameters to set
:

alter session set events `10351 trace name context forever, level 128`;
alter session set workarea_size_policy=manual;
alter session set sort_area_size=50000000;