which situation IN is better than exist, and vice versa.

star_guan2008發表於2008-06-10

You Asked (Jump to Tom's latest followup)

Tom:

can you give me some example at which situation
IN is better than exist, and vice versa.


and we said...

Well, the two are processed very very differently.

Select * from T1 where x in ( select y from T2 )

is typically processed as:

select *
  from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;

The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then
joined to the original table -- typically.


As opposed to

select * from t1 where exists ( select null from t2 where y = x )

That is processed more like:


   for x in ( select * from t1 )
   loop
      if ( exists ( select null from t2 where y = x.x )
      then
         OUTPUT THE RECORD
      end if
   end loop

It always results in a full scan of T1 whereas the first query can make use of
an index on T1(x).


So, when is where exists appropriate and in appropriate?

Lets say the result of the subquery
    ( select y from T2 )

is "huge" and takes a long time.  But the table T1 is relatively small and
executing ( select null from t2 where y = x.x ) is very very fast (nice index on
t2(y)).  Then the exists will be faster as the time to full scan T1 and do the
index probe into T2 could be less then the time to simply full scan T2 to build
the subquery we need to distinct on.


Lets say the result of the subquery is small -- then IN is typicaly more
appropriate.


If both the subquery and the outer table are huge -- either might work as well
as the other -- depends on the indexes and other factors.

-------------------------------------------------------------------------------------------
select a.jsjdm
  from  a
 where  not exists (select /*+ hash_aj */ 1
                         from  b
                        where a.jsjdm=b.jsjdm)

select /*+  use_hash(m n) no_merge(m) no_merge(n)*/m.jsjdm
  from (select jsjdm
          from  a
         where a.kydjrq < to_date('20080101', 'yyyymmdd')
           and a.nsrzt = '10'
           and substr(a.djzclxdm, 1, 1) != '4') m,
       (select distinct jsjdm
          from  b
         where b.zyrq >= to_date('20070101', 'yyyymmdd')
           and b.zyrq < to_date('20080101', 'yyyymmdd')
           and substr(b.sklxdm, 1, 1) in ('1', '2')
           and substr(b.zwbs, 2, 1) = '1') n
 where m.jsjdm = n.jsjdm(+)
  and n.jsjdm is null

select count(jsjdm)
  from a
 where a.kydjrq < to_date('20080101', 'yyyymmdd')
   and a.nsrzt = '10'
   and substr(a.djzclxdm, 1, 1) != '4'
minus (select distinct jsjdm
         from b
        where b.zyrq >= to_date('20070101', 'yyyymmdd')
          and b.zyrq < to_date('20080101', 'yyyymmdd')
          and substr(b.sklxdm, 1, 1) in ('1', '2')
          and substr(b.zwbs, 2, 1) = '1')

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/10742223/viewspace-343294/,如需轉載,請註明出處,否則將追究法律責任。

相關文章