March 12, 2007

Filed under: — Jonathan Lewis @ 9:52 pm UTC Mar 12,2007

The following question appeared a little while ago on comp.databases.oracle.server, with reference to Oracle 10.2.0.1:

drop table t1 purge;      

create table t1
as
select * from all_objects;       

create index t1_fbi1   on t1(lower(object_name));
create index t1_i1 on t1(object_name);       

-- now generate statistics

Given the above data and indexes, why do the following two queries show significantly different execution plans ?

SQL> set autotrace traceonly explain     

SQL> select  max(object_name)
  2  from    t1
  3  ;     

Execution Plan
----------------------------------------------------------
Plan hash value: 1743745495     

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |     1 |    25 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |       |     1 |    25 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I1 | 46531 |  1136K|     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------     

SQL> select  max(lower(object_name))
  2  from    t1
  3  ;     

Execution Plan
----------------------------------------------------------
Plan hash value: 3308075536     

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |    25 |    65   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |         |     1 |    25 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_FBI1 | 46531 |  1136K|    65   (2)| 00:00:01 |
---------------------------------------------------------------------------------

Note how the query that can use the simple index uses the special min/max optimisation for index range/full scans that appeared in the 8.1 timeline, but the query that can use the function-based index does a fast full scan and sort of the index - and a check of the resource usage shows that autotrace is telling us the truth about the plans in both cases.

To investigate this type of problem, one of my first “tricks” is simply to tell the optimizer to do what I think it should do. In this case, give it a hint to use the index properly.

SQL> select  /*+ index(t1) */
  2  	     max(lower(object_name))
  3  from    t1
  4  ;    

Execution Plan
----------------------------------------------------------
Plan hash value: 1546143440    

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     1 |    25 |   235   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE             |         |     1 |    25 |            |          |
|   2 |   FIRST ROW                 |         | 46531 |  1136K|   235   (1)| 00:00:03 |
|   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 46531 |  1136K|   235   (1)| 00:00:03 |
---------------------------------------------------------------------------------------

The path I was expecting has appeared - with an interesting “first row” operation and a surprising cost ! A quick check of resource usage shows that Oracle used the path given, with minimal resource usage, confirming that the cost is a serious miscalculation. So where does that cost come from.

Change the query slightly, and you’ll see:

SQL> select	/*+ index(t1) */
  2  	object_name
  3  from	t1
  4  ;   

Execution Plan
----------------------------------------------------------
Plan hash value: 2969740442   

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       | 46531 |  1136K|   235   (1)| 00:00:03 |
|   1 |  INDEX FULL SCAN | T1_I1 | 46531 |  1136K|   235   (1)| 00:00:03 |
--------------------------------------------------------------------------

The index-hinted query to find the max() did actually use the min/max access path at run-time - we can see that from the resource usage - but the optimizer used the cost for a simple full scan, which is rather expensive and made the default behaviour switch to the fast full scan with sort. It’s some sort of bug in the optimizer.

Interestingly, we can get the min/max plan to appear by adding a predicate to the query that (notionally) addresses any problems that might be caused by nulls:

SQL> select  max(lower(object_name))
  2  from    t1
  3  where   lower(object_name) is not null
  4  ;  

Execution Plan
----------------------------------------------------------
Plan hash value: 1546143440  

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     1 |    25 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |         |     1 |    25 |            |          |
|   2 |   FIRST ROW                 |         | 46531 |  1136K|     2   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 46531 |  1136K|     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(LOWER("OBJECT_NAME") IS NOT NULL)

So ultimately we have to go to the 10053 trace file to try and pin down the exact nature of the bug. In the trace file, we can see that the version of code with the ‘not null’ predicate is the only one that results in a section of the Single Table Access Path calculation that examines the possibility of the min/max path.

  Access Path: index (Min/Max)
    Index: T1_FBI1
    resc_io: 2.00  resc_cpu: 14443
    ix_sel: 2.1491e-005  ix_sel_with_filters: 2.1491e-005
    Cost: 2.00  Resp: 2.00  Degree: 1

From this, we can conclude the problem lies in the optimizer failing to spot the option for using the min/max path in the default scenario, rather than the optimizer doing the wrong calculation for the path.

It doesn’t however, tell us why the run-time engine can apparently use the min/max optimisation when the optimizer obeyed our hint and generated a plan that included a full index scan. The trace file in the hinted case only showed the full scan calculation, it didn’t suggest a min/max, nor a “first row”, nor a descending scan.

And just one final thought - the problem shouldn’t have anything to do with the fact that we have supplied an explicit ‘not null’ predicate. If this were an example of a “nulls not in index” problem, the optimizer would have to fall back to using a tablescan in every case.

And at this point, I usually pass the buck to Oracle support.

FBI? MAX? INDEX FULL SCAN (MIN/MAX)?

March 12, 2007

相關文章