No_expand hint 解決CBO針對where...or..語句中的cardinality計算問題

yezhibin發表於2009-12-16
        在where...or..語句中,CBO在計算關聯cardinality數量中,會產生計算的偏差,從而造成語句效率低下,通常我們解決的辦法:
       1、重寫SQL語句
       2、加no_expand hint

      以下是從原理上分析CBO造成計算cardinality錯誤的原因。

     1、建立測試指令碼

SQL>execute dbms_random.seed(0)
SQL>create table t1
           as
           select
                 trunc(dbms_random.value(0, 30 ))    join1,
                 trunc(dbms_random.value(0, 50 ))    join2,
                 lpad(rownum,10)                v1,
                  rpad('x',100)                padding
            from
                 all_objects
            where
                rownum <= 10000;

SQL>create table t2
          as
          select
                trunc(dbms_random.value(0, 40 ))    join1,
                trunc(dbms_random.value(0, 40 ))    join2,
                lpad(rownum,10)                v1,
                rpad('x',100)                padding
          from
                all_objects
          where
               rownum <= 10000;

2、統計分析(略)

3、檢視執行計劃
SQL>set autotrace traceonly explain
SQL>select
                 t1.v1, t2.v1
           from
                 t1,
                 t2
           where
                   t2.join1 = t1.join1
           or    t2.join2 = t1.join2;

Execution Plan
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |  2125K|    68M|   210  (24)| 00:00:03 |
|   1 |  CONCATENATION         |              |              |            |                   |                  |
|*  2 |   HASH JOIN                    |              |  2000K|     64M|   102  (22)| 00:00:02 |
|   3 |    TABLE ACCESS FULL|       T1   | 10000 |   166K|      40   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL|       T2   | 10000 |   166K|      40   (0)| 00:00:01 |
|*  5 |   HASH JOIN                    |               |    125K|  4150K|   108  (26)| 00:00:02 |
|   6 |    TABLE ACCESS FULL|        T1   | 10000 |   166K|    40   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL|        T2   | 10000 |   166K|    40   (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T2"."JOIN2"="T1"."JOIN2")
   5 - access("T2"."JOIN1"="T1"."JOIN1")
       filter(LNNVL("T2"."JOIN2"="T1"."JOIN2"))

分析並計算:
A、SQL語句變成CONCATENATION方式,從predicate information中看到filter(LNNVL( )),lnnvl(predicate) 如果predicate 是 false 或 nul,將返回True
 
B、第一個hash join的Rows計算
   
Join Selectivity(t2.join2=t1.join2)=
                       ((t1.num_rows(join2) - t1.null(join2))/t1.num_rows) *
                       ((t2.num_rows(join2) - t2.null(join2))/t2.num_rows) *
                        1/max(t1.distinct_value(join2), t2.distinct_value(join2)
                    =((10000-0)/10000) * (10000-0)/10000) * 1/max(40,50)
                    =1/50
Join cardinality(t2.join2=t1.join2) =
                           filter(t1) * filter(t2)* Join Selectivity(t2.join2=t1.join2)
                      = 10000 * 10000 * 1/50
                      =2000000 = 2000K
與執行計劃id =2 輸出結果一致

 C、第二個hash join的Rows計算
    從以上可以看出包含兩個預測:
              5 - access("T2"."JOIN1"="T1"."JOIN1")
                    filter(LNNVL("T2"."JOIN2"="T1"."JOIN2"))
     所以
      Join Selectivity= Join Selectivity(t2.join1=t1.join1) *
                                    Join Selectivity(LNNVL(t2.join2 = t1.join2))

      如上計算方法
      Join Selectivity(t2.join1=t1.join1) = 1/max(30,40) = 1/40

      而Join Selectivity(LNNVL(t2.join2 = t1.join2))計算是以5%(捆綁變數方式計算)
  
      從而:
         join Selectivity = 1/40*5%=1/800
 
         Join cardinality = 10000 * 10000 * 1/800 = 125K

 不難看出,造成計算偏差的是由於CBO在進行預測的時候,轉化了filter,並評估其Selectivity為5%造成的結果,從而使整個執行計劃中的rows值與實際相差巨大。

4、解決辦法1:重寫SQL語句

SQL>select    t1.v1, t2.v1
           from  t1, t2
           where
                  t2.join2 = t1.join2
           union all
           select    t1.v1, t2.v1
           from t1, t2
           where
                      t2.join1 = t1.join1
           and    t2.join2 != t1.join2;

Execution Plan
-----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |  4450K|   132M|   210  (62)| 00:00:03 |
|   1 |  UNION-ALL                       |             |              |             |                   |                    |
|*  2 |   HASH JOIN                      |             |  2000K|     53M|   102  (22)| 00:00:02 |
|   3 |    TABLE ACCESS FULL  |      T1   | 10000 |   136K|      40   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   |     T2   | 10000 |   136K|      40   (0)| 00:00:01 |
|*  5 |   HASH JOIN                       |             |  2450K|    79M|     108  (26)| 00:00:02 |
|   6 |    TABLE ACCESS FULL   |      T1   | 10000 |   166K|      40   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL   |      T2   | 10000 |   166K|      40   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T2"."JOIN2"="T1"."JOIN2")
   5 - access("T2"."JOIN1"="T1"."JOIN1")
       filter("T2"."JOIN2"<>"T1"."JOIN2")

分析並計算:

A. Join Selectivity(t2.join2 = t1.join2)的計算公式如上所描述等於1/50
     Join Cardinanlity(t2.join2 = t1.join2) = 2000K

B. 關鍵是在filter中已經發生了變化,從filter(LNNVL("T2"."JOIN2"="T1"."JOIN2"))變成了filter("T2"."JOIN2"<>"T1"."JOIN2"),所以也造成了計算的差異

   Join Selectivity( ID=5) = Join Selectivity (t2.join1 = t1.join1) *
                                              Join Selectivity(t2.join2<>t1join2)
  
   Join Selectivity (t2.join1 = t1.join1) = 1/max(30,40) = 1/40
   Join Selectivity(t2.join2<>t1join2) = 1 - Join Selectivity(t2.join2 =t1join2)
                                                               = 1-1/50=49/50

   Join Selectivity( ID=5) = 1/40*49/50 = 49/2000 = 0.0245

   Join Cardinanlity(ID=5) = 10000 * 10000 * 0.0245 =2450K
  
5、解決辦法2:新增no_expand hint
SQL>select  /*+ no_expand */
                   t1.v1, t2.v1
            from  t1,  t2
           where
                   t2.join1 = t1.join1
           or    t2.join2 = t1.join2;
          
  Execution Plan
---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name  |  Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |               |  4450K|   144M|   381K  (1)| 01:16:22 |
|   1 |  NESTED LOOPS            |               |  4450K|    144M|   381K  (1)| 01:16:22 |
|   2 |   TABLE ACCESS FULL |        T1   | 10000 |    166K|    40   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL |        T2   |     445 |   7565 |    38   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."JOIN1"="T1"."JOIN1" OR "T2"."JOIN2"="T1"."JOIN2")

分析並計算
Join Selectivity(t2.join1= t1.join1 or t2.join2 = t1.join2)
                = Join Selectivity(t2.join1= t1.join1) +
                   Join Selectivity(t2.join2 = t1.join2) -
                   Join Selectivity(t2.join1= t1.join1) *
                   Join Selectivity(t2.join2 = t1.join2)

Join Selectivity(t2.join1 = t1.join1) = 1/max(30,40) = 1/40
Join Selectivity(t2.join2 = t2.join2) = 1/max(40,50) = 1/50

從而
  Join Selectivity(t2.join1= t1.join1 or t2.join2 = t1.join2)
                  = 1/40+1/50-1/40*1/50=89/2000=0.0445
   
  Join Cardinality (ID=3) = 10000 * 0.0445=445

NL總的Cardinality = 445 * 10000 = 4450k

所以no_expand 的含義是不執行OR-expansion (例如,不使用Concatenation)

注:以上測試指令碼來自jonathan lewis 的《CBO》

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/354732/viewspace-622563/,如需轉載,請註明出處,否則將追究法律責任。

相關文章