oracle實驗記錄 (oracle單表選擇率與基數計算(1))
單表選擇率與 基數(cardinality)有關係,也就是執行計劃中的CARD 資訊
CARD表示oracle 認為此操作將返回多少rows
基本的CARD 計算 就等於 num_rows*選擇率
CARD 對於otimizer 選初始的連線順序,選INDEX 都有很大影響,所以能夠看明白計算過程方法,對優化是很有幫助的
實驗 單表中選擇率 (本次實驗學習參考cost-based oracle)
SQL> create table t1(a int);
表已建立。
SQL> ed
已寫入 file afiedt.buf
1 declare
2 begin
3 for i in 1..10000 loop
4 insert into t1 values(i);
5 end loop;
6 commit;
7* end;
SQL> /
PL/SQL 過程已成功完成。
SQL> execute dbms_stats.gather_table_stats('SYS','T1');
PL/SQL 過程已成功完成。
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T1';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T1 .0001 0 10000
SQL> set autotrace traceonly explain
SQL> select count(*) from t1;
執行計劃
----------------------------------------------------------
Plan hash value: 3724264953
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 6 (0)| 00:00:01 |
-------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
~沒有謂詞 沒有過濾 掃描所有行 所以card=10000
SQL> select count(*) from t1 where a=2;
執行計劃
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=2)
card=num_row*1/num_distinct=1
SQL> select 10000/10000 from dual;
10000/10000
-----------
1
如果要是這個表存在HISTOGRAM 那麼,OPTIMIZER將使用DENSITY 列 ,CARD=NUM_rows*density
以上的如果 謂詞 就是COLUMN=字面值 那麼 card=num_row/num_distinct (=num_row*1/num_distinct)如果有histogram card=NUM_rows*density
~~~~~~~~~~~~~~~~~~~~~~~~~~沒有統計資訊,使用動態採集
SQL> create table t2 (a int);
表已建立。
SQL> declare
2 begin
3 for i in 1..10000 loop
4 insert into t2 values(i);
5 end loop;
6 commmit;
7 end;
8 /
commmit;
SQL> ed
已寫入 file afiedt.buf
1 declare
2 begin
3 for i in 1..10000 loop
4 insert into t2 values(i);
5 end loop;
6 commit;
7* end;
8 /
PL/SQL 過程已成功完成。
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T2';~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~沒有統計資訊
未選定行
SQL> show parameter dyna
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~oracle將使用動態採集
SQL> select count(*) from t2;
執行計劃
----------------------------------------------------------
Plan hash value: 3321871023
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T2 | 10000 | 6 (0)| 00:00:01 |
-------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement
SQL> select count(*) from t2 where a=2;
執行計劃
----------------------------------------------------------
Plan hash value: 3321871023
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | TABLE ACCESS FULL| T2 | 1 | 13 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=2)~~~~~~~~
Note
-----
- dynamic sampling used for this statement~~~~~~~~~~~~~~~~~~~~~可以看到 ORACLE 利用動態採集的資訊 也是很準確的
Dynamic Sampling Levels(以下是動態採集 各個LEVEL 說明)
The sampling levels are as follows if the dynamic sampling level used is from a cursor hint or from the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter:
Level 0: Do not use dynamic sampling.
Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this
unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed
table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of
dynamic sampling blocks (32).
Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.
Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some
predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed
tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.
Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns.
The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default
number of dynamic sampling blocks.
Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of
dynamic sampling blocks respectively.
Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.
The sampling levels are as follows if the dynamic sampling level for a table is set using the DYNAMIC_SAMPLING optimizer hint:
Level 0: Do not use dynamic sampling.
Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).
Levels 2, 3, 4, 5, 6, 7, 8, and 9: The number of blocks sampled is 2, 4, 8, 16, 32, 64, 128, or 256 times the default number of dynamic sampling blocks
respectively.
Level 10: Read all blocks in the table.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SQL> alter system set optimizer_dynamic_sampling=1;
系統已更改。
SQL> alter system set optimizer_dynamic_sampling=1; ~~~~關閉動態採集
系統已更改。
SSQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
會話已更改。
SQL> select count(*) from t2;
執行計劃
----------------------------------------------------------
Plan hash value: 3321871023
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T2 | 82 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------
SQL> select count(*) from t2 where a=2;
執行計劃
----------------------------------------------------------
Plan hash value: 3321871023
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | TABLE ACCESS FULL| T2 | 1 | 13 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=2)
~~從trace中可以看到 資訊十分不準確
Table Stats::
Table: T2 Alias: T2 (NOT ANALYZED)
#Rows: 82 #Blks: 1 AvgRowLen: 100.00
***************************************
SINGLE TABLE ACCESS PATH
Table: T2 Alias: T2
Card: Original: 82 Rounded: 82 Computed: 82.00 Non Adjusted: 82.00
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 19421
Resp_io: 2.00 Resp_cpu: 19421
Best:: AccessPath: TableScan
Cost: 2.00 Degree: 1 Resp: 2.00 Card: 82.00 Bytes: 0
Table Stats::
Table: T2 Alias: T2 (NOT ANALYZED) ~~~~~~~~~~~沒統計資訊(按公式算的話oracle 自己決定num_distinct 為100)
#Rows: 82 #Blks: 1 AvgRowLen: 100.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER) NO STATISTICS (using defaults)
AvgLen: 22.00 NDV: 3 Nulls: 0 Density: 0.39024
Table: T2 Alias: T2
Card: Original: 82 Rounded: 1 Computed: 0.82 Non Adjusted: 0.82
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 23521
Resp_io: 2.00 Resp_cpu: 23521
Best:: AccessPath: TableScan
Cost: 2.00 Degree: 1 Resp: 2.00 Card: 0.82 Bytes: 0
````````````````````````````````````````````````````````````````````
一個問題
關於 列中 值 嚴重不均勻
create table t3(a int)
SQL> ed
已寫入 file afiedt.buf
1 declare
2 begin
3 for i in 1..1000 loop
4 insert into t3 values(1);
5 end loop;
6 for i in 2..9001 loop
7 insert into t3 values(i);
8 end loop;
9 commit;
10* end;
11 /
PL/SQL 過程已成功完成。
SQL> execute dbms_stats.gather_table_stats('SYS','T3');
PL/SQL 過程已成功完成。
SQL> select count(*) from t3;
COUNT(*)
----------
10000
SQL> select num_rows from user_tables where table_name='T3';
NUM_ROWS
----------
10000
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T3';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T3 .000111099 0 9001
SQL> select count(*) from t3 where a=1;
COUNT(*)
----------
1000
從上面資訊可以看到 ~~列A 有10000ROWS 其中A=1的 1000 ROWS, A=2 到9001 的有9000 ROW
這樣NUM_DISTINCT=9001
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
會話已更改。
SQL> select count(*) from t3 where a=1;
執行計劃
----------------------------------------------------------
Plan hash value: 463314188
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T3 | 1 | 3 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1)
可以看到CARD 非常不準確 還是按公式 num_rows*1/num_distinct算的
SQL> select 10000*1/9001 from dual;
10000*1/9001
------------
1.11098767
***********************
Table Stats::
Table: T3 Alias: T3
#Rows: 10000 #Blks: 23 AvgRowLen: 3.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 9001 Nulls: 0 Density: 1.1110e-004 Min: 1 Max: 9001
Table: T3 Alias: T3
Card: Original: 10000 Rounded: 1 Computed: 1.11 Non Adjusted: 1.11
Access Path: TableScan
Cost: 7.14 Resp: 7.14 Degree: 0
Cost_io: 7.00 Cost_cpu: 2163793
Resp_io: 7.00 Resp_cpu: 2163793
Best:: AccessPath: TableScan
Cost: 7.14 Degree: 1 Resp: 7.14 Card: 1.11 Bytes: 0~~~~~~~~~~~~~~~~~~看到trace 中card =1.11 ,實際應該為1000
我們可以用HISTOGREAM來解決
1 begin
2 dbms_stats.gather_table_stats(
3 'SYS',
4 't3',
5 cascade => true,
6 estimate_percent => null,
7 method_opt => 'for all columns size 120'
8 );
9* end;
SQL> /
PL/SQL 過程已成功完成。
使用HISTOGRAM收集統計資訊後 ORACLE 將使用DENSITY 來計算
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T3';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T3 .0001 0 9001
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
會話已更改。
SQL> select count(*) from t3 where a=1;
執行計劃
----------------------------------------------------------
Plan hash value: 463314188
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T3 | 917 | 2751 | 7 (0)| 00:00:01 |~~~~~~~~~~可以看到CARD接近了=917
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1)
SQL> select count(*) from t3 where a=2;
執行計劃
----------------------------------------------------------
Plan hash value: 463314188
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T3 | 1 | 3 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=2)
SQL>
上面可以看出 A=1時候CARD=917 ORACLE 並沒有使用density來計算, A=2時候用的DENSITY計算的 card=10000*0.0001=1
針對a=1 oracle 使用的 histogram bucket來計算的
SQL> select endpoint_number,endpoint_value from dba_histograms where table_name=
'T3' and rownum<5;
ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
11 1
12 9
13 93
14 177
可以看出histogram是一個 高度均衡的(HISTOGRAM分2種 第一中 頻率histogram 每個bucket存一個值 當列的histogram bucket >=列distinct時候為頻率 ,另一種高度均衡
HISTOGRAM
當列DISTINCT比 histogram bucket多 時為高度均衡 每個bucket存相同數量的值)
此例為高度均衡 可以看出來 1-11 BUCKET 存 值1,其實第12個BUCKET中也存值1, 因為12-13中可以看出每個BUCKET 存 84個值(93-9)(此例中除值1外 其他都唯一,所以很好看
出來),而 第12個 BUCKET 最高存的值為9 這樣表示 第12個BUCKET還存了 值1(應該是76 =84-8 ,8是2到9的個數) ,oracle 發現 值1 是一個高頻率出現的值(跨越的多個
桶),oracle將採用BUCKET計算11/120(跨越bucket數/總bucket數)=.091666667 從這裡看到了雖然BUCKET12中也有值1 但ORACLE沒算進來(因為該bucket中還有其它值2-8) 用
11/120 這也就造成了CARD只是接近而不是準確的1000
關於histogram 後面將會有與histogram詳細的實驗~~
card=10000*.091666667= 916.66667
看下trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 9001 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 9001
Histogram: HtBal #Bkts: 120 UncompBkts: 120 EndPtVals: 110
Table: T3 Alias: T3
Card: Original: 10000 Rounded: 917 Computed: 916.67 Non Adjusted: 916.67
Access Path: TableScan
Cost: 7.14 Resp: 7.14 Degree: 0
Cost_io: 7.00 Cost_cpu: 2163793
Resp_io: 7.00 Resp_cpu: 2163793
Best:: AccessPath: TableScan
Cost: 7.14 Degree: 1 Resp: 7.14 Card: 916.67 Bytes: 0 ************ Card: 916.67
***************************************
~列中有NULL的情況
SQL> create table t6 (a int);
表已建立。
1 declare
2 begin
3 for i in 1..1000 loop
4 insert into t6 values(1);
5 end loop;
6 for i in 2..9001 loop
7 insert into t6 values(i);
8 end loop;
9 commit;
10* end;
11 /
PL/SQL 過程已成功完成。
SQL> update t6 set a=null where a>8001;
已更新1000行。
SQL> commit;
提交完成。
SQL> execute dbms_stats.gather_table_stats('SYS','T6');
PL/SQL 過程已成功完成。
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T6';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T6 .000124984 1000 8001
SQL> select num_rows from user_tables where table_name='T6';
NUM_ROWS
----------
10000
可以看出工10000ROWS 空值為1000
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
會話已更改。
SQL> select count(*) from t6 where a=2;
執行計劃
----------------------------------------------------------
Plan hash value: 4096694858
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T6 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=2)
包含NULL的 選擇率按以下公式算
Adjusted selectivity = Base selectivity * (num_rows - num_nulls) / num_rows
=1/8001*((10000-1000)/10000)
card = NUM_ROWS*Adjusted selectivity=10000*(/8001*((10000-1000)/10000))
SQL> select 1/8001*((10000-1000)/10000) from dual;
1/8001*((10000-1000)/10000)
---------------------------
.000112486
SQL> select 10000*0.000112486 from dual;
10000*0.000112486
-----------------
1.12486
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 8001 Nulls: 1000 Density: 1.2498e-004 Min: 1 Max: 8001
Table: T6 Alias: T6
Card: Original: 10000 Rounded: 1 Computed: 1.12 Non Adjusted: 1.12
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142429
Resp_io: 6.00 Resp_cpu: 2142429
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 1.12 Bytes: 0====================card 1.12
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
關於使用BIND
SQL> set autotrace traceonly explain
SQL> set autotrace off
SQL> variable a number
SQL> set autotrace traceonly explain
SQL> select count(*) from t6 where a=:a;*********************
執行計劃
----------------------------------------------------------
Plan hash value: 4096694858
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T6 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=TO_NUMBER(:A))
對於BIND 所使用的 選擇率= COLUMN=字面值時候的選擇率(字面值情況也要分是否列中有NULL來計算,此例中 列中有NULL)
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T6';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T6 .000124984 1000 8001~~~~~~~~~~~~~~~~~包含NULL
SQL> select num_rows from user_tables where table_name='T6';
NUM_ROWS
----------
10000
Adjusted selectivity = Base selectivity * (num_rows - num_nulls) / num_rows~~~~~~~~~~~~~~~調整選擇率
=1/8001*((10000-1000)/10000)
BIN選擇率=Adjusted selectivity(是字面值時候 情況的選擇率)
card = NUM_ROWS*Adjusted selectivity=10000*(/8001*((10000-1000)/10000))
SQL> select 1/8001*((10000-1000)/10000) from dual;
1/8001*((10000-1000)/10000)
---------------------------
.000112486
SQL> select 10000*0.000112486 from dual;
10000*0.000112486
-----------------
1.12486
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 8001 Nulls: 1000 Density: 1.2498e-004 Min: 1 Max: 8001
Table: T6 Alias: T6
Card: Original: 10000 Rounded: 1 Computed: 1.12 Non Adjusted: 1.12
Access Path: TableScan
Cost: 6.21 Resp: 6.21 Degree: 0
Cost_io: 6.00 Cost_cpu: 3142429
Resp_io: 6.00 Resp_cpu: 3142429
Best:: AccessPath: TableScan
Cost: 6.21 Degree: 1 Resp: 6.21 Card: 1.12 Bytes: 0
***************************************
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
使用INLIST
SQL> create table t4 (a int);
表已建立。
SQL> declare
2 begin
3 for i in 1..10000 loop
4 insert into t4 values(i);
5 end loop;
6 commit;
7 end;
8 /
PL/SQL 過程已成功完成。
SQL> execute dbms_stats.gather_table_stats('SYS','T4');
PL/SQL 過程已成功完成。
SQL> select num_rows from user_tables where table_name='T4';
NUM_ROWS
----------
10000
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T4';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T4 .0001 0 10000
SQL> col table_name format a10
SQL> select table_name,density,num_nulls,num_distinct,HIGH_VALUE from user_tab_c
ol_statistics where table_name='T4';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT HIGH_VALUE
---------- ---------- ---------- ------------ --------------------
T4 .0001 0 10000 C302
SQL> variable a number
SQL> variable b number
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
會話已更改。
SQL> select count(*) from t4 where a in (1,2);~~~~~~~~~2個
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=2)
SQL> select count(*) from t4 where a in (1,2,2);~~~~~~~~~~~~含重複值得
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=2)
SQL> select count(*) from t4 where a in (1,2,3);~~~~~~~~~~3個
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 3 | 9 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=2 OR "A"=3)
SQL> select count(*) from t4 where a in (1,2,2,null);~~~~~~~包含空值
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 3 | 9 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=2 OR "A"=TO_NUMBER(NULL))
SQL> select count(*) from t4 where a in (1,100052);~~~~~~~~~超過 該列的最大值(HIGHT_VALUE)
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=100052)
SQL> select count(*) from t4 where a in (:a,:b);~~~~~~~~~~使用變數
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=TO_NUMBER(:A) OR "A"=TO_NUMBER(:B))
Moreover, when you check the values for 9i and 10g, you see
that the cardinalities always come out as N * number of entries in the list until the number of
entries in the list exceeds the number of distinct values
結論:關於inlist card= n*(num_rows*1/num_distinct) (N 為中 inlist (數目 ))
oracle可以判斷出重複值,~~檢查不出是否超出最大值,是否含NULL
另外INLIST (N,N)中N的數目超過列的Num_distinct資料那麼 card=num_distinct*(num_rows*1/num_distinct)=num_rows
早期8I的 IN LIST 計算 是按OR算的
in (1,2,3)= 1 or 2 or 3 8I 是這麼算的 選擇率
sel(A or B or C) =
sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C)
8I INLIST CARD=NUM_ROWS*1/(sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C))****
~~~~~~~~~~~~~~~~~~~~~~~NOT IN
SQL> select count(*) from t4 where a in (1,2,2);用HINTS USE_CONCAT那麼可以將 OR操作轉換為 UNION ALL 操作(3個SELECT FROM X WHERE =1 union all=2 union all=3)
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"=1 OR "A"=2)
SQL> select count(*) from t4 where a not in (1,2);
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 9998 | 29994 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<>1 AND "A"<>2)
SQL> select count(*) from t4 where a not in (1,2,2);
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 9998 | 29994 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<>1 AND "A"<>2)
SQL> select count(*) from t4 where a not in (1,2,3);
執行計劃
----------------------------------------------------------
Plan hash value: 405148644
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T4 | 9997 | 29991 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<>1 AND "A"<>2 AND "A"<>3)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~可以看出是自相容的 card=NUM_ROWS-n*(num_rows*1/num_distinct)
關於like*************
SQL> set autotrace traceonly explain
SQL> select count(*) from t6 where a like '%1';
執行計劃
----------------------------------------------------------
Plan hash value: 4096694858
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T6 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("A") LIKE '%1')
SQL> select count(*) from t6 where a like '1%';
執行計劃
----------------------------------------------------------
Plan hash value: 4096694858
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T6 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("A") LIKE '1%')
SQL> select count(*) from t6 where a like '1%';
執行計劃
----------------------------------------------------------
Plan hash value: 4096694858
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------ss----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T6 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("A") LIKE '1%')
SQL> set autotrace off
SQL> select count(*) from t6 where a like '1%';
COUNT(*)
----------
2110
SQL>
選擇率 應該是5% CARD=NUM_ROW*5%=500
另外 LIKE %a 是不走INDEX 的,而 like a% 是走index的 針對 LIKE%A 不 走 , 可以建一個reverse(X) 的函式index 查詢時候 使用reverse(XX) like '%a' 可以走這個函式index
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/12020513/viewspace-614301/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- oracle實驗記錄 (oracle單表選擇率與基數計算(2))Oracle
- oracle實驗記錄 (oracle單表選擇率與基數計算(3))Oracle
- oracle實驗記錄 (選擇率)Oracle
- oracle實驗記錄 (FTS的cost與基數計算)Oracle
- oracle實驗記錄 (連線選擇率,範圍與null)OracleNull
- oracle實驗記錄 (分割槽表,掃描基數的計算)Oracle
- oracle計算表的記錄數Oracle
- oracle實驗記錄 (oracle b*tree index訪問Cost計算(1))OracleIndex
- oracle實驗記錄 (oracle consistent gets 計算)Oracle
- 【cbo計算公式】單表選擇率(二)公式
- oracle實驗記錄 (全表掃描COST計算方法)Oracle
- oracle實驗記錄 (計算hash join cost)Oracle
- oracle實驗記錄 (storage儲存引數(1))Oracle
- oracle實驗記錄 (基礎,truncate與delete區別實驗)Oracledelete
- oracle實驗記錄 (快取記憶體命中率與成本)Oracle快取記憶體
- oracle實驗記錄 (oracle 分析shared pool(1))Oracle
- oracle實驗記錄 (oracle 詳細分析redo(1))Oracle
- oracle實驗記錄 手工 duplicate database(1)OracleDatabase
- oracle實驗記錄Rman duplicate database(1)OracleDatabase
- oracle實驗記錄(INDEX fast full scan 的成本計算)OracleIndexAST
- oracle實驗記錄 (關於表實際大小)Oracle
- oracle實驗記錄 (線上重定義表)Oracle
- oracle實驗記錄 (buffer_cache分析(1))Oracle
- oracle實驗記錄 (oracle reset parameter)Oracle
- oracle實驗記錄 (oracle 10G dataguard(1)手工搭建)Oracle
- 選擇率(selectivity)與基數(cardinality)
- oracle實驗記錄 (database_properties與表空間屬性)OracleDatabase
- oracle實驗記錄 (logfile基礎操作)Oracle
- oracle實驗記錄 (flashback)Oracle
- oracle實驗記錄 (OMF)Oracle
- oracle實驗記錄 (NET)Oracle
- oracle實驗記錄 (audit)Oracle
- oracle實驗記錄 (子游標與解析)Oracle
- oracle實驗記錄 (oracle 資料字典)Oracle
- oracle實驗記錄 (恢復-rman維護(1))Oracle
- oracle實驗記錄 (oracle 關於instance引數設定)Oracle
- oracle實驗記錄 (oracle 10G dataguard(2)引數部分)Oracle
- CBO中基數(cardinality)、可選擇率(selectivity)的計算公式公式