Index Range Scan成本 Histogram 和 10053

foreverlee發表於2006-06-30

原貼發表在ITPUB

http://www.itpub.net/581612.html

試驗環境:
Linux AS4 + 10.2.0.1.0 - 64bit Production


為了模擬資料庫物件佔用多個block,故設定較高pctfree.

SQL> create table irs pctfree 80 tablespace users as
2 select rownum id,a.* from all_objects a;

Table created.

SQL> alter table irs add constraint irs_id_pk primary key(id);

Table altered.

SQL> create index irs_owner_idx on irs(owner) tablespace indx pctfree 80;

Index created.

SQL> show parameter optimizer_index

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100

獲取表和索引的統計資訊
SQL> analyze table irs compute statistics for table for all indexes;

Table analyzed.

獲取列的histogram資訊
SQL> analyze table irs compute statistics for all indexed columns;

Table analyzed.

[@more@]

SQL> select owner,count(*) from irs
2 group by owner;

OWNER COUNT(*)
------------------------------ ----------
MDSYS 859
DMSYS 189
TSMSYS 2
CTXSYS 338
OLAPSYS 718
OUTLN 7
PUBLIC 20079
EXFSYS 279
SCOTT 6
SYSTEM 425
DBSNMP 46

OWNER COUNT(*)
------------------------------ ----------
LIYONG 2
ORDPLUGINS 10
ORDSYS 1669
SYSMAN 1291
XDB 346
CWT 310
PERFSTAT 139
SI_INFORMTN_SCHEMA 8
SYS 23134
WMSYS 232

21 rows selected.

SQL> select NUM_ROWS,BLOCKS,EMPTY_BLOCKS from dba_tables
2 where table_name='IRS';

NUM_ROWS BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
50089 3423 33

SQL> select BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS,CLUSTERING_FACTOR from dba_indexes
2 where INDEX_NAME='IRS_OWNER_IDX';

BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
---------- ----------- ------------- -----------------
2 570 21 5890

SQL> select DENSITY,NUM_DISTINCT,HISTOGRAM from dba_tab_columns
2 where OWNER='LIYONG'
3 and TABLE_NAME='IRS'
4 and COLUMN_NAME='OWNER';

DENSITY NUM_DISTINCT HISTOGRAM
---------- ------------ ---------------
9.9822E-06 21 FREQUENCY

SQL> alter session set events '10053 trace name context forever ,level 2';

Session altered.

SQL> select * from irs where owner='SYSMAN';

1291 rows selected.

SQL> alter session set events '10053 trace name context off';

10053內容擷取
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: IRS Alias: IRS
#Rows: 50089 #Blks: 3423 AvgRowLen: 102.00
Index Stats::
Index: IRS_ID_PK Col#: 1
LVLS: 1 #LB: 104 #DK: 50089 LB/K: 1.00 DB/K: 1.00 CLUF: 3361.00
Index: IRS_OWNER_IDX Col#: 2
LVLS: 2 #LB: 570 #DK: 21 LB/K: 27.00 DB/K: 280.00 CLUF: 5890.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#2): OWNER(VARCHAR2)
AvgLen: 5.00 NDV: 21 Nulls: 0 Density: 9.9822e-06
這裡看到OWNER列的Density: 9.9822e-06
Histogram: Freq #Bkts: 21 UncompBkts: 50089 EndPtVals: 21
Table: IRS Alias: IRS
Card: Original: 50089 Rounded: 1291 Computed: 1291.00 Non Adjusted: 1291.00 --這裡可以看到CBO根據列的histogram資訊統計出Card集的相關資訊,
SQL> select 1291/50089 from dual;

1291/50089
----------
.025774122
這樣可以準確算出effective index selectivity為0.025774122

Access Path: TableScan --全表掃描的代價為750
Cost: 754.17 Resp: 754.17 Degree: 0
Cost_io: 750.00 Cost_cpu: 35706109
Resp_io: 750.00 Resp_cpu: 35706109
Access Path: index (AllEqRange)
Index: IRS_OWNER_IDX
resc_io: 169.00 resc_cpu: 1991033
ix_sel: 0.025774 ix_sel_with_filters: 0.025774
Cost: 169.23 Resp: 169.23 Degree: 1
Best:: AccessPath: IndexRange Index: IRS_OWNER_IDX
Cost: 169.23 Degree: 1 Resp: 169.23 Card: 1291.00 Bytes: 0

根據索引掃描成本計算公式
cost = (cost(INDEX RANGE SCAN)+cost(TABLE ACCESS BY INDEX ROWID)) * optimizer_index_cost_adj%
= blevel +ceiling(leaf_blocks * effective index selectivity) + -- cost(INDEX RANGE SCAN)
ceiling(clustering_factor * effective table selectivity) --cost(TABLE ACCESS BY INDEX ROWID)
= 2 + ceil(570*0.025774) + ceil(5890*0.025774)
= 2 + 15 (17) -- cost(INDEX RANGE SCAN)
+ 152 --cost(TABLE ACCESS BY INDEX ROWID)
= 169

----------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 169 | |
| 1 | TABLE ACCESS BY INDEX ROWID | IRS | 1291 | 129K | 169 | 00:00:03 |
| 2 | INDEX RANGE SCAN | IRS_OWNER_IDX| 1291 | | 17 | 00:00:01 |
----------------------------------------------------+-----------------------------------+
解釋一下整個sql執行過程:
1 首先根據謂詞 table_name='IRS'做INDEX RANGE SCAN,找到1291個Index Entrys,整個這個過程的代價為17.
2 Oracle根據Index Entrys中Rowid掃描原表中的blocks,獲取到1291條記錄. 整個TABLE ACCESS BY INDEX ROWID的代價為169-17=152.
3 TABLE ACCESS BY INDEX ROWID過程中,表的blocks被Server process讀入到SGA的DB Cache中,讀入的資料量近似的等於129K.(Oracle讀取一行記錄會讀整個block;命中率等因素)

SQL> select AVG_ROW_LEN,BLOCKS from dba_tables
2 where table_name='IRS';

AVG_ROW_LEN BLOCKS
----------- ----------
102 3423

SQL> select 102*1291/1024 from dual;

102*1291/1024
-------------
128.595703

關於索引的索引聚簇因子CLUF演算法下次再作探討.

索引掃描成本還和另一個引數有很大關係.先前eygle老大已經發表過專門的帖子.

我這裡拿來主義.

SQL> alter session set optimizer_index_cost_adj=40;

Session altered.

SQL> set autotrace traceonly

SQL> set linesize 150

SQL> select * from irs where owner='SYSMAN';

1291 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4071038474

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1291 | 128K| 68 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| IRS | 1291 | 128K| 68 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IRS_OWNER_IDX | 1291 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
可以看到 Cost=68=ceil(169*0.4)


附:
在沒有列的histogram情況下的成本計算,我們看到偏差很大.CBO無法準確獲取到Card集的資訊,所以Oracle只能估算.


10053內容擷取
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: IRS Alias: IRS
#Rows: 50089 #Blks: 3423 AvgRowLen: 102.00
Index Stats::
Index: IRS_ID_PK Col#: 1
LVLS: 1 #LB: 104 #DK: 50089 LB/K: 1.00 DB/K: 1.00 CLUF: 3361.00
Index: IRS_OWNER_IDX Col#: 2
LVLS: 2 #LB: 570 (索引LEAF_BLOCKS個數) #DK: 21 (索引distinct key個數) LB/K: 27.00 DB/K: 280.00 CLUF: 5890.00 (索引聚簇因子,稍後會有詳細介紹)
***************************************
SINGLE TABLE ACCESS PATH
Column (#2): OWNER(VARCHAR2) NO STATISTICS (using defaults)
AvgLen: 30.00 NDV: 1565 Nulls: 0 Density: 6.3886e-04
Table: IRS Alias: IRS
Card: Original: 50089 Rounded: 501 Computed: 500.89 Non Adjusted: 500.89
Access Path: TableScan
Cost: 754.15 Resp: 754.15 Degree: 0
Cost_io: 750.00 Cost_cpu: 35516509
Resp_io: 750.00 Resp_cpu: 35516509
Access Path: index (AllEqGuess) -- 我們注意這裡索引的訪問路徑為index (AllEqGuess),我猜想是由於沒有OWNER列的Histogram,Oracle無法計算OWNER列的selectivity造成的
Index: IRS_OWNER_IDX
resc_io: 53.00 resc_cpu: 500046
ix_sel: 0.004 ix_sel_with_filters: 0.004
Cost: 53.06 Resp: 53.06 Degree: 1
Best:: AccessPath: IndexRange Index: IRS_OWNER_IDX
Cost: 53.06 Degree: 1 Resp: 53.06 Card: 500.89 Bytes: 0
根據10053 Oracle評估出:
cost(TableScan)=750
cost(index)=53.06
所以選擇索引掃描為執行計劃.
----------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 53 | |
| 1 | TABLE ACCESS BY INDEX ROWID | IRS | 501 | 50K | 53 | 00:00:01 |
| 2 | INDEX RANGE SCAN | IRS_OWNER_IDX| 200 | | 29 | 00:00:01 |
----------------------------------------------------+-----------------------------------+

根據索引掃描成本計算公式
cost = (cost(INDEX RANGE SCAN)+cost(TABLE ACCESS BY INDEX ROWID)) * optimizer_index_cost_adj%
= blevel +ceiling(leaf_blocks * effective index selectivity) + -- cost(INDEX RANGE SCAN)
ceiling(clustering_factor * effective table selectivity) --cost(TABLE ACCESS BY INDEX ROWID)
= 2 + ceil(570*0.004) + ceil(5890*0.004)
= 2 + 3 + 24
= 29

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/76065/viewspace-846417/,如需轉載,請註明出處,否則將追究法律責任。

相關文章