[Oracle] Clustering Factor
1.Clustering Factor is a measure of the orderedness of an index in comparison to the table that is based upon
It is used as an indicator for computing the estimated cost of the table lookup following an index access.
2.The Clustering Factor records the number of data blocks that will be accessed when scanning an index.
3.Clustering Factor only happen on Physical reads.
SQL> create table tt5(id1 int,id2 int);
Table created.
SQL> create index tt5_i1 on tt5(id1);
Index created.
SQL> create index tt5_i2 on tt5(id2);
Index created.
SQL> insert into tt5 select level,trunc(dbms_random.value(0,10000)) from dual connect by level<=10000;
10000 rows created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats('scott','tt5');
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,blocks,sample_size from user_tables where table_name='TT5';
TABLE_NAME NUM_ROWS BLOCKS SAMPLE_SIZE
-------------------- ---------- ---------- -----------
TT5 10000 20 10000
SQL> select column_name,num_distinct,num_nulls,density from user_tab_columns where table_name='TT5';
COLUMN_NAME NUM_DISTINCT NUM_NULLS DENSITY
-------------------- ------------ ---------- ----------
ID1 10000 0 .0001
ID2 6262 0 .000159693
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TT5';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TT5_I1 1 19 18
TT5_I2 1 30 9419
index tt5_i1's clustering factor is close tt5 table's blocks number.This is good.
index tt5_i2's clustering factor is far than tt5 table's blocks number.This is not good.
test:
sql1
SQL> explain plan for select * from tt5 where id1 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3764321786
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 700 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 100 | 700 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TT5_I1 | 100 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID1">=1 AND "ID1"<=100)
14 rows selected.
sql2
SQL> explain plan for select * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 3997547206
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 6 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TT5 | 101 | 707 | 6 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID2"<=100 AND "ID2">=1)
13 rows selected.
可以發現第兩條sql的cost比第一條的.而且sql2的選擇的行數比例為101/10000 約等於 1%,確沒有走index,比較奇怪。
看一下強制走index,cost是多少.
SQL> explain plan for select /*+index(tt5)*/ * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 2796517782
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 97 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 101 | 707 | 97 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | TT5_I2 | 101 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID2">=1 AND "ID2"<=100)
14 rows selected.
可以發現強制走index的時候cost為97,cost更高。
clustering factor的解決辦法只有一個就是rebuild table
It is used as an indicator for computing the estimated cost of the table lookup following an index access.
2.The Clustering Factor records the number of data blocks that will be accessed when scanning an index.
3.Clustering Factor only happen on Physical reads.
SQL> create table tt5(id1 int,id2 int);
Table created.
SQL> create index tt5_i1 on tt5(id1);
Index created.
SQL> create index tt5_i2 on tt5(id2);
Index created.
SQL> insert into tt5 select level,trunc(dbms_random.value(0,10000)) from dual connect by level<=10000;
10000 rows created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats('scott','tt5');
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,blocks,sample_size from user_tables where table_name='TT5';
TABLE_NAME NUM_ROWS BLOCKS SAMPLE_SIZE
-------------------- ---------- ---------- -----------
TT5 10000 20 10000
SQL> select column_name,num_distinct,num_nulls,density from user_tab_columns where table_name='TT5';
COLUMN_NAME NUM_DISTINCT NUM_NULLS DENSITY
-------------------- ------------ ---------- ----------
ID1 10000 0 .0001
ID2 6262 0 .000159693
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TT5';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TT5_I1 1 19 18
TT5_I2 1 30 9419
index tt5_i1's clustering factor is close tt5 table's blocks number.This is good.
index tt5_i2's clustering factor is far than tt5 table's blocks number.This is not good.
test:
sql1
SQL> explain plan for select * from tt5 where id1 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3764321786
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 700 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 100 | 700 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TT5_I1 | 100 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID1">=1 AND "ID1"<=100)
14 rows selected.
sql2
SQL> explain plan for select * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 3997547206
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 6 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TT5 | 101 | 707 | 6 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID2"<=100 AND "ID2">=1)
13 rows selected.
可以發現第兩條sql的cost比第一條的.而且sql2的選擇的行數比例為101/10000 約等於 1%,確沒有走index,比較奇怪。
看一下強制走index,cost是多少.
SQL> explain plan for select /*+index(tt5)*/ * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 2796517782
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 97 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 101 | 707 | 97 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | TT5_I2 | 101 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID2">=1 AND "ID2"<=100)
14 rows selected.
可以發現強制走index的時候cost為97,cost更高。
clustering factor的解決辦法只有一個就是rebuild table
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/24237320/viewspace-2063649/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- oracle的clustering_factorOracle
- Clustering Factor——索引的成本指標索引指標
- 高效的SQL( clustering factor減少COST)SQL
- CLUSTERING_FACTOR影響執行計劃
- DBA_INDEXES.CLUSTERING_FACTOR 索引的群集因子初探Index索引
- 淺談索引系列之聚簇因子(clustering_factor)索引
- DBA_INDEXES.CLUSTERING_FACTOR 索引的群集因子初探(原)Index索引
- clustering factor索引聚簇因子和執行計劃索引
- 11g 是否有解決clustering_factor 高的問題
- 關於索引聚簇因子(CLUSTERING_FACTOR)引起的sql優化過程索引SQL優化
- CLUSTERING_FACTOR、回表、資料分佈傾斜走全表還是索引索引
- 索引掃描可能不如全表掃描的場景的理解__純粹資料量而言,不涉及CLUSTERING_FACTOR索引
- Crest Factor Reduction(CFR)REST
- 密度聚類。Clustering by fast search and聚類AST
- hudi clustering 資料聚集(一)
- hudi clustering 資料聚集(二)
- Agglomerative Hierarchical Clustering詳解
- cluster factor對執行計劃的影響
- LeetCode-Factor CombinationsLeetCode
- hudi clustering 資料聚集(三 zorder使用)
- 譜聚類(spectral clustering)原理總結聚類
- 關於 estd_physical_read_factor 的翻譯
- 《推薦系統實踐》關於Latent Factor Model
- 各類聚類(clustering)演算法初探聚類演算法
- 論文解讀(GCC)《Graph Contrastive Clustering》GCAST
- 論文解讀SDCN《Structural Deep Clustering Network》Struct
- 論文解讀(DFCN)《Deep Fusion Clustering Network》
- Clustering and Projected Clustering with Adaptive Neighbors(自適應鄰域聚類CAN和自適應鄰域投影聚類PCAN)ProjectAPT聚類PCA
- 論文解讀(SCGC)《Simple Contrastive Graph Clustering》GCAST
- 文獻閱讀 — Clustering by Fast Search and Find of Density PeaksAST
- 超休閒遊戲的LTV,K-factor與綜合出價模型遊戲模型
- 論文解讀(DCRN)《Deep Graph Clustering via Dual Correlation Reduction》
- 論文解讀DEC《Unsupervised Deep Embedding for Clustering Analysis》
- 論文解讀(DAEGC)《Improved Deep Embedded Clustering with Local Structure Preservation》GCStruct
- 論文解讀(AGCN)《 Attention-driven Graph Clustering Network》GC
- 一文徹底掌握Apache Hudi非同步Clustering部署Apache非同步
- Spring Boot微服務中的十二因子方法論(12Factor) - BaeldungSpring Boot微服務
- 12-factor應用和微服務架構應用的區別微服務架構