降低列值聚簇因子提高查詢效率

oracle 在 dba_indexes 檢視中提供一個名為 clustering_factor 的列，通知最佳化器關於表的行與索引的同步情況。當集簇因子接近資料塊的數量時，表的行與索引同步,即列值相同的資料行存放得比較集中，聚集度高。列值的選擇性、db_block_size、avg_row_len 以及集合基數全都協同工作，幫助最佳化器決定是使用索引還是使用全表掃描。如果資料列具有高度的選擇性和低的 clustering_factor，則索引掃描通常是最快的執行方法。即使列具有高度的選擇性，高 clustering_factor 和小 avg_row_len 也會表示列值在表中隨機分佈，而獲取這些行需要額外的 I/O。在此情況下，索引範圍掃描會導致大量不必要的 I/O;全表掃描則會高效得多。

【實驗環境】

作業系統：RHEL 5.5

資料庫：Oracle 10.2.0

【實驗過程】

1、環境部署

1.1、建立表、新增索引

SCOTT@ prod>create table test as select * from emp;

SCOTT@ prod>create index ind_test_empno on test (empno);

1.2、分析表

SCOTT@ prod>analyze table test compute statistics;

1.3、檢視列值的集簇因子：

SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,

i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name

and t.owner='SCOTT' and t.table_name='TEST';

TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST 14 4 40 IND_TEST_EMPNO 1

NUM_ROWS 總行數14

AVG_ROW_LEN 平均每行長度40 bytes

14*40=560，一個塊中14行佔了560 bytes位元組，一個塊的大小是8192bytes（8K）。

560/8192=0.0683，560個位元組約佔了一個塊的7%

1.4、使用索引列進行查詢

SCOTT@ prod>set autotrace traceonly

SCOTT@ prod>select * from test where empno=7788;

Execution Plan

----------------------------------------------------------

Plan hash value: 4043037449

--------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 32 | 2 (0)| 00:00:01 |

| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 32 | 2 (0)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | IND_TEST_EMPNO | 1 | | 1 (0)| 00:00:01 |

--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("EMPNO"=7788)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

0 consistent gets

0 physical reads

0 redo size

0 bytes sent via SQL*Net to client

0 bytes received via SQL*Net from client

0 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1 rows processed

1.5、修改pctfree值，改為93，插入更多值

SCOTT@ prod>alter table test pctfree 93;

SCOTT@ prod>insert into test select * from emp;

SCOTT@ prod>/

SCOTT@ prod>commit;

1.6、分析表、檢視執行計劃

SCOTT@ prod>analyze table test estimate statistics;

SCOTT@ prod>select * from test where empno=7788;

8 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 8 | 256 | 5 (0)| 00:00:01 |

|* 1 | TABLE ACCESS FULL | TEST | 8 | 256 | 5 (0)| 00:00:01 |

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("EMPNO"=7788)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

0 consistent gets

0 physical reads

0 redo size

0 bytes sent via SQL*Net to client

0 bytes received via SQL*Net from client

0 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

8 rows processed

1.7、檢視集簇因子

SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,

i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name

and t.owner='SCOTT' and t.table_name='TEST';

TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST 112 16 40 IND_TEST_EMPNO 112

CLUSTERING_FACTOR的值和NUM_ROWS的值接近，列值相同的行比較分散，走全表掃描。

可以對emp3重新排序，把列值相同的行變的集中，降低集簇因子的值。

2、改變集簇因子

2.1、建立中間表存放資料

SCOTT@ prod>create table test_tmp as select * from test;

2.2、truncate原表

SCOTT@ prod>truncate table test;

2.3、按順序重新插入資料

SCOTT@ prod>insert into test select * from test_tmp order by empno;

SCOTT@ prod>commit;

SCOTT@ prod>select * from test where rownum < 10;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

------ ----- --------- ----- ------------------- ------- ------ -------

7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20

7499 ALLEN SALESMAN 7698 1981-02-20 00:00:00 1600 300 30

9 rows selected.

2.4、重新分析表、檢視執行計劃

SCOTT@ prod>analyze table test estimate statistics;

SCOTT@ prod>select * from test where empno=7788;

8 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 4043037449

--------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 8 | 256 | 2 (0)| 00:00:01 |

| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 8 | 256 | 2 (0)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | IND_TEST_EMPNO | 8 | | 1 (0)| 00:00:01 |

--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("EMPNO"=7788)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

0 consistent gets

0 physical reads

0 redo size

0 bytes sent via SQL*Net to client

0 bytes received via SQL*Net from client

0 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

8 rows processed

這次執行計劃選擇了索引

2.5、檢視集簇因子的改變

SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,
i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name and t.owner='SCOTT' and t.table_name='TEST';

TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST 112 13 40 IND_TEST_EMPNO 13

【實驗總結】

1、如果CLUSTERING_FACTOR的值和BLOCKS的值比較接近，則列值相同的行比較集中，則執行計劃應該走索引。

2、如果CLUSTERING_FACTOR的值和NUM_ROWS的值比較接近，則列值相同的行比較分散，這樣最佳化器會選擇走全表掃描。如果想要走索引，需要對原表進行重排序，降低集簇因子的值。

呂星昊
2014.9.4

【效能最佳化】Oracle 效能最佳化：降低列值聚簇因子提高查詢效率

【實驗環境】

作業系統：RHEL 5.5

資料庫：Oracle 10.2.0

【實驗環境】

【實驗過程】

1、環境部署

1.1、建立表、新增索引

1.2、分析表

1.3、檢視列值的集簇因子：

1.4、使用索引列進行查詢

1.5、修改pctfree值，改為93，插入更多值

1.6、分析表、檢視執行計劃

1.7、檢視集簇因子

2、改變集簇因子

2.1、建立中間表存放資料

2.2、truncate原表

2.3、按順序重新插入資料

2.4、重新分析表、檢視執行計劃

2.5、檢視集簇因子的改變

【實驗總結】

相關文章

【效能最佳化】Oracle 效能最佳化：降低列值聚簇因子 提高查詢效率

【實驗環境】 作業系統：RHEL 5.5 資料庫：Oracle 10.2.0

【實驗環境】

【實驗過程】

1、環境部署

1.1、建立表、新增索引

1.2、分析表

1.3、檢視列值的集簇因子：

1.4、使用索引列進行查詢

1.5、修改pctfree值，改為93，插入更多值

1.6、分析表、檢視執行計劃

1.7、檢視集簇因子

2、改變集簇因子

2.1、建立中間表存放資料

2.2、truncate原表

2.3、按順序重新插入資料

2.4、重新分析表、檢視執行計劃

2.5、檢視集簇因子的改變

【實驗總結】

相關文章

【效能最佳化】Oracle 效能最佳化：降低列值聚簇因子提高查詢效率

【實驗環境】

作業系統：RHEL 5.5

資料庫：Oracle 10.2.0