【效能最佳化】Oracle 效能最佳化:降低列值聚簇因子 提高查詢效率

海星星hktk發表於2014-09-04

降低列值聚簇因子 提高查詢效率

oracle 在 dba_indexes 檢視中提供一個名為 clustering_factor 的列,通知最佳化器關於表的行與索引的同步情況。當集簇因子接近資料塊的數量時,表的行與索引同步,即列值相同的資料行存放得比較集中,聚集度高。 列值的選擇性、db_block_size、avg_row_len 以及集合基數全都協同工作,幫助最佳化器決定是使用索引還是使用全表掃描。如果資料列具有高度的選擇性和低的 clustering_factor,則索引掃描通常是最快的執行方法。即使列具有高度的選擇性,高 clustering_factor 和小 avg_row_len 也會表示列值在表中隨機分佈,而獲取這些行需要額外的 I/O。在此情況下,索引範圍掃描會導致大量不必要的 I/O;全表掃描則會高效得多。

【實驗環境】

作業系統:RHEL 5.5

資料庫:Oracle 10.2.0



【實驗過程

1、環境部署

1.1、建立表、新增索引

SCOTT@ prod>create table test as select * from emp;

SCOTT@ prod>create index ind_test_empno on test (empno);

1.2、分析表

 SCOTT@ prod>analyze table test compute statistics;

1.3、檢視列值的集簇因子:

 SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,

i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name

and t.owner='SCOTT' and t.table_name='TEST';

 

TABLE_NAME   NUM_ROWS  BLOCKS AVG_ROW_LEN INDEX_NAME      CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST               14      4          40 IND_TEST_EMPNO                 1

 

NUM_ROWS 總行數14

AVG_ROW_LEN 平均每行長度40 bytes

14*40=560,一個塊中14行佔了560 bytes位元組,一個塊的大小是8192bytes(8K)。

560/8192=0.0683,560個位元組約佔了一個塊的7%

1.4、使用索引列進行查詢

SCOTT@ prod>set autotrace traceonly

SCOTT@ prod>select * from test where empno=7788;

Execution Plan

----------------------------------------------------------

Plan hash value: 4043037449

--------------------------------------------------------------------------------------------

| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                |     1 |    32 |     2   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST           |     1 |    32 |     2   (0)| 00:00:01 |

|*  2 |  INDEX RANGE SCAN           | IND_TEST_EMPNO |     1 |       |     1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("EMPNO"=7788)

Statistics

----------------------------------------------------------

          0  recursive calls

          0  db block gets

          0  consistent gets

          0  physical reads

          0  redo size

          0  bytes sent via SQL*Net to client

          0  bytes received via SQL*Net from client

          0  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          1  rows processed

 

1.5、修改pctfree值,改為93,插入更多值

SCOTT@ prod>alter table test pctfree 93;

SCOTT@ prod>insert into test select * from emp;

SCOTT@ prod>/

SCOTT@ prod>/

SCOTT@ prod>/

SCOTT@ prod>/

SCOTT@ prod>/

SCOTT@ prod>/

SCOTT@ prod>commit;

 

 

1.6、分析表、檢視執行計劃

SCOTT@ prod>analyze table test estimate statistics;

SCOTT@ prod>select * from test where empno=7788;

8 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1357081020

--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      |     8 |   256 |     5   (0)| 00:00:01 |

|*  1 | TABLE ACCESS FULL | TEST |     8 |   256 |     5   (0)| 00:00:01 |

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - filter("EMPNO"=7788)

Statistics

----------------------------------------------------------

          0  recursive calls

          0  db block gets

          0  consistent gets

          0  physical reads

          0  redo size

          0  bytes sent via SQL*Net to client

          0  bytes received via SQL*Net from client

          0  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          8  rows processed

 

1.7、檢視集簇因子

SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,

i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i  where t.table_name=i.table_name

 and t.owner='SCOTT' and t.table_name='TEST';

 

TABLE_NAME   NUM_ROWS  BLOCKS AVG_ROW_LEN INDEX_NAME      CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST             112      16          40 IND_TEST_EMPNO               112

 

CLUSTERING_FACTOR的值和NUM_ROWS的值接近,列值相同的行比較分散,走全表掃描。

可以對emp3重新排序,把列值相同的行變的集中,降低集簇因子的值。

2、改變集簇因子

2.1、建立中間表存放資料

SCOTT@ prod>create table test_tmp as select * from test;

 

2.2、truncate原表

SCOTT@ prod>truncate table test;

2.3、按順序重新插入資料

SCOTT@ prod>insert into test select * from test_tmp order by empno;

SCOTT@ prod>commit;

SCOTT@ prod>select * from test where rownum < 10;

 EMPNO ENAME JOB         MGR HIREDATE                SAL   COMM  DEPTNO

------ ----- --------- ----- ------------------- ------- ------ -------

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7369 SMITH CLERK      7902 1980-12-17 00:00:00     800             20

  7499 ALLEN SALESMAN   7698 1981-02-20 00:00:00    1600    300      30

9 rows selected.

 

2.4、重新分析表、檢視執行計劃

SCOTT@ prod>analyze table test estimate statistics;

SCOTT@ prod>select * from test where empno=7788;

8 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 4043037449

--------------------------------------------------------------------------------------------

| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                |     8 |   256 |     2   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST           |     8 |   256 |     2   (0)| 00:00:01 |

|*  2 |  INDEX RANGE SCAN         | IND_TEST_EMPNO |     8 |       |     1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("EMPNO"=7788)

Statistics

----------------------------------------------------------

          0  recursive calls

          0  db block gets

          0  consistent gets

          0  physical reads

          0  redo size

          0  bytes sent via SQL*Net to client

          0  bytes received via SQL*Net from client

          0  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          8  rows processed

這次執行計劃選擇了索引

2.5、檢視集簇因子的改變

SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name, 
i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i  where t.table_name=i.table_name and t.owner='SCOTT' and t.table_name='TEST';

TABLE_NAME   NUM_ROWS  BLOCKS AVG_ROW_LEN INDEX_NAME      CLUSTERING_FACTOR

---------- ---------- ------- ----------- --------------- -----------------

TEST              112     13          40 IND_TEST_EMPNO                13

 


【實驗總結】


1、如果CLUSTERING_FACTOR的值和BLOCKS的值比較接近,則列值相同的行比較集中,則執行計劃應該走索引。

2、如果CLUSTERING_FACTOR的值和NUM_ROWS的值比較接近,則列值相同的行比較分散,這樣最佳化器會選擇走全表掃描。如果想要走索引,需要對原表進行重排序,降低集簇因子的值。



呂星昊
2014.9.4 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29475508/viewspace-1264262/,如需轉載,請註明出處,否則將追究法律責任。

相關文章