cluster factor對執行計劃的影響
cluster factor對執行計劃的影響
測試環境:Linux 7.6 + ORACLE 19.6.1
1.建立測試環境
1.1 建立測試表並插入資料
CZH@czhpdb > create table test_ffs as select * from hr.employees; Table created. CZH@czhpdb > insert into test_ffs select * from test_ffs; Execution Plan ---------------------------------------------------------- Plan hash value: 296244252 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 107 | 7383 | 3 (0)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | TEST_FFS | | | | | | 2 | OPTIMIZER STATISTICS GATHERING | | 107 | 7383 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | TEST_FFS | 107 | 7383 | 3 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Note ----- - dynamic statistics used: statistics for conventional DML Statistics ---------------------------------------------------------- 72 recursive calls 89 db block gets 81 consistent gets 12 physical reads 21576 redo size 195 bytes sent via SQL*Net to client 394 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 3 sorts (memory) 0 sorts (disk) 107 rows processed
上面autotrace執行計劃可以看到兩個新特性:
1.2 12c R1與19c兩個新特性
1.2.1 12c R1新特性OPTIMIZER STATISTICS GATHERING:
# OPTIMIZER STATISTICS GATHERING:12cR1以後的新特性,direct path load時,空表第一次載入資料時會自動收集統計資訊。
# Oracle Database 12c introduced online statistics gathering for CREATE TABLE AS SELECT statements and direct-path inserts.
1.2.2 19c新特性 real-time statistics
Oracle Database 19c introduces
real-time statistics
, which extend online support to
conventional DML statements
. Because statistics can go stale between DBMS_STATS jobs, real-time statistics helps the optimizer generate more optimal plans.Whereas bulk load operations gather all necessary statistics, real-time statistics augment rather than replace traditional statistics.
· Oracle introduced new parameters
· "_optimizer_gather_stats_on_conventional_dml" and "_optimizer_use_stats_on_conventional_dml" which are true by default
· "_optimizer_stats_on_conventional_dml_sample_rate" at 100%
· How does real time statistics works?
· By default the "_optimizer_gather_stats_on_conventional_dml" is true so its automatically kicks off
· When a DML operation is currently modifying a table (conventional), Oracle Database dynamically computes values for the most essential statistics if the above parameter is on.
· Consider a example of table that is having lot of inserts and rows are increasing. Real-time statistics keep track of the increasing row count as rows are being inserted. If the optimizer performs a hard parse of a new query, then the optimizer can use the real-time statistics to obtain a more accurate cost estimate.
· DBA_TAB_COL_STATISTICS and DBA_TAB_STATISITICS has columns NOTES tell real time statistics have been used. STATS_ON_CONVENTIONAL_DML
SELECT NVL(PARTITION_NAME, 'GLOBAL') PARTITION_NAME, NUM_ROWS, BLOCKS, NOTES
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME = 'SALES'
ORDER BY 1, 4;
PARTITION_NAM NUM_ROWS BLOCKS NOTES
------------- ---------- ---------- -------------------------
GLOBAL 1837686 3315 STATS_ON_CONVENTIONAL_DML
1.3 插入大量資料並收集統計資訊
CZH@czhpdb > set autot off CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > insert into test_ffs select * from test_ffs; CZH@czhpdb > commit; CZH@czhpdb > CREATE INDEX IDX_TEST_FFS ON TEST_FFS(EMPLOYEE_ID); CZH@czhpdb > EXEC DBMS_STATS.GATHER_TABLE_STATS(user,’TEST_FFS’,cascade=>true);
1.4 使用Hint /*+ gather_plan_statistics */獲取sql真實執行計劃
# sqlplus 中set autotrace與explain plan for都是CBO預估出來的執行計劃,可能與真實執行的並不相同,我們使用下面hint獲取真實執行計劃。
CZH@czhpdb > SELECT /*+ gather_plan_statistics */ salary from test_ffs where employee_id < 100; no rows selected 真實執行計劃: SYS@orcl2 > select * from table(dbms_xplan.display_cursor('c9qg9su5khysd',null,'allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID c9qg9su5khysd, child number 0 ------------------------------------- SELECT /*+ gather_plan_statistics */ salary from test_ffs where employee_id < 100 Plan hash value: 296244252 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 2288 | |* 1 | TABLE ACCESS FULL| TEST_FFS | 1 | 1 | 0 |00:00:00.01 | 2288 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("EMPLOYEE_ID"<100)
# 可以看到由於表中資料沒有employee_id < 100,我們認為明顯走索引的sql並未選擇索引,那肯定是由於某些原因,cbo認為走索引並不是最優執行路徑,我們就利用10053獲取為什麼cbo認為全表掃描cost更低。
SYS@orcl2 > alter system flush shared_pool; System altered.
# 如果不清空shared_pool或者使遊標失效,軟解析開啟10053事件,不會生成trace檔案。
CZH@czhpdb > ALTER SESSION SET EVENTS '10053 trace name context forever,level 1'; Session altered. CZH@czhpdb > SELECT /*+ gather_plan_statistics */ salary from test_ffs where employee_id < 100; no rows selected CZH@czhpdb > ALTER SESSION SET EVENTS '10053 trace name context off'; Session altered.
19c 10053:
# 可以從下面10053看到DK(distinct key),CLUF(clustering factor),IX_SEL,下一步將根據幾個引數計算為何CBO認為走索引cost會高於全表掃描。
*************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: TEST_FFS Alias: TEST_FFSonline table stats for conventional DML (block count: 2263 row count: 219029) used on (TEST_FFS) block count: 5 -> 2263, row count: 107 -> 219136 #Rows: 219136 SSZ: 0 LGR: 0 #Blks: 2263 AvgRowLen: 69.00 NEB: 0 ChainCnt: 0.00 ScanRate: 0.00 SPC: 0 RFL: 0 RNF: 0 CBK: 0 CHR: 0 KQDFLG: 193 #IMCUs: 0 IMCRowCnt: 0 IMCJournalRowCnt: 0 #IMCBlocks: 0 IMCQuotient: 0.000000 Index Stats:: Index: IDX_TEST_FFS Col#: 1 LVLS: 1 #LB: 458 #DK: 107 LB/K: 4.00 DB/K: 1524.00 CLUF: 163174.00 NRW: 219136.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 KQDFLG: 8192 BSZ: 1 KKEISFLG: 1 try to generate single-table filter predicates from ORs for query block SEL$1 (#0) finally: "TEST_FFS"."EMPLOYEE_ID"<100 ======================================= SPD: BEGIN context at query block level ======================================= Query Block SEL$1 (#0) Return code in qosdSetupDirCtx4QB: NOCTX ===================================== SPD: END context at query block level ===================================== Access path analysis for TEST_FFS *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for TEST_FFS[TEST_FFS] SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE kkecdn: Single Table Predicate:"TEST_FFS"."EMPLOYEE_ID"<100 online column stats for conventional DML used on (TEST_FFS.EMPLOYEE_ID) min: 100.00 -> 100.00, max: 206.00 -> 206.00, nnl: 0 -> 0, acl: 4 -> 0 Column (#1): EMPLOYEE_ID(NUMBER) AvgLen: 22 NDV: 107 Nulls: 0 Density: 0.009346 Min: 100.000000 Max: 206.000000 Using density: 0.009346 of col #1 as selectivity of unpopular value pred Table: TEST_FFS Alias: TEST_FFS Card: Original: 219136.000000 Rounded: 2048 Computed: 2048.000000 Non Adjusted: 2048.000000 Scan IO Cost (Disk) = 615.000000 Scan CPU Cost (Disk) = 49272938.720000 Cost of predicates: io = NOCOST, cpu = 50.000000, sel = 0.009346 flag = 2048 ("TEST_FFS"."EMPLOYEE_ID"<100) Total Scan IO Cost = 615.000000 (scan (Disk)) + 0.000000 (io filter eval) (= 0.000000 (per row) * 219136.000000 (#rows)) = 615.000000 Total Scan CPU Cost = 49272938.720000 (scan (Disk)) + 10956800.000000 (cpu filter eval) (= 50.000000 (per row) * 219136.000000 (#rows)) = 60229738.720000 Access Path: TableScan Cost: 621.167026 Resp: 621.167026 Degree: 0 Cost_io: 615.000000 Cost_cpu: 60229739 Resp_io: 615.000000 Resp_cpu: 60229739 ****** Costing Index IDX_TEST_FFS SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER Using density: 0.009346 of col #1 as selectivity of unpopular value pred Access Path: index (RangeScan) Index: IDX_TEST_FFS resc_io: 1531.000000 resc_cpu: 11906445 ix_sel: 0.009346 ix_sel_with_filters: 0.009346 Cost: 1532.219121 Resp: 1532.219121 Degree: 1 Best:: AccessPath: TableScan Cost: 621.167026 Degree: 1 Resp: 621.167026 Card: 2048.000000 Bytes: 0.000000 online column stats for conventional DML used on (TEST_FFS.SALARY) min: 2100.00 -> 2100.00, max: 24000.00 -> 24000.00, nnl: 0 -> 0, acl: 4 -> 0 ***************************************
2.調整cluster factor
2.1 cluster factor聚簇因子說明
cluster factor表示索引順序與表儲存資料一致性,順序掃描索引時,如果索引鍵值掃描到鍵值對應的表資料行對應的資料塊發生變化時,則cluster factor加1,所以cluster factor最低為表資料塊,最大為表資料行,與表儲存順序高度相關,如果表是按照順序插入,則cluster factor較低,如果表資料為無序插入,則cluster factor較高,這就是為什麼同樣表資料情況下,執行計劃會有時候有差別的原因。
索引掃描成本公式:
INDEX ACCESS I/O COST=BLEVEL+CEIL(#LEAF_BLOCKS*IX_SEL)
TABLE_ACCESS I/O COST=CEIL(CLUSTERING_FACTOR*IX_SEL_WITH_FILTERS)
IX_SEL與IX_SEL_WITH_FILTERS為索引選擇率與索引帶謂詞選擇率,一般為1/(DISTINCT KEY)值,本例中走全表掃描時,IX_SEL=1/107=0.009345,則計算走索引成本為:
ACCESS INDEX COST=INDEX ACCESS I/O COST + TABLE ACCESS I/O COST=2+CEIL(458*0.009345)+CEIL(163174*0.009345)=1540
近似等於CBO預計出來的1532,是高於全表掃的COST 615的,所以選擇走了全表掃描。
2.2 調整cluster factor
重建表,order by排序,降低cluster factor
CZH@czhpdb > create table test_ffs_03 as select * from test_ffs_02 order by employee_id; Table created. CZH@czhpdb > create index idx_test_ffs_03 on test_ffs_03(employee_id); Index created. CZH@czhpdb > select clustering_factor,index_name from user_indexes where index_name='IDX_TEST_FFS_03'; CLUSTERING_FACTOR INDEX_NAME ---------------------------------------- -------------------- 1128 IDX_TEST_FFS_03
# 可以看到cluster factor明顯降低。
CZH@czhpdb > select /*+ gather_plan_statistics */ salary from test_ffs_03 where employee_id < 100; no rows selected SYS@orcl2 > select * from table(dbms_xplan.display_cursor('8fpk2b8vzn5y2',null,'allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 8fpk2b8vzn5y2, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ salary from test_ffs_03 where employee_id < 100 Plan hash value: 704625359 -------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | -------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 2 | 1 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST_FFS_03 | 1 | 1024 | 0 |00:00:00.01 | 2 | 1 | |* 2 | INDEX RANGE SCAN | IDX_TEST_FFS_03 | 1 | 1024 | 0 |00:00:00.01 | 2 | 1 | -------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMPLOYEE_ID"<100)
# 可以看到執行計劃已經選擇了更優的INDEX RANGE SCAN.
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/31439444/viewspace-2696316/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- CLUSTERING_FACTOR影響執行計劃
- 索引及排序對執行計劃的影響索引排序
- oracle cardinality對於執行計劃的影響Oracle
- not-null約束對執行計劃的影響Null
- Oracle優化案例-統計資訊對執行計劃的影響(十三)Oracle優化
- 實驗-資料分佈對執行計劃的影響.txt
- _complex_view_merging對執行計劃的影響View
- 引數Optimizer_index_cost_adj 對執行計劃的影響Index
- 【CURSOR】Oracle繫結變數、執行計劃對遊標的影響Oracle變數
- 收集統計資訊中的no_invalidate選項對執行計劃的影響
- 再說索引與Null值對於Hints及執行計劃的影響索引Null
- 不等號影響執行計劃的相關實驗
- clustering factor索引聚簇因子和執行計劃索引
- 表挪動儲存空間後,對之上的sql的執行計劃的影響的探究SQL
- try catch 對程式碼執行的效能影響
- db_file_multiblock_read_count引數對block讀取和執行計劃的影響BloC
- 執行緒數目對資料庫的影響執行緒資料庫
- 時區調整對job的執行時間的影響
- 影響執行計劃之oracle sql baseline與sql profile之互動OracleSQL
- 對一個執行計劃的疑問
- arraysize/TDU/SDU的設定對sql執行代價的影響SQL
- 載入資料順序對HASH SORT CLUSTER效能影響
- oracle 索引升降序及排序條件 對查詢計劃的影響Oracle索引排序
- MySQL選用可重複讀之前一定要想到的事情(執行計劃影響)MySql
- 執行計劃-1:獲取執行計劃
- 分析執行計劃優化SQLORACLE的執行計劃(轉)優化SQLOracle
- 獨家揭秘丨GreatSQL 的MDL鎖策略升級對執行的影響SQL
- 音樂對程式設計的影響程式設計
- 中美貿易談判對雲端計算行業的影響行業
- 淺析影響專案執行的因素
- 【執行計劃】Oracle獲取執行計劃的幾種方法Oracle
- 【Oracle】-【索引-HINT,執行計劃】-帶HINT的索引執行計劃Oracle索引
- 分析執行計劃最佳化SQLORACLE的執行計劃(轉)SQLOracle
- 執行計劃
- SQL的執行計劃SQL
- 執行計劃的理解.
- 淺析影響專案執行的因素(續)
- 雙下劃線開頭的記憶體引數對Oracle AMM行為的影響記憶體Oracle