Pipelined table function statistics and dynamic sampling
This is an extraction of Adrian Billington's article:
At some point, you might need to join a pipelined function to another rowsource (such as a table, a view, or the intermediate output of other joins within a SQL execution plan). Rowsource statistics (such as cardinality, data distribution, nulls, etc) are critical to achieving efficient execution plans, but in the case of pipelined functions (or indeed any table function), the cost-based optimizer doesn't have much information to work with.
cardinality heuristics for pipelined table functions
Up to and including Oracle Database 11g Release 1, the CBO applies a heuristic cardinality to pipelined and table functions in SQL statements and this can sometimes lead to inefficient execution plans. The default cardinality appears to be dependent on the value of the DB_BLOCK_SIZE initialization parameter, but on a database with a standard 8Kb block size Oracle uses a heuristic of 8,168 rows. I can demonstrate this quite easily with a pipelined function that pipes a subset of columns from the employees table. Using Autotrace in SQL*Plus to generate an execution plan, I see the following.
/* Files on web: cbo_setup.sql and cbo_test.sql */
SQL> SELECT *
2 FROM TABLE(pipe_employees) e;
Execution Plan
----------------------------------------------------------
Plan hash value: 1802204150
--------------------------------------------------------------------
| Id | Operation | Name | Rows |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| PIPE_EMPLOYEES | |
--------------------------------------------------------------------
This pipelined function actually returns 50,000 rows, so if I join this pipelined function to the departments table, I run the risk of getting a suboptimal plan.
/* File on web: cbo_test.sql */
SQL> SELECT *
2 FROM departments d
3 , TABLE(pipe_employees) e
4 WHERE d.department_id = e.department_id;
Execution Plan
----------------------------------------------------------
Plan hash value: 4098497386
----------------------------------------------------------------------
| Id | Operation | Name | Rows |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 |
| 1 | MERGE JOIN | | 8168 |
| 2 | TABLE ACCESS BY INDEX ROWID | DEPARTMENTS | 27 |
| 3 | INDEX FULL SCAN | DEPT_ID_PK | 27 |
|* 4 | SORT JOIN | | 8168 |
| 5 | COLLECTION ITERATOR PICKLER FETCH| PIPE_EMPLOYEES | |
----------------------------------------------------------------------
As predicted, this appears to be a suboptimal plan; it is unlikely that a sort-merge join will be more efficient than a hash join in this scenario. So how do I influence the CBO? For this example I could use simple access hints such as LEADING and USE_HASH to effectively override the CBO’s cost-based decision and secure a hash join between the table and pipelined function. However, for more complex SQL statements, it is quite difficult to provide all the hints necessary to “lock down” an execution plan. It is often far better to provide the CBO with better statistics with which to make its decisions.
Optimizer dynamic sampling
This feature was enhanced in Oracle Database 11g (11.1.0.7) to include sampling for table and pipelined functions;
Dynamic sampling is an extremely useful feature that enables the optimizer to take a small statistics sample of one or more objects in a query during the parse phase. You might use dynamic sampling when you haven’t gathered statistics on all of your tables in a query or when you are using transient objects such as global temporary tables. Starting with version 11.1.0.7, the Oracle database is able to use dynamic sampling for table or pipelined functions.
To see what difference this feature can make, I’ll repeat my previous query but include a DYNAMIC_SAMPLING hint for the pipe_employees function.
/* File on web: cbo_test.sql */
SQL> SELECT /*+ DYNAMIC_SAMPLING(5) */
2 *
3 FROM departments d
4 , TABLE(pipe_employees) e
5 WHERE d.department_id = e.department_id;
Execution Plan
----------------------------------------------------------
Plan hash value: 815920909
---------------------------------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 |
|* 1 | HASH JOIN | | 50000 |
| 2 | TABLE ACCESS FULL | DEPARTMENTS | 27 |
| 3 | COLLECTION ITERATOR PICKLER FETCH| PIPE_EMPLOYEES | |
---------------------------------------------------------------------
This time, the CBO has correctly computed the 50,000 rows that my function returns and has generated a more suitable plan. Note that I used the word “computed” and not “estimated” because in version 11.1.0.7 and later, the optimizer takes a 100% sample of the table or pipelined function, regardless of the dynamic sampling level being used (this is also the case in Oracle Database 11g Release 2). I used level 5, but I could have used anything between level 2 and level 10 to get exactly the same result. This means, of course, that dynamic sampling can be potentially costly or time-consuming if it is being used for queries involving high-volume or long-running pipelined functions.
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/638844/viewspace-772753/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- Multi-table insert using pipelined functionFunction
- Oracle Pipelined FunctionOracleFunction
- dbms_stats.lock_table_stats與動態取樣(Dynamic Sampling)
- Oracle Pipelined Table(轉)Oracle
- Oracle Pipelined Table FunctionsOracleFunction
- Oracle Pipelined Table Functions(轉)OracleFunction
- improving performance with pipelined table functionsORMFunction
- oracle10g的dynamic samplingOracle
- Oracle Table FunctionOracleFunction
- Maintain profile parameter dynamic with function.AIFunction
- 優化器革命之-Dynamic Sampling(二)優化
- 優化器革命之-Dynamic Sampling(五)優化
- 優化器革命之-Dynamic Sampling(四)優化
- 優化器革命之-Dynamic Sampling(三)優化
- optimizer_dynamic_sampling引數的理解
- Flink Table/SQL API 規劃 —— Dynamic TableSQLAPI
- oracle動態取樣_optimizer_dynamic_samplingOracle
- oracle動態取樣_optimizer_dynamic_sampling(二)Oracle
- ptimizer_dynamic_sampling設定為4的作用。
- sql profile禁用oracle動態取樣dynamic samplingSQLOracle
- SciTech-Mathmatics-Probability+Statistics-Applications : Probability&Sampling : Sampling Distribution + Central Limit TheoremAPPMITREM
- Oracle 的管道化表函式(Pipelined Table) 轉Oracle函式
- C# return dynamic/anonymous type value as function resultC#Function
- SciTech-Mathmatics-Probability+Statistics-Population Vs. Sampling: Representative Samples + How to obtain SamplesAI
- oracle動態取樣dynamic sampling hint為何不生效一點思考Oracle
- Entity Framework Tutorial Basics(34):Table-Valued FunctionFrameworkFunction
- 我眼中的 Nginx(二):HTTP/2 dynamic table size updateNginxHTTP
- SciTech-Mathmatics-Probability+Statistics-Population:Region-Sampling of Region : Confidence Interval(置信區間)IDE
- SAP editing function is activated to modiy table content.Function
- Oracle 11g中dynamic sampling自動調節(auto-adjusted)機制Oracle
- Pipelined FunctionsFunction
- SAP Table function 執行報錯 feature not supported 該如何分析Function
- 觀察analyze table compute statistics 都對什麼物件統計了資訊 - 轉物件
- Importance samplingImport
- pipelined函式例項函式
- javascript 中function(){},new function(),new Function(),Function 摘錄JavaScriptFunction
- Database StatisticsDatabase
- DELETE STATISTICSdelete