[20190827]函式索引與選擇率.txt
[20190827]函式索引與選擇率.txt
--//一般情況下查詢條件使用函式,選擇率1%.如果建立虛擬列並且分析的情況下,能夠獲得比較準確的選擇率。
--//如果建立對應的函式索引呢,情況會怎麼呢?最近遇到的問題,讓我注意其中一些細節。
1.環境:
SCOTT@test01p> @ ver1
PORT_STRING VERSION BANNER CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0 12.2.0.1.0 Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 0
SCOTT@test01p> create table t1 as select rownum id,dbms_random.string('U',10) v1 from dual connect by level<=1e4;
Table created.
--//分析略.Method_Opt => 'FOR ALL COLUMNS SIZE 1 '
SCOTT@test01p> select * from t1 where rownum=1;
ID V1
---------- --------------------
1 KCIXTFQWHZ
2.測試:
SCOTT@test01p> alter session set statistics_level = all;
Session altered.
SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
ID V1
---------- --------------------
1 KCIXTFQWHZ
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'
Plan hash value: 3617692013
--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 10 (100)| | 1 |00:00:00.01 | 41 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 100 | 1500 | 10 (0)| 00:00:01 | 1 |00:00:00.01 | 41 |
--------------------------------------------------------------------------------------------------------------------
--//E-Rows=100,
--//選擇率100/10000 = .01.
3.如果建立對應的函式索引呢?
SCOTT@test01p> create index if_t1_v1 on t1( lower(v1));
Index created.
SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
ID V1
---------- --------------------
1 KCIXTFQWHZ
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'
Plan hash value: 544604454
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 9 (100)| | 1 |00:00:00.01 | 4 | 1 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 100 | 1500 | 9 (0)| 00:00:01 | 1 |00:00:00.01 | 4 | 1 |
|* 2 | INDEX RANGE SCAN | IF_T1_V1 | 1 | 40 | | 1 (0)| 00:00:01 | 1 |00:00:00.01 | 3 | 1 |
---------------------------------------------------------------------------------------------------------------------------------------------------
--//E-Rows=40.
--//選擇率40/10000 = .004,這個是我以前沒有注意到的,我一直以為選擇率還是1%.不知道這個預設的選擇率如何確定的.
4.繼續探究:
--//實際上建立函式索引時有1個細節不容忽視,就是該隱含欄位缺乏統計資訊,我曾經在這裡栽過跟頭.
SCOTT@test01p> SCOTT@test01p> @ tab_lh scott t1 ''
DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT OWNER TABLE_NAME COLUMN
SAMPLE : @ TAB_LH TABLE_NAME [COLUMN_NAME]
IF NOT INPUT COLUMN_NAME ,USE "" .
COLUMN_NAME DATA_TYPE DATA_LENGTH N NUM_DISTINCT DENSITY SAMPLE_SIZE TRANS_LOW TRANS_HIGH NUM_NULLS NUM_BUCKETS LAST_ANALYZED HISTOGRAM DATA_DEFAULT
------------- --------- ----------- - ------------ ---------- ----------- ---------- ---------- --------- ----------- ------------------- --------- -------------
ID NUMBER 22 Y 10000 .0001 10000 1 10000 0 1 2019-08-27 21:35:55 NONE
V1 VARCHAR2 4000 Y 10000 .0001 10000 AAAXTDLLNT ZZZBEIPJYY 0 1 2019-08-27 21:35:55 NONE
SYS_NC00003$ VARCHAR2 4000 Y NONE LOWER("V1")
--//必須要重新分析表.
execute sys.dbms_stats.gather_table_stats ( OwnName => nvl('',user),TabName => 't1',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)
SCOTT@test01p> @ tab_lh scott t1 SYS_NC00003$
DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT OWNER TABLE_NAME COLUMN
SAMPLE : @ TAB_LH TABLE_NAME [COLUMN_NAME]
IF NOT INPUT COLUMN_NAME ,USE "" .
COLUMN_NAME DATA_TYPE DATA_LENGTH N NUM_DISTINCT DENSITY SAMPLE_SIZE TRANS_LOW TRANS_HIGH NUM_NULLS NUM_BUCKETS LAST_ANALYZED HISTOGRAM DATA_DEFAULT
------------ --------- ----------- - ------------ ---------- ----------- ---------- ---------- --------- ----------- ------------------- --------- ------------
SYS_NC00003$ VARCHAR2 4000 Y 10000 .0001 10000 aaaxtdllnt zzzbeipjyy 0 1 2019-08-27 21:46:06 NONE LOWER("V1")
SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
ID V1
---------- --------------------
1 KCIXTFQWHZ
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'
Plan hash value: 544604454
------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 2 (100)| | 1 |00:00:00.01 | 4 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 1 | 26 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 4 |
|* 2 | INDEX RANGE SCAN | IF_T1_V1 | 1 | 1 | | 1 (0)| 00:00:01 | 1 |00:00:00.01 | 3 |
------------------------------------------------------------------------------------------------------------------------------------------
--//這樣E-Rows估算就比較準確了.
5.附上tab_lh.sql指令碼:
/* Formatted on 2014/10/19 20:53:41 (QP5 v5.227.12220.39754) */
PROMPT
PROMPT DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
PROMPT INPUT OWNER TABLE_NAME COLUMN
PROMPT SAMPLE : @ TAB_LH TABLE_NAME [COLUMN_NAME]
PROMPT IF NOT INPUT COLUMN_NAME ,USE "" .
PROMPT
column trans_low format a32
column trans_high format a32
column data_default format a20
column column_name format a24
SELECT
--owner,
-- table_name,
column_name,
data_type,
data_length,
nullable,
num_distinct,
density,
sample_size,
CASE
WHEN data_type IN ('CHAR', 'VARCHAR2')
THEN
UTL_RAW.cast_to_varchar2 (low_value)
WHEN data_type = 'NUMBER'
THEN
TO_CHAR (UTL_RAW.cast_to_number (low_value))
WHEN data_type = 'DATE'
THEN
RTRIM (
LTRIM (
TO_CHAR (
100
* ( TO_NUMBER (SUBSTR (low_value, 1, 2), 'XX')
- 100)
+ (TO_NUMBER (SUBSTR (low_value, 3, 2), 'XX') - 100),
'0000'))
|| '-'
|| LTRIM (
TO_CHAR (TO_NUMBER (SUBSTR (low_value, 5, 2), 'XX'),
'00'))
|| '-'
|| LTRIM (
TO_CHAR (TO_NUMBER (SUBSTR (low_value, 7, 2), 'XX'),
'00'))
|| ' '
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (low_value, 9, 2), 'XX') - 1,
'00'))
|| ':'
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (low_value, 11, 2), 'XX') - 1,
'00'))
|| ':'
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (low_value, 13, 2), 'XX') - 1,
'00')))
END
trans_low,
CASE
WHEN data_type IN ('CHAR', 'VARCHAR2')
THEN
UTL_RAW.cast_to_varchar2 (high_value)
WHEN data_type = 'NUMBER'
THEN
TO_CHAR (UTL_RAW.cast_to_number (high_value))
WHEN data_type = 'DATE'
THEN
RTRIM (
LTRIM (
TO_CHAR (
100
* ( TO_NUMBER (SUBSTR (high_value, 1, 2), 'XX')
- 100)
+ (TO_NUMBER (SUBSTR (high_value, 3, 2), 'XX') - 100),
'0000'))
|| '-'
|| LTRIM (
TO_CHAR (TO_NUMBER (SUBSTR (high_value, 5, 2), 'XX'),
'00'))
|| '-'
|| LTRIM (
TO_CHAR (TO_NUMBER (SUBSTR (high_value, 7, 2), 'XX'),
'00'))
|| ' '
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (high_value, 9, 2), 'XX') - 1,
'00'))
|| ':'
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (high_value, 11, 2), 'XX') - 1,
'00'))
|| ':'
|| LTRIM (
TO_CHAR (
TO_NUMBER (SUBSTR (high_value, 13, 2), 'XX') - 1,
'00')))
END
trans_high,
num_nulls,
num_buckets,
last_analyzed,
histogram,
data_default
FROM dba_tab_cols
WHERE owner = decode('&1','',user,upper('&1'))
AND table_name = upper('&2')
AND column_name = decode('&&3','',column_name,upper('&&3'))
ORDER BY column_id
/
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2655338/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 第十三篇:成員函式與非成員函式的選擇函式
- [20120612]函式索引中使用substr函式.txt函式索引
- [20170402]函式索引standard_hash.txt函式索引
- 分割槽表全域性索引與本地索引的選擇索引
- jQuery----函式和選擇器jQuery函式
- [20180509]函式索引問題.txt函式索引
- [20241012]cursor_sharing=force與函式索引.txt函式索引
- 選擇率(selectivity)與基數(cardinality)
- 【Hello CSS】第五章-CSS的選擇器與函式CSS函式
- 啟用函式(activation functions)二三事-性質、作用與選擇函式Function
- MySQL字首索引和索引選擇性MySql索引
- [20180212]函式索引問題.txt函式索引
- Elasticsearch 中為什麼選擇倒排索引而不選擇 B 樹索引Elasticsearch索引
- 複合索引與函式索引優化一例索引函式優化
- 索引的選擇原則索引
- SVM 的核函式選擇和調參函式
- C++ 函式過載,函式模板和函式模板過載,選擇哪一個?C++函式
- [20211231]函式索引測試.txt函式索引
- 函式索引陷阱函式索引
- oracle函式索引Oracle函式索引
- 函式索引使用細節——自定義函式的索引化函式索引
- pandas索引和選擇資料索引
- 資料庫索引選擇策略資料庫索引
- 資料型別與函式索引-PostgreSQL篇資料型別函式索引SQL
- 資料型別與函式索引-MySQL篇資料型別函式索引MySql
- 資料型別與函式索引-Oracle篇資料型別函式索引Oracle
- [20190918]關於函式索引問題.txt函式索引
- [20171202]關於函式索引的狀態.txt函式索引
- 機率論12 矩與矩生成函式函式
- [20120130]函式索引與取max值的問題1.txt函式索引
- [20120131]函式索引與取max值的問題2.txt函式索引
- [20120131]函式索引與取max值的問題3.txt函式索引
- 14-多表連線、索引選擇、顯式遊標和其他索引
- Oracle之函式索引Oracle函式索引
- 說說函式索引函式索引
- 索引中使用函式索引函式
- 深度神經網路(DNN)損失函式和啟用函式的選擇神經網路DNN函式
- [20150803]使用函式索引注意的問題.txt函式索引