NULL Values and Indexes (Part 2)

Posted by bfc99 on 2014-06-27

Reposted from: http://blog.csdn.net/leshami/article/details/7438397 Author: Leshami

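NULL Values and Indexes (Part 1) covered the basics of how null values interact with indexes. Its main points were: first, for an indexed column that allows nulls, the null values are not stored in the index; second, this is why a query using is null cannot use the index; and finally, adding a not null constraint to a column of the index lets is null use the index. Although a not null constraint solves the is null problem, in practice there are still many columns whose values cannot be known in advance and which therefore must remain nullable. How should that case be handled?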

I. Using a function-based index so that is null can use an index

    --> Demo environment
    scott@ORCL> select * from v$version where rownum<2;

    BANNER
    ----------------------------------------------------------------
    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
			
    --> Create test table t2
    scott@ORCL> create table t2(obj_id,obj_name) as select object_id,object_name from dba_objects;

    Table created.
			
    --> Show that table t2 has no not null constraints
    scott@ORCL> desc t2
     Name                          Null?    Type
     ----------------------------- -------- --------------------
     OBJ_ID                                 NUMBER
     OBJ_NAME                               VARCHAR2(128)
			
    --> Create an ordinary B-tree index on table t2
    scott@ORCL> create index i_t2_obj_id on t2(obj_id);

    Index created.
			
    --> Set obj_id to null for the rows of t2 where obj_id<=100
    --> Note: in Oracle 10g an empty string is equivalent to null
    scott@ORCL> update t2 set obj_id='' where obj_id<=100;

    99 rows updated.
			
    --> The query below also confirms that the empty string is treated as null here
    scott@ORCL> set null unknown
    scott@ORCL> select * from t2 where obj_id is null and rownum<3;

        OBJ_ID OBJ_NAME
    ---------- ------------------------------
    unknown    ICOL$
    unknown    I_USER1
			
    --> Gather statistics
    scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T2',cascade=>true);

    PL/SQL procedure successfully completed.
			
    --> An is not null predicate on the nullable column uses an index scan, as described in NULL Values and Indexes (Part 1)
    scott@ORCL> select count(*) from t2 where obj_id is not null;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3840858596

    -------------------------------------------------------------------------------------
    | Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |             |     1 |     5 |     7   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE       |             |     1 |     5 |            |          |
    |*  2 |   INDEX FAST FULL SCAN| I_T2_OBJ_ID | 11719 | 58595 |     7   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJ_ID" IS NOT NULL)
			
    --> The predicate obj_id is null results in a full table scan
    scott@ORCL> select count(*) from t2 where obj_id is null;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3321871023

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |     5 |    13   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| T2   |     1 |     5 |    13   (0)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("OBJ_ID" IS NULL)
			
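    --> Create a function-based index so that the lookup of null rows can use an index
    --> The index below is built on the nvl function, so -1 is stored whenever obj_id is null
    scott@ORCL> create index i_fn_t2_obj_id on t2(nvl(obj_id,-1));

    Index created.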
			
    --> Gather index statistics
    scott@ORCL> exec dbms_stats.gather_index_stats('SCOTT','I_FN_T2_OBJ_ID');

    PL/SQL procedure successfully completed.
			
    --> The execution plan below shows that the newly created function-based index I_FN_T2_OBJ_ID is being used
    scott@ORCL> select count(*) from t2 where nvl(obj_id,-1) = -1;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3983750858

    ------------------------------------------------------------------------------------
    | Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                |     1 |     5 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE   |                |     1 |     5 |            |          |
    |*  2 |   INDEX RANGE SCAN| I_FN_T2_OBJ_ID |   100 |   500 |     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access(NVL("OBJ_ID",(-1))=(-1))
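
The NVL approach only works if the query repeats the indexed expression exactly and if the substitute value (-1 here) never occurs as a real obj_id. The following is a minimal sketch of one way to guard both points, not part of the original demo: the view name v_t2 and the column alias obj_id_nvl are hypothetical, and whether the optimizer merges the view and picks I_FN_T2_OBJ_ID should be confirmed with an execution plan on your own system.

```sql
-- Sanity check (assumption: -1 is not a legitimate obj_id).
-- If this returns anything other than 0, choose a different substitute value.
select count(*) from t2 where obj_id = -1;

-- Hypothetical view that exposes the indexed expression under a column name,
-- so callers do not have to repeat nvl(obj_id,-1) in every predicate.
create or replace view v_t2 as
select obj_id,
       obj_name,
       nvl(obj_id, -1) as obj_id_nvl
from   t2;

-- After view merging the predicate is nvl(obj_id,-1) = -1, which matches
-- the expression of i_fn_t2_obj_id, so the index can be considered.
select count(*) from v_t2 where obj_id_nvl = -1;
```

Keeping the expression in one place reduces the risk that a slightly different predicate, such as nvl(obj_id,0) = 0, silently falls back to a full table scan.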

II. Using a "pseudo-column" (a constant) in a function-based index so that is null can use an index

    --> Create an index that adds an extra constant column with the value -1 (any value will do)
    scott@ORCL> create index i_new_t2_obj_id on t2(obj_id,-1);

    Index created.
			
    --> Gather index statistics
    scott@ORCL> exec dbms_stats.gather_index_stats('SCOTT','I_NEW_T2_OBJ_ID');

    PL/SQL procedure successfully completed.
			
    --> The query below shows that obj_id is null now uses the index just created
    scott@ORCL> select count(*) from t2 where obj_id is null;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 801885198

    -------------------------------------------------------------------------------------
    | Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                 |     1 |     5 |     2   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE   |                 |     1 |     5 |            |          |
    |*  2 |   INDEX RANGE SCAN| I_NEW_T2_OBJ_ID |    99 |   495 |     2   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("OBJ_ID" IS NULL)
			
    --> View the statistics of all the indexes created so far
    scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys
      2  from user_indexes where table_name='T2';

    INDEX_NAME      INDEX_TYPE                         BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DISTINCT_KEYS
    --------------- ------------------------------ ---------- ----------- ---------- -------- -------------
    I_FN_T2_OBJ_ID  FUNCTION-BASED NORMAL                   1          26      11719 VALID            11621
    I_NEW_T2_OBJ_ID FUNCTION-BASED NORMAL                   1          32      11719 VALID            11621
    I_T2_OBJ_ID     NORMAL                                  1          25      11620 VALID            11620
			
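    --> From the results above:
    --> The plain B-tree index (I_T2_OBJ_ID) uses the fewest leaf blocks because null values are not stored; NUM_ROWS and DISTINCT_KEYS confirm this
    --> The NVL-based index I_FN_T2_OBJ_ID does account for the null rows: 11620 distinct values + the value substituted for null = 11621
    --> The index built with the constant column is still reported as function-based and uses the most leaf blocks, because every entry stores one extra value (-1)
    --> The NVL-based index takes less disk space than the constant-column index, but predicates must then be written with the NVL function, whereas with the constant-column index the predicate can use is null directly.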
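
To see why both I_FN_T2_OBJ_ID and I_NEW_T2_OBJ_ID are reported as FUNCTION-BASED NORMAL, it can help to list the expressions stored for each index. The query below is a small sketch against the standard USER_IND_EXPRESSIONS dictionary view; it assumes the three indexes from the demo still exist in the current schema, and the exact text shown for each expression may differ between Oracle versions.

```sql
-- COLUMN_EXPRESSION is a LONG column, so widen the SQL*Plus display first.
set long 200

-- One row per expression-based key column; plain columns such as obj_id
-- in i_t2_obj_id do not appear here (see user_ind_columns for those).
select index_name,
       column_position,
       column_expression
from   user_ind_expressions
where  table_name = 'T2'
order  by index_name, column_position;
```

If the constant -1 appears as the second key position of I_NEW_T2_OBJ_ID, that is the extra value each entry carries, which is consistent with the higher LEAF_BLOCKS figure reported above.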

III. A further technique derived from NULL values and indexes

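    --> The examples above show once more that NULL values are not stored in the index; this property can be exploited with the decode function to shrink the index on a column.
    --> A common real-world case: a table has a print-status column is_printed that takes two states, printed or not printed; assume 1 means printed and 0 means not printed.
    --> Typically more than 90% of the documents are already printed and only about 10% are not, while the frequent query is to find the unprinted documents and print them again.
    --> A bitmap index could handle this, but B-tree indexes are the topic here, so that option is set aside (or you may be running a non-Enterprise edition of Oracle, which does not support bitmap indexes).
    --> For this kind of case the decode function offers a solution.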
			
    --> Update the table so that rows with obj_id = 1 are the vast majority
    scott@ORCL> update t2 set obj_id=1 where obj_id is not null;

    11620 rows updated.
			
    --> Update the table so that rows with obj_id = 0 are a small minority
    scott@ORCL> update t2 set obj_id = 0 where obj_id is null;

    99 rows updated.

    scott@ORCL> commit;
			
    --> Gather statistics
    scott@ORCL> exec dbms_stats.gather_table_stats('SCOTT','T2',cascade=>true);

    PL/SQL procedure successfully completed.
			
    --> Final distribution of the obj_id column in table t2
    scott@ORCL> select obj_id,count(*) from t2 group by obj_id;

        OBJ_ID   COUNT(*)
    ---------- ----------
             1      11620
             0         99
			
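    --> Create an index using the decode function
    --> Note how decode is used: any obj_id other than 0 is mapped to null, and because nulls are not stored in the index, the large majority of rows (obj_id = 1) are not indexed
    scott@ORCL> create index i_fn2_t2_obj_id on t2(decode(obj_id,0,0,null));

    Index created.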
			
    --> Gather statistics on the index
    scott@ORCL> exec dbms_stats.gather_index_stats('SCOTT','I_FN2_T2_OBJ_ID');

    PL/SQL procedure successfully completed.
			
    --> View the execution plan that uses the new index
    scott@ORCL> set autot trace exp;
    scott@ORCL> select count(*) from t2 where decode(obj_id,0,0,null) = 0;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1461308992

    -------------------------------------------------------------------------------------
    | Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                 |     1 |     3 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE   |                 |     1 |     3 |            |          |
    |*  2 |   INDEX RANGE SCAN| I_FN2_T2_OBJ_ID |    98 |   294 |     1   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access(DECODE("OBJ_ID",0,0,NULL)=0)
			
    --> A query written directly as obj_id = 0 uses the plain B-tree index
    scott@ORCL> select count(*) from t2 where obj_id = 0;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1804118247

    ---------------------------------------------------------------------------------
    | Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |             |     1 |     3 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE   |             |     1 |     3 |            |          |
    |*  2 |   INDEX RANGE SCAN| I_T2_OBJ_ID |    99 |   297 |     1   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("OBJ_ID"=0)
			
    --> A query on obj_id = 1 uses a full table scan: obj_id = 1 covers more than 90% of the table, so the CBO chooses a full scan
    scott@ORCL> select * from t2 where obj_id = 1;

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1513984157

    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      | 11620 |   249K|    14   (8)| 00:00:01 |
    |*  1 |  TABLE ACCESS FULL| T2   | 11620 |   249K|    14   (8)| 00:00:01 |
    --------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       1 - filter("OBJ_ID"=1)
			
    --> Statistics for all indexes on table t2
    scott@ORCL> select index_name,index_type,blevel,leaf_blocks,num_rows,status,distinct_keys
      2  from user_indexes where table_name='T2';

    INDEX_NAME      INDEX_TYPE                         BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DISTINCT_KEYS
    --------------- ------------------------------ ---------- ----------- ---------- -------- -------------
    I_FN_T2_OBJ_ID  FUNCTION-BASED NORMAL                   1          40      11719 VALID                2
    I_NEW_T2_OBJ_ID FUNCTION-BASED NORMAL                   1          52      11719 VALID                2
    I_FN2_T2_OBJ_ID FUNCTION-BASED NORMAL                   0           1         99 VALID                1
    I_T2_OBJ_ID     NORMAL                                  1          40      11719 VALID                2
			
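    --> The results above show that index I_FN2_T2_OBJ_ID stores only 99 entries and has a DISTINCT_KEYS value of 1, because every non-zero value was mapped to NULL.
    --> This approach shrinks the index, avoids the maintenance overhead of a larger index, and also improves query performance.
    --> Author : Robinson Cheng
    --> Blog :   http://blog.csdn.net/robinson_0612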
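
The demo above reuses t2.obj_id as a stand-in for a status flag. Applied to the print-status scenario described at the start of this section, the same idea might look like the sketch below. The table doc_print, its columns, and the index name i_doc_unprinted are hypothetical and chosen only for illustration.

```sql
-- Hypothetical documents table: 1 = printed, 0 = not printed.
create table doc_print (
  doc_id     number        not null,
  doc_name   varchar2(128),
  is_printed number(1)     not null
);

-- Only unprinted rows produce a non-null key, so only they are indexed;
-- printed rows map to null and take no space in the index.
create index i_doc_unprinted on doc_print (decode(is_printed, 0, 0, null));

-- The predicate has to repeat the indexed expression for the index
-- to be considered; a plain "is_printed = 0" would not match it.
select doc_id, doc_name
from   doc_print
where  decode(is_printed, 0, 0, null) = 0;
```

When a document is printed and is_printed changes from 0 to 1, its entry simply drops out of the index, so the index stays roughly as large as the backlog of unprinted documents.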

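IV. Summary
   1. Columns used in joins or frequently referenced in predicates should avoid being nullable wherever possible, because nullability easily prevents index use.
   2. Give a default value to columns that would otherwise hold NULL values (alter table tb modify(col default 'Y')).
   3. If NULL values cannot be avoided and a default value is not an option, consider creating an index with the nvl function on the frequently used column, or creating an index with an extra constant column, to improve query performance.
   4. For a composite index, make sure at least one of its columns is not null. An entry is left out of the index only when all indexed columns are NULL, so this ensures that is null predicates can use the index.
   5. For a composite index, place the not null constraint on the column with the shortest data type in order to save disk space.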
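
Items 2 and 4 can be sketched as follows. This is an illustration under stated assumptions rather than a recipe from the original article: the table tb and its columns id, col, and col2 are hypothetical. Note that a default by itself only applies to inserts that omit the column, so the sketch backfills existing rows and adds not null as well.

```sql
-- Hypothetical table used only for this sketch.
create table tb (
  id   number      not null,
  col  varchar2(1),
  col2 number
);

-- Item 2: backfill existing NULLs first, then add the default together
-- with a not null constraint so that explicit NULLs are rejected too.
update tb set col = 'Y' where col is null;
alter table tb modify (col default 'Y' not null);

-- Item 4: col2 stays nullable, but id is not null, so every row has
-- an entry in this composite index, including rows where col2 is null.
create index i_tb_col2_id on tb (col2, id);

-- This predicate can therefore be answered from the index.
select count(*) from tb where col2 is null;
```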

Source: ITPUB Blog, link: http://blog.itpub.net/22207394/viewspace-1196365/. If you repost this article, please credit the source; otherwise legal action may be pursued.
