[20160624]慎用nvarchar2資料型別.txt

lfree發表於2016-06-24

[20160624]慎用nvarchar2資料型別.txt

--我以前的blog都寫過謹慎使用nvarchar2型別,如果沒有國際化需求,建議不要使用.
--而且這種型別可能導致另外的問題.正好最近最佳化一條sql語句,最後才發現問題所在.

1.環境:

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
IBMPC/WIN_NT64-9.1.0           11.2.0.1.0     Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

UPDATE pacs_image
   SET DELFLAGS = '1'
WHERE insuid = '1.2.840.10008.1.2.1000510567.20160624.4765453.0';

--執行計劃走的是全表掃描,如果檢查可以發現索引是建立的.我一看insuid的查詢值就大概知道可能是直方圖導致的問題.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from (select substr(INSUID,1,32),count(*)  from tjpacs.PACS_IMAGE group by  substr(INSUID,1,32) order by 2 desc) where rownum<=10;
SUBSTR(INSUID,1,32)                 COUNT(*)
--------------------------------- ----------
1.3.12.2.1107.5.4.4.10319.300000       40893
1.3.12.2.1107.5.4.4.10278.300000       26867
1.3.12.2.1107.5.3.58.40106.12.20       15343
1.2.840.10008.1.2.160400058.2016          25
1.2.840.10008.1.2.1000342479.201          16
1.2.840.10008.1.2.1000471555.201          16
1.2.840.10008.1.2.1000353888.201          15
1.2.840.10008.1.2.1000325399.201          15
1.2.840.10008.1.2.1000450679.201          15
1.2.840.10008.1.2.1000407902.201          14
10 rows selected.

SYSTEM@192.168.xx.yyy:1521/orcl> select count(*)   from tjpacs.PACS_IMAGE ;
  COUNT(*)
----------
    184870

--不管怎樣,查詢佔14/184870應該可以很好的使用索引,但是當我查詢直方圖資訊發現:

SYSTEM@192.168.xx.yyy:1521/orcl> select * from DBA_TAB_HISTOGRAMS where table_name='PACS_IMAGE' and column_name='INSUID';
OWNER  TABLE_NAME COLUMN_NAME                             ENDPOINT_NUMBER                     ENDPOINT_VALUE ENDPOINT_A
------ ---------- -------------------- ---------------------------------- ---------------------------------- ----------
TJPACS PACS_IMAGE INSUID                                             2862  993852307125505000000000000000000
TJPACS PACS_IMAGE INSUID                                             5416  993852307130228000000000000000000

--僅僅2個backet,有點奇怪了.

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/desc tjpacs.PACS_IMAGE
Name         Null?    Type
------------ -------- ----------------------------
FILENAME     NOT NULL NVARCHAR2(128)
HOSTNAME     NOT NULL NVARCHAR2(20)
VIRTUAL_DIR           NVARCHAR2(128)
STUDYID      NOT NULL NUMBER
GROUPNO      NOT NULL NUMBER
SERIESID     NOT NULL NUMBER
IMAGEID      NOT NULL NUMBER
INSUID                NVARCHAR2(128)
IMAGENO               NVARCHAR2(10)
IMAGEDATE             DATE
IMAGEDES              NVARCHAR2(128)
BODYPART              NVARCHAR2(128)
IMAGESTATUS           NVARCHAR2(1)
IMAGETYPE             NVARCHAR2(4)
FILESIZE              NUMBER
DELFLAGS              NVARCHAR2(1)
VIEWTAGPATH           NVARCHAR2(200)

--一看錶定義很容易明白問題使用了nvarchar2型別.轉儲看看:

SYSTEM@192.168.xx.yyy:1521/orcl> select dump(INSUID,16) c70  ,insuid c60 from tjpacs.PACS_IMAGE where rownum<=1;
C70                                                                    C60
---------------------------------------------------------------------- ------------------------------------------------------------
Typ=1 Len=110: 0,31,0,2e,0,33,0,2e,0,31,0,32,0,2e,0,32,0,2e,0,31,0,31, 1.3.12.2.1107.5.4.4.10278.30000015111823402501500000000
0,30,0,37,0,2e,0,35,0,2e,0,34,0,2e,0,34,0,2e,0,31,0,30,0,32,0,37,0,38,
0,2e,0,33,0,30,0,30,0,30,0,30,0,30,0,31,0,35,0,31,0,31,0,31,0,38,0,32,
0,33,0,34,0,30,0,32,0,35,0,30,0,31,0,35,0,30,0,30,0,30,0,30,0,30,0,30,
0,30,0,30

--可以發現nvarchar2型別儲存'1'需要2個位元組.這樣直方圖僅僅分析16個字元.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from (select substr(INSUID,1,16),count(*)  from tjpacs.PACS_IMAGE group by  substr(INSUID,1,16) order by 2 desc) where rownum<=10;
SUBSTR(INSUID,1,16)  COUNT(*)
-------------------- --------
1.2.840.10008.1.       101766
1.3.12.2.1107.5.        83110
1.2.840.113619.2            4

--可以發現這樣查詢建立的直方圖就是雞肋.毫無用處,導致oracle認為重複值很多.取消直方圖定義.

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
      OwnName        => 'TJPACS'
     ,TabName        => 'PACS_IMAGE'
    ,Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE
    ,Method_Opt        => 'FOR ALL COLUMNS SIZE auto for columns INSUID size 1 '
    ,Degree            => 4
    ,Cascade           => TRUE
    ,No_Invalidate     => FALSE);
END;
/

SELECT * FROM tjpacs.pacs_image WHERE insuid = '1.2.840.10008.1.2.1000510567.20160624.4765453.0';

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  fmy1xha29r0dg, child number 0
-------------------------------------
SELECT * FROM tjpacs.pacs_image WHERE insuid =
'1.2.840.10008.1.2.1000510567.20160624.4765453.0'
Plan hash value: 3036732012
-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |      1 |        |       |     4 (100)|          |      1 |00:00:00.01 |       5 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PACS_IMAGE        |      1 |      1 |   487 |     4   (0)| 00:00:01 |      1 |00:00:00.01 |       5 |
|*  2 |   INDEX RANGE SCAN          | PACS_IMAGE$INSUID |      1 |      1 |       |     3   (0)| 00:00:01 |      1 |00:00:00.01 |       4 |
-------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1 / PACS_IMAGE@SEL$1
   2 - SEL$1 / PACS_IMAGE@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("INSUID"=U'1.2.840.10008.1.2.1000510567.20160624.4765453.0')

--OK現在可以正常使用索引了.當然如果oracle如果晚上自動分析問題就還原了.最佳化這個問題需要控制直方圖建立.
exec dbms_stats.set_table_prefs('tjpacs','.pacs_image','method_opt','FOR ALL COLUMNS SIZE AUTO FOR columns  INSUID size 1');

SYSTEM@192.168.xx.yyy:1521/orcl> exec dbms_stats.set_table_prefs('tjpacs','pacs_image','method_opt','FOR ALL COLUMNS SIZE AUTO FOR columns  INSUID size 1');
PL/SQL procedure successfully completed.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from SYS.OPTSTAT_USER_PREFS$;
OBJ# PNAME          VALNUM VALCHAR                                                      CHGTIME                           SPARE1
----- ---------- ---------- ------------------------------------------------------------ --------------------------------- ------
73788 METHOD_OPT            FOR ALL COLUMNS SIZE AUTO FOR COLUMNS  INSUID SIZE 1         2016-06-24 15:35:10.005000 +08:00

--另外這樣建立的索引鍵值太長,索引佔用空間也很大.

SYSTEM@192.168.xx.yyy:1521/orcl> validate index tjpacs.PACS_IMAGE$INSUID;
Index analyzed.

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/i
    HEIGHT     BLOCKS NAME          LF_ROWS    LF_BLKS LF_ROWS_LEN LF_BLK_LEN    BR_ROWS    BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS35M
---------- ---------- ---------- ---------- ---------- ----------- ---------- ---------- ---------- ----------- ---------- ----------- --------------- -------------
         3       4480 PACS_IMAGE     184892       4317    20799904       8000       4316         84      372844       8032           2             244        181921
                      $INSUID


MOST_REPEATED_KEY BTREE_SPACE USED_SPACE   PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS   PRE_ROWS PRE_ROWS_LEN OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------------- ----------- ---------- ---------- ------------ -------------------- ---------- ------------ -------------- ----------------
                3    35210688   21172748         61   1.01633126           4.00816563          0            0              0                0

--一個塊僅僅儲存4個鍵值,索引達到了35M,.而表104M.

總之:在生產系統要慎用Nvarchar2型別.不要打著什麼國際化的幌子騙人,開發應該認真瞭解這種資料型別,實際上這樣的程式就是垃圾.
再回頭非常困難.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2120925/,如需轉載,請註明出處,否則將追究法律責任。

相關文章