在oracle 10.2.0.4上使用字符集掃描工具Csscan(二)

2008081036發表於2014-06-17

二、執行Csscan

1、檢視csscan相關引數

HQ-CICUTESTDB-01:cicutedb>csscan help=y

Character Set Scanner v2.1 : Release 10.2.0.3.0 - Production on Tue Jun 17 03:26:22 2014

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

You can let Scanner prompt you for parameters by entering the CSSCAN        
command followed by your username/password:                                 
                                                                            
  Example: CSSCAN SYSTEM/MANAGER                                            
                                                                            
Or, you can control how Scanner runs by entering the CSSCAN command         
followed by various parameters. To specify parameters, you use keywords:    
                                                                            
  Example: CSSCAN SYSTEM/MANAGER FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3  
                                                                            
Keyword    Default Prompt Description                                       
---------- ------- ------ ------------------------------------------------- 
USERID             yes    username/password                                 
FULL       N       yes    scan entire database                              
USER               yes    owner of tables to be scanned                     
TABLE              yes    list of tables to scan                            
COLUMN             yes    list of columns to scan                            
EXCLUDE                   list of tables to exclude from scan               
TOCHAR             yes    new database character set name                   
FROMCHAR                  current database character set name               
TONCHAR                   new national character set name                   
FROMNCHAR                 current national character set name               
ARRAY      1024000 yes    size of array fetch buffer                        
PROCESS    1       yes    number of concurrent scan process                 
MAXBLOCKS                 split table if block size exceed MAXBLOCKS        
CAPTURE    N              capture convertible data                          
SUPPRESS                  maximum number of exceptions logged for each table
FEEDBACK                  report progress every N rows                      
BOUNDARIES                list of column size boundaries for summary report 
LASTRPT    N              generate report of the last database scan         
LOG        scan           base file name of report files                    
PARFILE                   parameter file name                               
PRESERVE   N              preserve existing scan results                    
LCSD       N       no     enable language and character set detection       
LCSDDATA   LOSSY   no     define the scope of the detection                 
HELP       N              show help screen (this screen)                    
QUERY      N              select clause to scan subset of tables or columns 
---------- ------- ------ ------------------------------------------------- 
Scanner terminated successfully.

2、執行csscan

[oracle@HQ-CICUTESTDB-01 ~]$ csscan userid="'"sys/oracle as sysdba"'" full=y fromchar=AL32UTF8 tochar=UTF8 log=cssan_check process=4 array=1024000

。。。省略
. process 1 scanning SYS.WRH$_ACTIVE_SESSION_HISTORY[AAAlypAADAAABPBAAA]
. process 2 scanning SYS.WRH$_SERVICE_WAIT_CLASS[AAAlu4AADAAAIKRAAA]
. process 3 scanning SYS.WRH$_SGASTAT[AAAltDAADAAAIRBAAA]
. process 4 scanning SYS.WRH$_ACTIVE_SESSION_HISTORY[AAAlq2AADAAAMxxAAA]

Creating Database Scan Summary Report...

Creating Individual Exception Report...

Scanner terminated successfully.

執行完畢後,會在當前目錄下生成以引數log指定命名的三個檔案,分別是:cssan_check.err  cssan_check.out  cssan_check.txt

二、Csscan結果分析

1、.out檔案
cssan_check.out - Scan log -- provides a complete list of all scanned items     cssan的執行日誌,只是記錄的cssan的執行過程

2、.txt檔案
cssan_check.txt - Database Scan Summary -- Summary of scan results     掃描結果概要
下面是一個TOCHAR=AL32UTF8的.txt檔案樣例:

HQ-CICUTESTDB-01:cicutedb>more cssan_check.txt 
Database Scan Summary Report
Time Started  : 2014-06-17 04:03:30
Time Completed: 2014-06-17 04:06:53

Process ID         Time Started       Time Completed
---------- -------------------- --------------------
         1  2014-06-17 04:03:32  2014-06-17 04:06:51
         2  2014-06-17 04:03:32  2014-06-17 04:06:51
         3  2014-06-17 04:03:32  2014-06-17 04:06:51
         4  2014-06-17 04:03:32  2014-06-17 04:06:51
---------- -------------------- --------------------

[Database Size]
Tablespace                           Used            Free           Total       Expansion
------------------------- --------------- --------------- --------------- ---------------
SYSTEM                            598.19M         401.81M       1,000.00M           1.00K
UNDOTBS1                          489.00M       3,511.00M       4,000.00M            .00K
SYSAUX                            408.13M         391.88M         800.00M            .00K
。。。
TBLSPACE_CLAIM                    127.63M       1,872.38M       2,000.00M            .00K
TBLSPACE_CSMIG                      9.19M          90.81M         100.00M            .00K
------------------------- --------------- --------------- --------------- ---------------
Total                          23,331.38M      44,568.63M      67,900.00M           1.00K

The size of the largest CLOB is 1625114 bytes

[Database Scan Parameters]
Parameter                      Value                                           
------------------------------ ------------------------------------------------
CSSCAN Version                 v2.1                                            
Instance Name                  cicutedb                                        
Database Version               10.2.0.4.0                                      
Scan type                      Full database                                   
Scan CHAR data?                YES                                             
Database character set         AL32UTF8                                        
FROMCHAR                       AL32UTF8                                        
TOCHAR                         UTF8                                            
Scan NCHAR data?               NO                                              
Array fetch buffer size        1024000                                         
Number of processes            4                                               
Capture convertible data?      NO                                              
------------------------------ ------------------------------------------------

[Scan Summary]
Some character type data in the data dictionary are not convertible to the new character set
Some character type application data are not convertible to the new character set

[Data Dictionary Conversion Summary]
Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                     4,864,310                0                0               58
CHAR                             1,104                0                0                0
LONG                           232,188                0                0                0
CLOB                            39,382                0                0                0
VARRAY                          22,432                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                        5,159,416                0                0               58
Total in percentage             99.999%           0.000%           0.000%           0.001%

The data dictionary can not be safely migrated using the CSALTER script

[Application Data Conversion Summary]
Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                 1,714,396,614                0                0           73,068
CHAR                           154,597                0                0                0
LONG                                 0                0                0                0
CLOB                            41,562                0                0                0
VARRAY                           1,575                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                    1,714,594,348                0                0           73,068
Total in percentage             99.996%           0.000%           0.000%           0.004%

[Distribution of Convertible, Truncated and Lossy Data by Table]
USER.TABLE                                              Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
SYS.SOURCE$                                                       0                0               5
SYS.WRH$_SQLSTAT                                                  0                0               53
UKPRODRES.GCADJUSTMENTCHARGE                                      0                0               10
UKPRODRES.GCADJUSTMENTFEE                                         0                0               34
UKPRODRES.GCADJUSTMENTITEM                                        0                0               32
。。。。。。

[Distribution of Convertible, Truncated and Lossy Data by Column]
USER.TABLE|COLUMN                                       Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
SYS.SOURCE$|SOURCE                                                0                0               5
SYS.WRH$_SQLSTAT|ACTION                                           0                0               53
UKPRODRES.GCADJUSTMENTCHARGE|PAYEE                                0                0               10
UKPRODRES.GCADJUSTMENTFEE|PAYEE                                   0                0               34
。。。。。。

[Indexes to be Rebuilt]
USER.INDEX on USER.TABLE(COLUMN)                                                         
-----------------------------------------------------------------------------------------
UKPRODRES.IDX_GCREGISTPOLICY_INSUREDNAME on UKPRODRES.GCREGISTPOLICY(INSUREDNAME)
UKPRODRES.IND_GSCLIMAIN_CLIENTCNAME on UKPRODRES.GSCLIENTMAIN(CLIENTCNAME)
。。。。。。
UKPRODRES.IND_GUPROPOSALMAIN_INSUREDNAME on UKPRODRES.GUPROPOSALMAIN(INSUREDNAME)
-----------------------------------------------------------------------------------------


The .txt file shows:

  •   Time Started / Time Completed: duration of the Csscan run. Csscan will do a fetch of all character data, so running time is in most cases at least the time to do a full export.
  •   [Database Size]:  the size of the data within the database, the Expansion column (if applicable) gives an estimation on how much more place you need in the current tablespace when going to the new characterset. The Tablespace Expansion for tablespace X is calculated as the grand total of the differences between the byte length of a string converted to the target character set and the original byte length  of this string over all strings scanned in tables in X. The distribution of values in blocks, PCTFREE, free extents, etc., are not taken into account.
  •   [Database Scan Parameters]: the parameters used to run Csscan
  •   [Scan Summary]: which gives you directly an idea if you can do a full exp/imp, use Csalter or "Alter Database Character Set" as described in point D)
  •   [Data Dictionary Conversion Summary]: gives an overview of the amount of Changeless, Convertible, Truncation or Lossy data there is in the Data Dictionary.
  •   [Application Data Conversion Summary]: gives an overview of the amount of Changeless, Convertible, Truncation or Lossy data there is in User data.
  •   [Distribution of Convertible Data per Table]: gives a breakdown on table basis.
  •   [Distribution of Convertible Data per Column]: gives a breakdown on column basis.
  •   [Indexes to be Rebuilt]: gives which indexes are going to be affected by convertible data. The name of the section is bit misleading. When using full export/import there nothing to do on those indexes. When using Cslater/alter database characterset together with a partial export/import it depends on the amount of 'convertible' data in the underlying columns. If only a few rows in the underlying columns are 'convertible' then there is nothing to do (the indexes do not need to be rebuild as such). But if you have a lot of 'convertible' data in underlying columns it might be a good idea to drop and recreate them after the import, simply for performance reasons. The only exception is an index on a CHAR/NCHAR column that you need to adapt for "truncation". In that case all key values of a CHAR/NCHAR index key have to be padded with blanks to the new length and it may be more efficient to drop and recreate the index.
  •   [Truncation Due To Character Semantics]: (not often seen) This can be seen if you use Char Semantics in the current database.The Truncation Due to Character Semantics section identifies the number of data cells that would be truncated if they were converted to the target character set (for example, by the SQL CONVERT function or another inline conversion process) before the database character set is updated with the Csalter script. If the data conversion occurs after the database character set is changed (= you use export/import for convertible data), then this section can be ignored.
    從上述的.txt樣例中可以分析:資料字典資料和使用者資料都有資料會在轉換後丟失,資料字典資料會丟失58行,佔比0.001%使用者資料會丟失73068行佔比0.004%

3、.err檔案
cssan_check.err - Individual Exceptions -- Excerpted view of scan exceptions.     

 CAPTURE=Y or CAPTURE=N引數對.err檔案有很大的影響:
CAPTURE=N:.err檔案只會記錄會lossy或truncation的行。
CAPTURE=Y:.err檔案會記錄會lossy或truncation和可轉換的行。同時這會增加csmig 下的表空間使用,特別是可轉換的行的量特別大時。SUPPRESS引數可以限制.err檔案的大小,SUPPRESS=1000最多隻會記錄一個表的的1000行。

資料丟失或被截斷的詳細記錄,包括涉及的表及列和rowid,樣例如下:
HQ-CICUTESTDB-01:cicutedb>more cssan_check.err 
Database Scan Individual Exception Report

[Database Scan Parameters]

Parameter                      Value                                           
------------------------------ ------------------------------------------------
CSSCAN Version                 v2.1                                            
Instance Name                  cicutedb                                        
Database Version               10.2.0.4.0                                      
Scan type                      Full database                                   
Scan CHAR data?                YES                                             
Database character set         AL32UTF8                                        
FROMCHAR                       AL32UTF8                                        
TOCHAR                         UTF8                                            
Scan NCHAR data?               NO                                              
Array fetch buffer size        1024000                                         
Number of processes            4                                               
Capture convertible data?      NO                                              
------------------------------ ------------------------------------------------


[Data Dictionary individual exceptions]
User  : SYS
Table : SOURCE$
Column: SOURCE
Type  : VARCHAR2(4000)
Number of Exceptions         : 5         
Max Post Conversion Data Size: 4000      


ROWID              Exception Type      Size Cell Data(first 30 bytes)     
------------------ ------------------ ----- ------------------------------
AAAABIAABAAAR2VAB+ lossy conversion         raise_application_error(-20002
AAAABIAABAAAR2VAB7 lossy conversion         raise_application_error(-20001
AAAABIAABAAAR2VACO lossy conversion         raise_application_error(-20003
AAAABIAABAAAR2VACb lossy conversion         raise_application_error(-20003
AAAABIAABAAAR2VACv lossy conversion         raise_application_error(-20004
------------------ ------------------ ----- ------------------------------


User  : SYS
Table : WRH$_SQLSTAT
Column: ACTION
Type  : VARCHAR2(64)
Number of Exceptions         : 53        
Max Post Conversion Data Size: 39        


ROWID              Exception Type      Size Cell Data(first 30 bytes)     
------------------ ------------------ ----- ------------------------------
AAAlrOAADAAAIU+AAF lossy conversion         SQL 視窗 - 新建               
AAAlrOAADAAAJXNAAS lossy conversion         SQL 視窗 - 新建               
。。。。。。                  
AAAlvJAADAAAJ0wAAN lossy conversion         SQL 視窗 - 新建               
------------------ ------------------ ----- ------------------------------


[Application data individual exceptions]
User  : UKPRODRES
Table : GPREMINDERHISTORY
Column: RECEIVERNAME
Type  : VARCHAR2(1000)
Number of Exceptions         : 1         
Max Post Conversion Data Size: 71        
。。。。。。

In the .err this is listed:

  •  [Database Scan Parameters]: the used Csscan parameters.
  •  [Data Dictionary individual exceptions]: exceptions (LOSSY,TRUNCATION and CONVERTIBLE) for Data Dictionary objects.
  •  [Application data individual exceptions]: exceptions (LOSSY,TRUNCATION and CONVERTIBLE if CAPTURE=Y) for User objects.

For each affected column it lists:

  •  User : user name 
  •  Table : table name
  •  Column: column name
  •  Type : data type of the column and the defined column length
  •  Number of Exceptions : the number of rows in this column that are lossy, convertible or truncation.
  •  Max Post Conversion Data Size: the maximum size of the data in this column after conversion to the new characterset.

一般情況下,根據[Scan Summary]的結果可以採用下面的轉換方法:

D.1) (any Oracle version) To use a (full) exp/imp into a database with the new characterset.

To use a (full) export/import into a database with the new characterset all data needs to be "changeless and convertible".

In order to use a full export/import without any data loss you need to see in the charcheck.txt file under [Scan Summary] this message::

All character type data in the data dictionary are convertible to the new character set
All character type application data are convertible to the new character set

You cannot use Csalter (10g and up) or "Alter Database Character Set" (8i/9i) if you see this as [Scan Summary].

D.2) (8i/9i only) To use "Alter Database Character Set".

To use "Alter Database Character Set" the Csscan output needs to be changeless for all CHAR VARCHAR2, CLOB and LONG data (Data Dictionary and Application Data).
In order to use "Alter Database Character Set" you need to see in the charcheck.txt file under [Scan Summary] this message::

All character type data in the data dictionary remain the same in the new character set
All character type application data remain the same in the new character set

A 'clean' Csscan run must have been completed prior to running "Alter Database Character Set". A 'clean' scan means that there is no convertible, truncation or lossy data in the database.

D.3) (10g and 11g) To use Csalter.

To use Csalter the Csscan output needs to be

* changeless for all CHAR VARCHAR2, and LONG data (Data Dictionary and Application Data)
* changeless for all Application Data CLOB
* changeless and/or convertible for all Data Dictionary CLOB

And in order to run Csalter you need to see in the charcheck.txt file under [Scan Summary] this message:

All character type application data remain the same in the new character set

and under [Data Dictionary Conversion Summary] this message:

The data dictionary can be safely migrated using the CSALTER script


參考oracle support文件:Csscan Output Explained (文件 ID 444701.1)


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25465866/viewspace-1184874/,如需轉載,請註明出處,否則將追究法律責任。

相關文章