oracle資料庫字符集characterset遷移及變更系列一
背景
oracle資料庫字符集和應用資料密切相關,資料庫遷移也會涉及到資料庫字符集的轉換,大家常常聽到的亂碼之類的,匯出匯入表發生列長度不足的錯誤,也和資料庫字符集不無關係,本文我沿襲前文:http://blog.itpub.net/9240380/viewspace-1849340/ ,
itpub網友問題之AL32UTF8與ZHS16GBK 2種資料庫字符集database characterset,繼續學習資料庫字符集的相關知識。
結論
1,表名長度最大為30個位元組2,convert函式確實是個利器,可以驗證2種字符集是否相容,達到選擇目標資料庫字符集的作用
3,nls_lang是OS層面的環境變數,其最後部分指定客戶端採用的資料庫字符集,控制EXP及IMP和EXPDP及IMPDP的字符集
4,如果NLS_LANG及資料庫字符集不一致,會提示可能產生字符集變更的資訊,這個可能會引發資料損失
5,千萬別太輕信網上的,執行ALERT DATABASE CHARECTERSET INTERNAL_USE變更 資料庫字符集,這樣會引發隱性或未知的風險
6,nls_length_semantics預設值為BYTE,值也可為CHAR,ORACLE不建議採用後者,引發會引發效能問題以及執行時間錯誤,可能也會產生BUFFER OVERFLOW
測試
SQL> select * from v$version where rownum=1;
BANNER
----------------------------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
SQL> create table t_testlength(a int);
Table created.
SQL> select length(object_name) from user_objects where lower(object_name)='t_testlength';
LENGTH(OBJECT_NAME)
-------------------
12
SQL> alter table t_testlength rename to t_testlength_testlength_testlengt;
alter table t_testlength rename to t_testlength_testlength_testlengt
*
ERROR at line 1:
ORA-00972: identifier is too long
SQL> select length('t_testlength_testlength_testlengt') from dual;
LENGTH('T_TESTLENGTH_TESTLENGTH_TESTLENGT')
-------------------------------------------
33
SQL> select length('t_testlength_testlength_testle') from dual;
LENGTH('T_TESTLENGTH_TESTLENGTH_TESTLE')
----------------------------------------
30
可見表名長度最長為30個位元組的長度
SQL> alter table t_testlength rename to t_testlength_testlength_testle;
Table altered.
SQL> alter table t_testlength_testlength_testle rename to t_testlength_testlength_testle1;
alter table t_testlength_testlength_testle rename to t_testlength_testlength_testle1
*
ERROR at line 1:
ORA-00972: identifier is too long
--當前資料庫字符集為AL32UTF8
SQL> select parameter,value from nls_database_parameters where parameter='NLS_CHARACTERSET';
PARAMETER VALUE
------------------------------ ------------------------------
NLS_CHARACTERSET AL32UTF8
--長度刻度為位元組,還有個值也可以為CHAR
SQL> show parameter nls_length_semantics
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
nls_length_semantics string BYTE
--檢視與中文相關的資料庫字符集
SQL> select distinct parameter from v$nls_valid_values;
PARAMETER
------------------------------
CHARACTERSET
SORT
TERRITORY
LANGUAGE
SQL> select parameter,value from v$nls_valid_values where parameter='CHARACTERSET' and VALUE like '%ZH%';
PARAMETER VALUE
------------------------------ ------------------------------
CHARACTERSET ZHS16CGB231280
CHARACTERSET ZHS16MACCGB231280
CHARACTERSET ZHS16GBK
CHARACTERSET ZHS16DBCS
CHARACTERSET ZHS32GB18030
CHARACTERSET ZHT32EUC
CHARACTERSET ZHT32SOPS
CHARACTERSET ZHT16DBT
CHARACTERSET ZHT32TRIS
CHARACTERSET ZHT16DBCS
CHARACTERSET ZHT16BIG5
PARAMETER VALUE
------------------------------ ------------------------------
CHARACTERSET ZHT16CCDC
CHARACTERSET ZHT16MSWIN950
CHARACTERSET ZHT16HKSCS
CHARACTERSET ZHT16HKSCS31
CHARACTERSET ZHS16CGB231280FIXED
CHARACTERSET ZHS16GBKFIXED
CHARACTERSET ZHS16DBCSFIXED
CHARACTERSET ZHT32EUCFIXED
CHARACTERSET ZHT32TRISFIXED
CHARACTERSET ZHT16DBCSFIXED
CHARACTERSET ZHT16BIG5FIXED
22 rows selected.
好像英文字元在2種字符集佔用空間是一樣的
SQL> select 'abc',dump('abc'),dump(convert('abc','ZHS16GBK','AL32UTF8')) from dual;
'ABC' DUMP('ABC') DUMP(CONVERT('ABC','ZHS16GBK','AL32UTF8'))
------ -------------------------------------------- ------------------------------------------
abc Typ=96 Len=3: 97,98,99 Typ=1 Len=3: 97,98,99
SQL> insert into t_charset values('我們');
1 row created.
SQL> commit;
Commit complete.
SQL> select a,dump(a) from t_charset;
A DUMP(A)
------------------------------ --------------------------------------------------
我們 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145
可見AL32UTF8不能正常轉化ZHS16GBK,中文會出現問題,且前者佔用位元組長度為9個,而後者變成6,造成資料損失
SQL> select a,dump(a),convert(a,'ZHS16GBK','AL32UTF8') mig_char,dump(convert(a,'ZHS16GBK','AL32UTF8')) after_dump from t_charset;
A DUMP(A) MIG_CHAR AFTER_DUMP
-------------------- -------------------------------------------------- ------------------------------ --------------------------------------------------
我們 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145 Typ=1 Len=6: 230,136,145,228,187,172
看到沒,如果從AL32UTF8轉化為UTF8,不會造成資料損失,中文可以正常顯示,可見CONVERT函式是個利器,對於字符集轉化
SQL> select a,dump(a),convert(a,'UTF8','AL32UTF8') mig_char,dump(convert(a,'UTF8','AL32UTF8')) after_dump from t_charset;
A DUMP(A) MIG_CHAR AFTER_DUMP
-------------------- -------------------------------------------------- ------------------------------ --------------------------------------------------
我們 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145 我們 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145
SQL> desc database_properties;
Name Null? Type
----------------- -------- ------------
PROPERTY_NAME NOT NULL VARCHAR2(30)
PROPERTY_VALUE VARCHAR2(400
0)
DESCRIPTION VARCHAR2(400
0)
SQL> select count(*) from database_properties;
COUNT(*)
----------
36
SQL> select distinct property_name from database_properties;
PROPERTY_NAME
------------------------------
Flashback Timestamp TimeZone
DST_UPGRADE_STATE
NLS_CURRENCY
NLS_ISO_CURRENCY
NLS_DUAL_CURRENCY
DBTIMEZONE
DEFAULT_EDITION
NLS_CALENDAR
NLS_TIMESTAMP_FORMAT
EXPORT_VIEWS_VERSION
NO_USERID_VERIFIER_SALT
PROPERTY_NAME
------------------------------
DST_SECONDARY_TT_VERSION
NLS_TERRITORY
NLS_DATE_LANGUAGE
NLS_LENGTH_SEMANTICS
WORKLOAD_REPLAY_MODE
DEFAULT_PERMANENT_TABLESPACE
DEFAULT_TBS_TYPE
NLS_LANGUAGE
NLS_TIMESTAMP_TZ_FORMAT
NLS_NCHAR_CHARACTERSET
NLS_RDBMS_VERSION
PROPERTY_NAME
------------------------------
NLS_NUMERIC_CHARACTERS
NLS_TIME_FORMAT
NLS_NCHAR_CONV_EXCP
GLOBAL_DB_NAME
DEFAULT_TEMP_TABLESPACE
NLS_DATE_FORMAT
NLS_COMP
DICT.BASE
TDE_MASTER_KEY_ID
DST_PRIMARY_TT_VERSION
NLS_CHARACTERSET
PROPERTY_NAME
------------------------------
NLS_SORT
WORKLOAD_CAPTURE_MODE
NLS_TIME_TZ_FORMAT
36 rows selected.
SQL> select property_name,property_value from database_properties where property_name='NLS_LENGTH_SEMANTICS';
PROPERTY_NAME PROPERTY_VALUE
------------------------------ --------------------------------------------------
NLS_LENGTH_SEMANTICS BYTE
--查官方手冊,梳理上述引數nls_length_semantics的含義
Oracle? Database Globalization Support Guide 之
3 Setting Up a Globalization Support Environment
1,nls_length_semantics的可用值為:byte,char,預設值為byte
2,前值適用於單位元組資料庫字符集編碼規則,後者適用於多位元組資料庫字符集編碼規則
3,char,varchar2,long資料型別可以選取byte,也可以選擇char
4,nchar,nvarchar2,clob,nclob,只能選用char
5,如果變更nls_length_semantics不會影響已存在的資料
6,nls_length_semantics可以在資料庫級,例項級,會話級進行調整
7,sys使用者下的資料只會採用byte
8,如果在客戶端顯示指定與伺服器端不同的值,錄入資料以客戶端為準
9,為了資料相容,ORACLE不建議配置值為CHAR,因為會導致產生很多問題:執行時錯誤,BUFFER OVERFLOW
---測試下zhs16gbk儲存固定寬度的中文字元,然後轉化為AL32UTF8
SQL> conn /as sysdba
Connected.
SQL> alter system enable restricted session;
System altered.
SQL> ALTER DATABASE character set INTERNAL_USE ZHS16GBK;
Database altered
SQL> alter system disable restricted session;
System altered.
SQL> conn scott/system
Connected.
SQL> create table t_charset(a char(10));
Table created.
SQL> insert into t_charset values('夥伴');
1 row created.
SQL> commit;
Commit complete.
--可見佔用10個位元組
SQL> select a,dump(a) from t_charset;
A DUMP(A)
-------------------- --------------------------------------------------
夥伴 Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32
SQL> conn /as sysdba
Connected.
SQL> alter system enable restricted session;
System altered.
SQL> alter database character set INTERNAL_USE AL32UTF8;
Database altered.
SQL> alter system disable restricted session;
System altered.
--看到沒,轉化AL32UTF8,中文顯示不出來
SQL> conn scott/system
Connected.
SQL> select a,dump(a) from t_charset;
A DUMP(A)
-------------------- --------------------------------------------------
Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32
SQL> insert into t_charset values('夥伴');
1 row created.
SQL> commit;
Commit complete.
--可見基於同樣的中文,ZHS16GBK與AL32UTF8,編碼規則發生了變化,雖然佔用的位元組數相同,當然顯示不出來了
SQL> select a,dump(a) from t_charset;
A DUMP(A)
-------------------- ----------------------------------------------------------------------------------------------------
Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32
夥伴 Typ=96 Len=10: 230,181,188,230,172,142,229,141,179,32
----在al32utf8匯出測試表
可見匯出失敗了(這裡失敗是因為表中同時儲存2種不同字符集的表資料)
[oracle@seconary ~]$ exp file=exp_t_charset.dmp userid=scott tables=t_charset
Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:27:59 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)
About to export specified tables via Conventional Path ...
. . exporting table T_CHARSET
EXP-00008: ORACLE error 6552 encountered
ORA-06552: PL/SQL: Compilation unit analysis terminated
ORA-06553: PLS-553: character set name is not recognized
Export terminated successfully with warnings.
[oracle@seconary ~]$
---刪除測試表,然後調整資料庫字元,再匯入測試表
SQL> drop table t_charset purge;
Table dropped.
SQL> conn /as sysdba
Connected.
SQL> alter system enable restricted session;
System altered.
SQL> alter database character set INTERNAL_USE ZHS16GBK;
Database altered.
SQL> alter system disable restricted session;
System altered.
[oracle@seconary ~]$ imp file=exp_t_charset.dmp userid=scott tables=t_charset
Import: Release 11.2.0.1.0 - Production on Sun Nov 29 16:31:43 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
. importing SCOTT's objects into SCOTT
Import terminated successfully without warnings.
---我們重新測試上述的例子,清空測試表
---al32utf8
SQL> conn scott/system
Connected.
SQL> create table t_charset(a char(10));
Table created.
SQL> insert into t_charset values('夥伴');
1 row created.
SQL> commit;
Commit complete.
SQL> select a,dump(a) from t_charset;
A DUMP(A)
-------------------- ----------------------------------------------------------------------
夥伴 Typ=96 Len=10: 230,181,188,230,172,142,229,141,179,32
可見現在表中1種資料庫字符集,EXP還是報錯,說明用alter database character set internal_use這種方式改資料庫字符集,其實具備很大的風險,很可能會把資料庫搞壞
[oracle@seconary ~]$ exp file=exp_t_charset_pure.dmp userid=scott tables=t_charset
Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:38:37 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set --可見匯出是以ZHS16GBK,這個提示與OS的環境變數NLS_LANG有關
server uses AL32UTF8 character set (possible charset conversion)
About to export specified tables via Conventional Path ...
. . exporting table T_CHARSET
EXP-00008: ORACLE error 6552 encountered
ORA-06552: PL/SQL: Compilation unit analysis terminated
ORA-06553: PLS-553: character set name is not recognized
Export terminated successfully with warnings.
[oracle@seconary ~]$
---檢視NLS_LANG
[oracle@seconary ~]$ env|grep LANG
NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
LANG=en_US.utf8
--調整NLS_LANG
[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
[oracle@seconary ~]$ env|grep LANG
NLS_LANG=AMERICAN_AMERICA.AL32UTF8
---調整NLS_LANG中的資料庫字符集與資料庫中的字符集相同,匯出還是報錯
[oracle@seconary ~]$ exp file=exp_t_charset_pure.dmp userid=scott tables=t_charset
Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:42:41 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in AL32UTF8 character set and AL16UTF16 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table T_CHARSET
EXP-00008: ORACLE error 6552 encountered
ORA-06552: PL/SQL: Compilation unit analysis terminated
ORA-06553: PLS-553: character set name is not recognized
Export terminated successfully with warnings.
藉此學習下nls_lang中的字符集是否配置不同,會影響EXP匯出的字符集匯出的結果
可見確實會影響EXP匯出的效果,且會影響SQLPLUS的執行
[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.ZHT16DBCSFIXED
[oracle@seconary ~]$ exp file=exp_t_charset_pure_other.dmp userid=scott tables=t_charset
BoBoBoBoBoBo: Release 11.2.0.1.0 - Production on
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
EXP-00092: BoBoBoBoBoBo@@BoBo@@BoBoBo@@BoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBo@@BoBo@@BoBoBoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBo
EXP-00000: BoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBoBoBoBoBoBoBo[oracle@seconary ~]$
[oracle@seconary ~]$
[oracle@seconary ~]$
[oracle@seconary ~]$
[oracle@seconary ~]$ sqlplus '/as sysdba'
Error 19 initializing SQL*Plus
Invalid NLS character set for this OS environment
---可見NLS_LANG確實會影響EXP資料庫字符集的選擇,所以一定要理解此引數的含義,否則EXP IMP以及EXPDP和IMPDP會產生資料損失以及產生亂碼
[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.UTF8
[oracle@seconary ~]$ exp file=exp_t_charset_pure_other.dmp userid=scott tables=t_charset
Export: Release 11.2.0.1.0 - Production on Sun Nov 29 17:24:25 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in UTF8 character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)
About to export specified tables via Conventional Path ...
. . exporting table T_CHARSET 1 rows exported
Export terminated successfully without warnings.
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/9240380/viewspace-1849953/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- oracle資料庫字符集characterset遷移及變更系列二Oracle資料庫
- oracle資料庫字符集characterset遷移及變更之csscan及csalter.plb系列三Oracle資料庫CSS
- oracle資料庫字符集characterset遷移及變更之csscan及csalter.plb系列四Oracle資料庫CSS
- 遷移資料時oracle字符集的轉換遷移資料時oracle字符集的轉換Oracle
- ORACLE資料庫遷移Oracle資料庫
- Oracle資料庫遷移之一:RMANOracle資料庫
- oracle10g_csscan_更變資料庫字符集及國家字符集的工具測試OracleCSS資料庫
- 資料庫-oracle-資料庫遷移資料庫Oracle
- Oracle資料庫資料遷移流程Oracle資料庫
- oracle RAC 更換儲存遷移資料Oracle
- oracle 遷移資料庫到asmOracle資料庫ASM
- 資料庫的升級及遷移資料庫
- Oracle資料庫(資料泵)遷移方案(上)Oracle資料庫
- Oracle資料庫(資料泵)遷移方案(下)Oracle資料庫
- oracle xtts資料庫遷移方法測試之一OracleTTS資料庫
- 無外網Oracle資料庫遷移Oracle資料庫
- Oracle資料庫(DataGuard)遷移方案(上)Oracle資料庫
- Oracle資料庫(DataGuard)遷移方案(中)Oracle資料庫
- Oracle資料庫(DataGuard)遷移方案(下)Oracle資料庫
- 【DATAPUMP】使用DataPump遷移Oracle資料庫Oracle資料庫
- Oracle資料庫遷移方案比較Oracle資料庫
- 【遷移】使用rman遷移資料庫資料庫
- 資料庫遷移資料庫
- ORACLE 資料遷移Oracle
- 【資料遷移】RMAN遷移資料庫到ASM(一)建立ASM磁碟組資料庫ASM
- 使用RMAN進行Oracle資料庫遷移Oracle資料庫
- ORACLE資料庫切換和遷移方案Oracle資料庫
- Oracle遷移資料庫過程記錄Oracle資料庫
- 跨平臺遷移oracle資料庫指南Oracle資料庫
- OGG資料庫遷移方案(一)資料庫
- 資料庫遷移 :理解資料庫
- laravel資料庫遷移Laravel資料庫
- Odoo遷移資料庫Odoo資料庫
- redis資料庫遷移Redis資料庫
- Cacti 遷移資料庫資料庫
- 資料庫遷移方案資料庫
- 遷移資料庫成功!資料庫
- 建立資料庫遷移資料庫