Export/Import and NLS Considerations (此文章不錯,推薦)

myhuaer發表於2005-06-01

Export/Import and NLS Considerations

BookmarkFixed font Go to End

: Note:15095.1
Subject: Export/Import and NLS Considerations
Type: FAQ
Status: PUBLISHED
Content Type: TEXT/PLAIN
Creation Date: 15-JUL-1998
Last Revision Date: 21-JUL-2004
Introduction 

------------

This note provides an in-depth overview of exp/imp and NLS, you may not need to

know all this if you just want to take a export and make sure you don't lose any

data. Please read this first:

[NOTE:227332.1] NLS considerations in Import/Export - Frequently Asked Questions

Export/Import and NLS considerations

------------------------------------

If you have exported/imported a database or table(s) and are now encountering

character set conversion problems, use the following information to confirm

whether the export/import procedure was performed correctly.

+ In Oracle9 and higher, most data is automaticaly exported in the character set

of the database that is exported. The only exception to that, is that

data-dictionary information is exported in the character set that is set in

the NLS_LANG environment variable when making the export.

In pre-Oracle9 export, ALL data is exported in the character set that is set

in the NLS_LANG environment variable when making the export.

This means that if the character set part of the NLS_LANG environment variable

during export is different than the database character set, there will be a

automatic conversion between those (for 9i and higher only for data-dictionary

data since "normal" data is exported in the database character set, regardless

of the NLS_LANG, as described above).

+ The export file contains the character set that the data was exported in, in

the first few bytes of the export file. Export stores the character set ID

(not the text string) of the "exporting character set". See [NOTE:48644.1].

This is relevant in pre-Oracle9 export files because all data will be exported

in the same character set and this way we can "see" what that character set

is, and we can even "update" these fields to work around certain problems

(only when advised by Oracle Support!).

For Oracle9 and higher export files this is less relevant because the

character set is stored in many more places in the export file, and can be

different depending on the data. So for Oracle9 and higher we can never

"update" the export file manualy.

+ An import session runs in the character set that is set in the NLS_LANG

environment variable for import session.

+ Import will check the character set of the data in the dump file and compare

it with the session's character set as defined in NLS_LANG.

+ No conversion occurs if the export data character set and the import session

character set are the same. If they are not the same, conversion is performed

from the export data character set to the import session character set prior

to the data being inserted into the database.

However, import can only perform this conversion for single-byte character

sets !

+ For imports into multibyte character set databases (i.e. UTF8) the character

set part of NLS_LANG should be set identical to that of the export session.

Otherwise an

IMP-16 "Required character set conversion (type %lu to %lu) not supported"

error will come up during import. See also [BUG:896407].

+ The import session character set should be a the same, or a superset of the

export data character set, otherwise special characters will not be correctly

converted.

+ If you import pre-oracle7 export files include the parameter 'CHARSET' when

defining the import parameter set. CHARSET identifies the character set of the

export file. The CHARSET option was developed to import older export files

which did not have stored character set ID information.

If you define CHARSET but the export file does contain character set

information (export version 7 and higher) then the value in CHARSET should

match the export file character set. If they do not match, IMP-42 will

result (so it's better not to use this parameter at all, unless you import

pre-oracle7 data).

+ After the data has been converted to the import session character set, it is

then converted to the database character set if they differ. The database

character set should be a superset (or the same) of the import's session

character set otherwise special characters will not be correctly converted.

So this means that there is a potential for 3 seperate conversions when moving

data from between databases, as described in the following graph:

------------------- for all data pre-oracle9 and data-

| db in character | export dictionary data in oracle9 and up:

| set A |---------------------&gt exp session is in character

------------------- set B as defined by NLS_LANG.

source Therefore the dump file is in

character set B. Character set

conversion may occur.

|

|

| move file over to

| another machine

|

V

destination

-------------------

| db in character | import imp session is in character

| set D |

------------------- The dump file is still in

character set B. Character set

conversion may occur.

During the import process

character set conversion

may occur between character

set C and the db's character

set D if they differ.

If you have any problems related to character set conversion in export/import

then you need to identify the following:

-- What is (was) the database character set in the source database (character

set A in the above)?

-- What was the client character set specified in NLS_LANG when the data was

exported (character set B in the above)?

-- What was the client character set specified in NLS_LANG when the data was

imported (character set C in the above) ?

-- What is the database character set of the destination database (character

set D in the above)?

Minimizing character set conversions during export/import

---------------------------------------------------------

As described, it is important to note import will do up to 3 character set

conversions depending on:

(a) character set of exported database

(b) NLS_LANG of export session (-> this is the "export file character set")

(c) NLS_LANG of import session

(d) character set of taget database.

Obviously there really is only the need for 1 conversion only (from original

database character set to target database character set). Minimizing the number

of conversions means that you minimize the potential of "loosing" data.

To minimize the number of conversions you are advised to follow these rules:

+ Set the NLS_LANG during export to the same as the character set of the

exported database -> this means no conversion takes place, all data is still

stored in the export file as it was stored in the database.

+ Set the NLS_LANG during import to the same value as during the export -> this

means that no conversion takes place in the import session.

+ If the character set of the target database is different, then the data will

be automatically converted when import inserts the data into the database, you

do not have to "set" anything for this, it's automatic.

Of course you do not have to follow these rules, but complying with these rules

does minimize the risk of loosing data during export/import.

Regarding the National Characterset:

------------------------------------

NCLOBs are always exported/imported in UCS-2/AL16UTF16.

NCHAR/NVARCHAR2s are always exported in the database's national character set.

This is something you can't influence by setting any parameters.

Further reading

---------------

usefull notes:

[NOTE:158577.1] NLS_LANG Explained (How does Client-Server Character Conversion Work?)

[NOTE:66320.1] Changing the Database Character Set or the Database National Character Set

For further NLS / Globalization information you may start here:

[NOTE:150091.1] Globalization Technology (NLS) Library index

[@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/34596/viewspace-799586/,如需轉載,請註明出處,否則將追究法律責任。

相關文章