SQL SERVER和ORACLE的排序問題
SQL SERVER和ORACLE的排序問題
Author:Rainny Zhong
Date: 2007-9-18
一,SQL SERVER的排序問題
SQL SERVER的排序由COLLATION指定.首先我們明白什麼是COLLATION?
COLLATION: The physical storage of character strings in Microsoft® SQL Server™ 2000 is controlled by collations. A collation specifies the bit patterns that represent each character and the rules by which characters are sorted and compared.
也就是說COLLATION指定SQL SERVER字元的編碼和排序,比較規則.
1.安裝SQL SERVER時指定例項的COLLATION:
當我們在安裝SQL SERVER時,會要我們指定SQL SERVER例項的COLLATION.有兩個選項:
l WINDOWS COLLATION
l SQL COLLATION
SQL COLLATION是為了相容以前的版本而保留的.所以一般是選擇WINDOWS COLLATION,其預設值是來自於WINDOWS作系統的LOCAL SETTING:
舉例:如果安裝SQL SERVER的作業系統其語言的英語,則WINDOWS COLLATION的預設值是適用於大部分歐洲及美洲字元的拉丁字元,稱為: LATIN1_GENERAL.
在選定了WINDOWS COLLATION以後,還可以選定排序規則,分別有:
(1) 二進位制排序
(2) 字典排序,其又細分為:
l 大小寫敏感
l 口音敏感
l <日> 假名敏感
l 寬度敏感
2.新建DATABASE時指定DB的COLLATION:
在新建一個DB時,我們可以指定DB級的COLLATION,如果沒有指定,預設是繼承自例項的COLLATION.
3.新建表時指定欄位的COLLATION:
除了在新建DB時可以指定COLLATION,我們還可以在新建TABLE時,指定COLUMN的COLLATION:
4.在SQL語句中指定COLLATION
You can override the server, database, or column collation by specifying a collation in the ORDER BY clause of a SELECT statement. For more information,for example:
Select * from test order by c collate SQL_Latin1_General_CP1_CS_AS
5.排序規則的最佳化級
排序規則的優先從高到低是:
n SQL語句級
n 欄位級
n DATABASE級
n SERVER級
二,ORACLE的排序
1.Linguistic Sorting and String Searching
Oracle provides the following types of sorts:
■ Binary sort
■ Monolingual linguistic sort
■ Multilingual linguistic sort
2.Using Linguistic Sorts
n Monolingual Linguistic Sorts
n Multilingual Linguistic Sorts
Oracle provides multilingual linguistic sorts so that you can sort data in more than one language in one sort. This is useful for regions or languages that have complex sorting rules and for multilingual databases. Oracle Database 10g supports all of the sort orders defined by previous releases.
For Asian language data or multilingual data, Oracle provides a sorting mechanism based on the ISO 14651 standard and the Unicode 4.0 standard. Chinese characters are ordered by the number of strokes, PinYin, or radicals.
In addition, multilingual sorts can handle canonical equivalence and supplementary characters. Canonical equivalence is a basic equivalence between characters or sequences of characters. For example, ç is equivalent to the combination of c and ,.Supplementary characters are user-defined characters or predefined characters in Unicode 4.0 that require two code points within a specific code range. You can define up to 1.1 million code points in one multilingual sort.
For example, Oracle supports a monolingual French sort (FRENCH), but you can specify a multilingual French sort (FRENCH_M). _M represents the ISO 14651 standard for multilingual sorting. The sorting order is based on the GENERIC_M sorting order and can sort diacritical marks from right to left. Oracle Corporation recommends using
a multilingual linguistic sort if the tables contain multilingual data. If the tables contain only French, then a monolingual French sort may have better performance because it uses less memory. It uses less memory because fewer characters are defined in a monolingual French sort than in a multilingual French sort. There is a tradeoff between the scope and the performance of a sort.
3.Monolingual Linguistic Sorts
(1)monolingual linguistic sort name
Pls refer to 369 page: Table A–14 (Cont.) Monolingual Linguistic Sorts
(2).example of Monolingual Linguistic Sorts
SELECT * FROM test3 ORDER BY NLSSORT(name, 'NLS_SORT=german');
4.Multilingual Linguistic Sorts
(1).multilingual linguistic sort name
Pls refer to 370 page: Table A–15 Multilingual LInguistic Sorts
(2).example of Multilingual Linguistic Sorts
SELECT * FROM test3 ORDER BY NLSSORT(name, 'NLS_SORT= FRENCH_M');
(3).Multilingual Sorting Levels
■ Primary Level Sorts
A primary level sort distinguishes between base letters, such as the difference between characters a and b. It is up to individual locales to define whether a is beforeb, b is before a, or if they are equal. The binary representation of the characters is completely irrelevant. If a character is an ignorable character, then it is assigned a primary level order (or weight) of zero, which means it is ignored at the primary level.Characters that are ignorable on other levels are given an order of zero at those levels.
■ Secondary Level Sorts
A secondary level sort distinguishes between base letters (the primary level sort) before distinguishing between diacritics on a given base letter. For example, the character Ä differs from the character A only because it has a diacritic. Thus, Ä and A are the same on the primary level because they have the same base letter (A) but differ on the secondary level.
■ Tertiary Level Sorts
A tertiary level sort distinguishes between base letters (primary level sort), diacritics (secondary level sort), and case (upper case and lower case). It can also include special characters such as +, -, and *.
The following are examples of tertiary level sorts:
■ Characters a and A are equal on the primary and secondary levels but different on the tertiary level because they have different cases.
■ Characters ä and A are equal on the primary level and different on the secondary and tertiary levels.
■ The primary and secondary level orders for the dash character - is 0. That is, it is ignored on the primary and secondary levels. If a dash is compared with another character whose primary level order is nonzero, for example, u, then no result for
the primary level is available because u is not compared with anything. In this case, Oracle finds a difference between - and u only at the tertiary level.
5.Linguistic Sort Parameters
n NLS_SORT
設定級別:Initialization parameter, environment variable, ALTER SESSION, and SQL functions
n NLS_COMP
設定級別:Initialization parameter, environment variable, and ALTER SESSION
事實上,這兩個引數在9I和10G都有,區別在於NLS_COMP,我們先來說說這兩個引數的作用:
Ø NLS_SORT這個引數是用於設定ORDER BY子句是否按字串的二進位制值(大小寫敏感)還是按某種語言的拼音字母表排序,如英語的英文字母表(大小寫不敏感).
Ø NLS_COMP這個引數是用於設定ORACLE中有關字串比較是否大小寫敏感的,它應用於WHERE 條件的=, LIKE 和IN 和ORDER BY 子句
NLS_SORT引數的設值,9I和10G都是差不多的.區別在於NLS_COMP的取值:
(1)在9I中,NLS_COMP只可設定兩個值:
Ø BINARY:按字元的二進位制值比較
Ø ANSI:按字元的美國標準編碼值進行比較
所以,在9I中沒辦法做到在WHERE條件的=,LIKE,IN謂詞比較中大小寫不敏感,只可以做得到在ORDER BY子句中大小寫不敏感
(2)在10G中,NLS_COMP多了一個引數值:
Ø BINARY:按字元的二進位制值比較
Ø ANSI:按字元的美國標準編碼值進行比較
Ø LINGUISTIC:當設定此值後,WHERE條件的=,LIKE,IN的比較取決於NLS_SORT引數值,舉例:當我們設定NLS_COMP=LINGUISTIC,NLS_SORT=BINARY_CI,則表示WHERE條件中字串的比較是按BINARY_CI來進行比較的(大小寫不敏感),所以在10G中,做到了真正的大小寫敏感.
怎麼來設定這兩個引數?
可以透過兩種方法來設定這兩個引數:
在WINDOWS的命令列或LINUX的終端中:
Ø WINDOWS: SET NLS_SORT=, SET NLS_COMP=
Ø LINUX: export NLS_SORT=, export NLS_COMP=
但是,不管是9I還是10G,我們都只可以在會話級(SESSION)更改這兩個引數,也就是說作用域是當前會話
6.Case-Insensitive and Accent-Insensitive Linguistic Sorts
預設地,ORACLE都是大小寫繁感的,要做到大小寫不繁感,有下列方案:
n 使用upper(),lower()函式
舉例:select * from test order by upper(c);
Select * from test where upper(c)=’A’;
缺點:必須建函式索引,否則會導致排序欄位的索引不可用
n 使用NLS_UPPER()和NLS_LOWER()函式
舉例:SELECT word FROM test1
WHERE NLS_UPPER(word, 'NLS_SORT = XGERMAN') = 'GROSSE';
優點:The NLS_UPPER()and NLS_LOWER() change the case of strings based on a specific linguistic sort definition. This enables you to perform case-insensitive searches regardless of the language being used.
缺點:如果不建指定語言的索引,將不會使用欄位的索引
n 使用linguistic sorts
In Oracle Database 10g, Oracle provides case-insensitive and accent-insensitive options for linguistic sorts. Oracle provides the following types of monolingual and multilingual linguistic sorts:
■ Linguistic sorts that use information about base letters, diacritics, punctuation, and case. These are the standard monolingual and multilingual linguistic sorts that are described in "Using Linguistic Sorts" on page 5-2.
■ Linguistic sorts that use information about base letters, diacritics, and punctuation. This type of sort is called case-insensitive.
■ Linguistic sorts that use information about base letters only. This type of sort is called accent-insensitive. (Accent is another word for diacritic.) An accent-insensitive sort is always case-insensitive as well.
7.Examples of Case-Insensitive and Accent-Insensitive Sorts
(1)透過修改NLS_SORT引數實現:
SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
SQL> ALTER SESSION SET NLS_SORT=BINARY_CI;
SELECT * FROM test2 ORDER BY letter;
(2)透過NLSSORT函式在SQL語句級實現:
SELECT * FROM test3 ORDER BY NLSSORT(name, 'NLS_SORT= BINARY_CI');
8.Performing Linguistic Comparisons
透過設定NLS_COMP 和NLS_SORT引數來實現。
(1)Linguistic Comparison Examples
Example 8–1 Binary Comparison Binary Sort
SQL> ALTER SESSION SET NLS_COMP=BINARY;
SQL> ALTER SESSION SET NLS_SORT=BINARY;
SQL> SELECT ename FROM emp1 WHERE ename LIKE 'McC%e';
Example 8–2 Linguistic Comparison Binary Case-Insensitive Sort
SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
SQL> ALTER SESSION SET NLS_SORT=BINARY_CI;
SQL> SELECT ename FROM emp1 WHERE ename LIKE 'McC%e';
Example 8–3 Linguistic Comparison Binary Accent-Insensitive Sort
SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
SQL> ALTER SESSION SET NLS_SORT=BINARY_AI;
SQL> SELECT ename FROM emp1 WHERE ename LIKE 'McC%e';
Example 8–4 Linguistic Comparisons Using XSPANISH
SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
SQL> ALTER SESSION SET NLS_SORT=XSPANISH;
SQL> SELECT ename FROM emp3 WHERE ename LIKE 'C%';
9.Using Linguistic Indexes
建立基於語方的索引。
舉例:
CREATE INDEX nls_index ON my_table (NLSSORT(name, 'NLS_SORT = German'));
(1)Linguistic Indexes for Multiple Languages
■ Build a linguistic index for each language that the application supports.
CREATE INDEX french_index ON employees (NLSSORT(employee_id, 'NLS_
SORT=FRENCH'));
CREATE INDEX german_index ON employees (NLSSORT(employee_id, 'NLS_
SORT=GERMAN'));
■ Build a single linguistic index for all languages.
在表中建立一個欄位,儲存你所需要的語言:LANG_COL
This requires a language column
(LANG_COL in "Example: Setting Up a French Linguistic Index" on page 5-19) to be used as a parameter of the NLSSORT function. The language column contains NLS_LANGUAGE values for the data in the column on which the index is built. The following example builds a single linguistic index for multiple languages. With this index, the rows with the same values for NLS_LANGUAGE are sorted together:
CREATE INDEX i ON t (NLSSORT(col, 'NLS_SORT=' || LANG_COL));
Queries choose an index based on the argument of the NLSSORT function specified in the ORDER BY clause.
■ Build a single linguistic index for all languages using one of the multilingual linguistic sorts such as GENERIC_M or FRENCH_M.
CREATE INDEX i on t (NLSSORT(col,'NLS_SORT=GENERIC_M');
(2)Requirements for Using Linguistic Indexes
n 適當地設定NLS_SORT
n 如果索引欄位沒有設定NOT NULL約束,則在WHERE子句中指定NOT NULL過濾條件
舉例:WHERE NLSSORT(column_name) IS NOT NULL
[@more@]來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/684234/viewspace-1025005/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- ms sql server排序SQLServer排序
- MySql/Oracle和SQL Server的分頁查MySqlOracleServer
- 淺談SQL Server中的快照問題SQLServer
- PostgreSQL、Oracle/MySQL和SQL Server的MVCC實現原理方式OracleMySqlServerMVC
- SQL Server 查詢超時問題排查SQLServer
- Oracle 調優確定存在問題的SQLOracleSQL
- SQL Server AlwaysOn的Oracle等價技術SQLServerOracle
- SQL Server 中將字串按數字排序SQLServer字串排序
- SQL Server database mail問題診斷一例SQLServerDatabaseAI
- SQL Server資料庫恢復常見問題SQLServer資料庫
- SQL SERVER資料庫datediff函式引發的效能問題SQLServer資料庫函式
- 與堆和堆排序相關的問題排序
- 兩款工具解決SQL Server遷移問題DJSQLServer
- SQL Server技術問題之遊標優缺點SQLServer
- SQL Server 資料庫開發中的十大問題VYSQLServer資料庫
- Oracle\MS SQL Server Update多表關聯更新OracleSQLServer
- SQL Server中datetimeset轉換datetime型別問題淺析SQLServer型別
- [提問交流]OneThink支援 sql server?SQLServer
- AWS CEO炮轟Oracle和SQL Server,背後是給Aurora鋪路!OracleSQLServer
- Oracle DBLink跨資料庫訪問SQL server資料同步 踩坑實錄Oracle資料庫SQLServer
- SQL Server常見問題介紹及快速解決建議SQLServer
- SQL Server中的日期和時間:DATEADD()SQLServer
- SQL Server 的xp_cmdshell和bcp使用SQLServer
- SQL語句中的AND和OR執行順序問題SQL
- SQL Server 資料訪問策略:CLRMESQLServer
- SQL Server 2016 的各版本和支援的功能SQLServer
- SQL Server資料庫中的資料型別隱式轉換問題SQLServer資料庫資料型別
- Sql Server資料庫類似正規表示式的字元處理問題SQLServer資料庫字元
- SQL Server中count(*)和Count(1)的區別SQLServer
- SQL中rownum和order by的執行順序的問題SQL
- 恆訊科技分析:如何解決SQL Server CPU使用率過高的問題?SQLServer
- SQL Server資料庫檢視login所授予的具體許可權問題SQLServer資料庫
- SQL Server 資料訪問策略:即席SQLCUSQLServer
- Oracle OCP(02):條件和排序Oracle排序
- sql serverSQLServer
- 單連結串列的排序問題排序
- 這就是選擇排序的問題排序
- 【SQL Server】本地備份和還原SQLServer
- Oracle優化案例-從執行計劃定位SQL問題(三)Oracle優化SQL