Oracle學習系列—關於字元數字混合排序和中文排序的問題

bq_wang發表於2008-02-13

關於漢字排序的問題源於曾經做過的114號碼百事通查詢,當時要求提供按筆畫、部首、拼音的多種查詢方式,拼音排序直接按名稱欄位進行排序時總有些是是而非的錯誤,百思不得其解;同時筆畫、部首排序也摸不著頭腦。。。

數字和英文混合排序

1,3,A,B,101,301,AB,BA

要求按照數字,字元分別進行排序,同時在統一類別資料中,首先按數字ASC,再次按長度排序,要求結果如下

1,101,3,301,A,AB,B,BA 或者

1,3,101,301,A,B,AB,BA

但是很奇怪,第一個排序似乎在合併兩個檢視之後,排序規則有點亂了。。。


CREATE TABLE TestNumberChar

(

SortName VARCHAR2(20)

)

INSERT INTO TestNumberChar VALUES('1');

INSERT INTO TestNumberChar VALUES('3');

INSERT INTO TestNumberChar VALUES('A');

INSERT INTO TestNumberChar VALUES('B');

INSERT INTO TestNumberChar VALUES('101');

INSERT INTO TestNumberChar VALUES('301');

INSERT INTO TestNumberChar VALUES('AB');

INSERT INTO TestNumberChar VALUES('BA');

COMMIT;

CREATE VIEW vSortNumber as

SELECT * FROM TestNumberChar

WHERE SUBSTR(SortName,1,1) IN ('0','1','2','3','4','5','6','7','8','9')

ORDER BY LENGTH(SortName),SortName

CREATE VIEW vSortChar as

SELECT * FROM TestNumberChar

WHERE SUBSTR(SortName,1,1) NOT IN ('0','1','2','3','4','5','6','7','8','9')

ORDER BY LENGTH(SortName),SortName

SELECT * FROM vSortNumber

UNION

SELECT * FROM vSortChar

Order

Sort Result

1

1

2

101

3

3

4

301

5

A

6

AB

7

B

8

BA

SELECT * FROM TestNumberChar

ORDER BY TO_NUMBER

(

SUBSTR

(

SortName,1,LENGTH(SortName)-NVL

(

LENGTH

(

REPLACE

(

TRANSLATE(SortName,'0123456789.','00000000000'),'0'

)

),0

)

)

)

SELECT

SortName A,

TRANSLATE(SortName,'0123456789.','00000000000') B ,

REPLACE(TRANSLATE(SortName,'0123456789.','00000000000'),'0') C,

LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.','00000000000'),

'0')) D,

NVL(LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.',

'00000000000'),'0')),0) E,

LENGTH(SortName)-NVL(LENGTH(REPLACE(TRANSLATE(SortName,

'0123456789.','00000000000'),'0')),0) F,

SUBSTR(SortName,1,LENGTH(SortName)-

NVL(LENGTH(REPLACE(TRANSLATE(SortName,'0123456789.',

'00000000000'),'0')),0)) G,

TO_NUMBER(SUBSTR(SortName,1,

LENGTH(SortName)-NVL(LENGTH(REPLACE(

TRANSLATE(SortName,'0123456789.','00000000000'),'0')),0))) H

FROM TestNumberChar

ORDER BY TO_NUMBER(SUBSTR(SortName,1,

LENGTH(SortName)-NVL(LENGTH(REPLACE(TRANSLATE(SortName,

'0123456789.','00000000000'),'0')),0)))

Sort

A

B

C

D

E

F

G

H

1

1

0

0

1

1

1

2

3

0

0

1

3

3

3

101

000

0

3

101

101

4

301

000

0

3

301

301

5

A

A

A

1

1

0

6

B

B

B

1

1

0

7

BA

BA

BA

2

2

0

8

AB

AB

AB

2

2

0

關於漢字字元排序

CREATE TABLE TestHZ

(

SortName VARCHAR2(20)

)

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

INSERT INTO TestHZ VALUES ('');

COMMIT;

The NLSSORT Function

The NLSSORT function enables you to use any linguistic sort for an ORDER BY clause. It replaces a character string with the equivalent sort string used by the linguistic sort mechanism so that sorting the replacement strings produces the desired sorting sequence. For a binary sort, the sort string is the same as the input string.

The kind of linguistic sort used by an ORDER BY clause is determined by the NLS_SORT session parameter, but it can be overridden by explicitly using the NLSSORT function.

SELECT * FROM TestHZ

--按拼音排序
SELECT * FROM TestHZ

ORDER BY SortName

--按筆畫(第一順序)、部首(第二順序)排序
SELECT * FROM TestHZ

ORDER BY NLSSORT(SortName,'NLS_SORT = SCHINESE_STROKE_M')
--
按部首(第一順序)、拼音(第二順序)排序

SELECT * FROM TestHZ

ORDER BY NLSSORT(SortName,'NLS_SORT = SCHINESE_RADICAL_M')
--
按拼音排序

SELECT * FROM TestHZ

ORDER BY NLSSORT(SortName,'NLS_SORT = SCHINESE_PINYIN_M')

粗略的看一下結果,Sort BySort By NLSSORT(XXX,’NLS_SORT = SCHINESE_PINYIN_M’)排序結果差不多,實際上,排序不同主要是針對一些比較生僻的漢字而言的,如“銠”。具體問題出在哪裡,暫時也不清楚。。。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6517/viewspace-145517/,如需轉載,請註明出處,否則將追究法律責任。

相關文章