PostgreSQL 10.0 preview feature enhancement: internationalization improvements, support for ICU (International Components for Unicode)

Posted by 德哥 on 2017-03-30

Background

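ICU is a mature, widely used globalization library that behaves consistently across platforms. It is released under a nonrestrictive open source license, so both commercial and open source software can use it freely.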

ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.   
ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.  
  
ICU is released under a nonrestrictive open source license that is suitable for use with both commercial software and with other open source or free software.

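ICU's main strength is that it tracks the Unicode standard very closely, and software built on ICU behaves the same on every platform (as long as the platform is one that ICU supports).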

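ICU provides the features listed below, including conversion between Unicode and other text encodings, locale-aware collation, date and time formatting, time zone calculation, Unicode-aware regular expressions, and more.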

Code Page Conversion: Convert text data to or from Unicode and nearly any other character set or encoding. ICU's conversion tables are based on charset data collected by IBM over the course of many decades, and is the most complete available anywhere.  
  
Collation: Compare strings according to the conventions and standards of a particular language, region or country. ICU's collation is based on the Unicode Collation Algorithm plus locale-specific comparison rules from the Common Locale Data Repository, a comprehensive source for this type of data.  
  
Formatting: Format numbers, dates, times and currency amounts according to the conventions of a chosen locale. This includes translating month and day names into the selected language, choosing appropriate abbreviations, ordering fields correctly, etc. This data also comes from the Common Locale Data Repository.  
  
Time Calculations: Multiple types of calendars are provided beyond the traditional Gregorian calendar. A thorough set of timezone calculation APIs are provided.  
  
Unicode Support: ICU closely tracks the Unicode standard, providing easy access to all of the many Unicode character properties, Unicode Normalization, Case Folding and other fundamental operations as specified by the Unicode Standard.  
  
Regular Expression: ICU's regular expressions fully support Unicode while providing very competitive performance.  
  
Bidi: support for handling text containing a mixture of left to right (English) and right to left (Arabic or Hebrew) data.  
  
Text Boundaries: Locate the positions of words, sentences, paragraphs within a range of text, or identify locations that would be suitable for line wrapping when displaying the text.

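Until now, PostgreSQL's globalization support came from the glibc library (more generally, the operating system's C library). Results therefore depended on the library version, and moving a database between platforms (for example Windows, Linux, and FreeBSD) could change sort order or other locale-dependent results.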

Starting with 10.0, PostgreSQL supports ICU. Install the ICU library on the machine where PostgreSQL is built, pass --with-icu to configure, and ICU4C becomes available.

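pg_collation gains a new column, collprovider, which indicates whether a collation comes from libc or icu. A second new column, collversion, records the version of the collation as reported by ICU when the collation was created. It is checked at run time to make sure the version still matches.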

ICU support  
  
Add a column collprovider to pg_collation that determines which library  
provides the collation data.  The existing choices are default and libc,  
and this adds an icu choice, which uses the ICU4C library.  
  
The pg_locale_t type is changed to a union that contains the  
provider-specific locale handles.  Users of locale information are  
changed to look into that struct for the appropriate handle to use.  
  
Also add a collversion column that records the version of the collation  
when it is created, and check at run time whether it is still the same.  
This detects potentially incompatible library upgrades that can corrupt  
indexes and other structures.  This is currently only supported by  
ICU-provided collations.  
  
initdb initializes the default collation set as before from the   
`locale -a` output but also adds all available ICU locales with a "-x-icu"  
appended.  
  
Currently, ICU-provided collations can only be explicitly named  
collations.  The global database locales are still always libc-provided.  
  
ICU support is enabled by configure --with-icu.  
  
Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>  
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
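
The two catalog columns described in the commit message can be inspected directly. The queries below are a minimal sketch, assuming a server built with --with-icu on which initdb has created the "-x-icu" collations. The one-character provider codes and the pg_collation_actual_version() function are taken from the PostgreSQL 10 release and may differ in a given preview build. The rows and version strings returned depend on the installed ICU library.

```sql
-- List a few ICU-provided collations with their recorded versions.
-- collprovider is a one-character code: 'd' = default, 'c' = libc, 'i' = icu.
SELECT collname, collprovider, collversion
FROM pg_collation
WHERE collprovider = 'i'
ORDER BY collname
LIMIT 5;

-- Find ICU collations whose recorded version no longer matches the
-- version the library reports now (assumes pg_collation_actual_version()
-- is present, as it is in the PostgreSQL 10 release).
SELECT collname,
       collversion AS recorded_version,
       pg_collation_actual_version(oid) AS current_version
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion <> pg_collation_actual_version(oid);
```

An empty result from the second query would mean that no ICU collation has changed version since it was created, which is the condition the run-time check is meant to verify.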

Examples

CREATE TABLE collate_test1 (
    a int,
    b text COLLATE "en-x-icu" NOT NULL
);

\d collate_test1

-- expected to fail: no collation with this name exists
CREATE TABLE collate_test_fail (
    a int,
    b text COLLATE "ja_JP.eucjp-x-icu"
);

-- expected to fail: no collation with this name exists
CREATE TABLE collate_test_fail (
    a int,
    b text COLLATE "foo-x-icu"
);

-- expected to fail: type integer does not support collations
CREATE TABLE collate_test_fail (
    a int COLLATE "en-x-icu",
    b text
);

CREATE TABLE collate_test_like (
    LIKE collate_test1
);


-- constant expression folding
SELECT 'bbc' COLLATE "en-x-icu" > 'äbc' COLLATE "en-x-icu" AS "true";
SELECT 'bbc' COLLATE "sv-x-icu" > 'äbc' COLLATE "sv-x-icu" AS "false";

-- upper/lower

CREATE TABLE collate_test10 (
    a int,
    x text COLLATE "en-x-icu",
    y text COLLATE "tr-x-icu"
);
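
Because ICU collations currently have to be named explicitly, a query can also apply one to a single expression with a COLLATE clause. The following is a minimal sketch, assuming a UTF8 database in which the "en-x-icu" and "sv-x-icu" collations from the tests above exist. The sample words are made up for illustration.

```sql
-- The same three strings sorted under two ICU collations.
-- English treats 'ä' as a variant of 'a'; Swedish places it after 'z'.
SELECT w
FROM (VALUES ('bbc'::text), ('äbc'), ('zzz')) AS t(w)
ORDER BY w COLLATE "en-x-icu";   -- äbc, bbc, zzz

SELECT w
FROM (VALUES ('bbc'::text), ('äbc'), ('zzz')) AS t(w)
ORDER BY w COLLATE "sv-x-icu";   -- bbc, zzz, äbc
```

This mirrors the constant-folding test above, where 'bbc' > 'äbc' is true under the English collation and false under the Swedish one.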

The discussion of this patch took place on the mailing list. See the URLs at the end of this article.

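The PostgreSQL community works very rigorously. A patch may be discussed on the mailing list for months or even years and revised repeatedly in response to feedback, so by the time it is merged into master it is already mature. This is a large part of why PostgreSQL is so well known for its stability.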

References

https://wiki.postgresql.org/wiki/Todo:ICU

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eccfef81e1f73ee41f1d8bfe4fa4e80576945048

http://site.icu-project.org/
