PostgreSQL10.0preview效能增強-hashindexmetapagecache、高併發增強
標籤
PostgreSQL , 10.0 , hash index
背景
hash index是PostgreSQL中一個非常老的索引訪問方法,也是非常經典的索引。
hash index中儲存的是索引欄位的hash value,而不是原始值,btree索引中儲存的是原始值。
因此,當欄位非常大時,btree索引可能無法使用。
例如
postgres=# create table test_hash_btree(c1 text);
CREATE TABLE
postgres=# insert into test_hash_btree values (repeat(random()::text,10000));
INSERT 0 1
postgres=# create index idx on test_hash_btree (c1);
CREATE INDEX
postgres=# insert into test_hash_btree values (repeat(random()::text,100000));
ERROR: index row requires 19504 bytes, maximum size is 8191
postgres=# drop index idx;
DROP INDEX
postgres=# create index idx on test_hash_btree using hash(c1);
WARNING: hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX
postgres=# insert into test_hash_btree values (repeat(random()::text,100000));
INSERT 0 1
postgres=# insert into test_hash_btree values (repeat(random()::text,10000000));
INSERT 0 1
這種情況下可以使用hash index,而正因為 hashindex中儲存 的是雜湊,所以它只能用作等值查詢。
除非有一種雜湊函式能保證雜湊後的值和雜湊前的值順序具備一致性。否則hash index是無法支援排序、>,<,>=,<=的查詢的。
雜湊索引的結構
雜湊索引包含4種頁,meta page, primary bucket page, overflow page, bitmap page.
metapage(0號頁) , 包含了HASH索引的控制資訊,指導如何找到其他頁面(每個bucket的primary page),以及當前儲存概貌。其他索引的0號頁基本都是這一個套路。
primary bucket page,hash index將儲存劃分為多個bucket(邏輯概念),每個bucket中包含若干page(每個bucket的page數量不需要一致),當插入資料時,根據計算得到的雜湊,通過對映演算法,對映到某個bucket,也就是說資料首先知道應該插入哪個bucket中,然後插入bucket中的primary page,如果primary page空間不足時,會擴充套件overflow page,資料寫入overflow page.
在page中,資料是有序儲存(TREE),page內支援二分查詢(binary search),而page與page之間是不保證順序的,所以hash index不支援order by。
overflow page,是bucket裡面的頁,當primary page沒有足夠空間時,擴充套件的塊稱為overflow page.
bimap page,記錄primary , overflow page是否為空可以被重用。
注意bucket, page都沒有提供收縮功能,即無法從OS中收縮空間,但是提供了reuse(通過bitmap page跟蹤).
10.0 雜湊索引增強
1. Cache Hash Index meta page
快取hash index meta page頁到backend process的私有記憶體中,減少meta page的訪問。
2. Concurrent Hash Indexes
大幅提升查詢效能,超過BTREE接近一倍。
Patch_Ver/Client count 1 8 16 32 64 72 80 88 96 128
HEAD-Btree 19397 122488 194433 344524 519536 527365 597368 559381 614321 609102
HEAD-Hindex 18539 141905 218635 363068 512067 522018 492103 484372 440265 393231
Patch 22504 146937 235948 419268 637871 637595 674042 669278 683704 639967
% improvement between HEAD-Hash index vs Patch and HEAD-Btree index vs
Patch-Hash index is:
Head-Hash vs Patch 21.38 3.5 7.9 15.47 24.56 22.14 36.97 38.17 55.29 62.74
Head-Btree vs. Patch 16.01 19.96 21.35 21.69 22.77 20.9 12.83 19.64 11.29 5.06
This data shows that patch improves the performance of hash index upto
62.74 and it also makes hash-index faster than btree-index by ~20% (most
client counts show the performance improvement in the range of 15~20%.
For the matter of comparison with btree, I think the impact of performance
improvement of hash index will be more when the data doesn`t fit shared
buffers and the performance data for same is as below:
Data doesn`t fits in shared buffers
scale_factor - 3000
shared_buffers - 8GB
Client_Count/Patch 16 64 96
Head-Btree 170042 463721 520656
Patch-Hash 227528 603594 659287
% diff 33.8 30.16 26.62
The performance with hash-index is ~30% better than Btree. Note, that for
now, I have not taken the data for HEAD- Hash index. I think there will
many more cases like when hash index is on char (20) column where the
performance of hash-index can be much better than btree-index for equal to
searches.
Note that this patch is a very-much WIP patch and I am posting it mainly to
facilitate the discussion. Currently, it doesn`t have any code to perform
incomplete splits, the logic for locking/pins during Insert is yet to be
done and many more things
這個patch的討論,詳見郵件組,本文末尾URL。
PostgreSQL社群的作風非常嚴謹,一個patch可能在郵件組中討論幾個月甚至幾年,根據大家的意見反覆的修正,patch合併到master已經非常成熟,所以PostgreSQL的穩定性也是遠近聞名的。
參考
https://commitfest.postgresql.org/13/715/
https://commitfest.postgresql.org/10/647/
src/backend/access/hash/README
相關文章
- PostgreSQL10.0preview效能增強-分割槽表效能增強(plan階段加速)SQLView
- PostgreSQL10.0preview效能增強-間接索引(secondaryindex)SQLView索引Index
- PostgreSQL10.0preview效能增強-支援64bitatomicSQLView
- PostgreSQL10.0preview效能增強-OLAP提速框架,FasterExpressionEvaluationFramework(含JIT)SQLView框架ASTExpressFramework
- 震精-PostgreSQL10.0preview效能增強-WARM提升一倍效能SQLView
- PostgreSQL10.0preview功能增強-OLAP增強向量聚集索引(列儲存擴充套件)SQLView索引套件
- PostgreSQL10.0preview功能增強-國際化功能增強,支援ICU(InternationalComponentsforUnicode)SQLViewUnicode
- Oracle利用Windows的高階特性增強效能OracleWindows
- PostgreSQL10.0preview功能增強-增加ProcArrayGroupUpdate等待事件SQLView事件
- PostgreSQL10.0preview效能增強-pg_xactalign(cacheline對齊)SQLView
- Java高階特性增強-鎖Java
- PostgreSQL10.0preview功能增強-觸發器函式內建中間表SQLView觸發器函式
- PostgreSQL10.0preview功能增強-CLOGoldestXID跟蹤SQLViewGo
- MySQL 8 複製效能的增強MySql
- PostgreSQL10.0preview功能增強-JSON內容全文檢索SQLViewJSON
- PostgreSQL10.0preview功能增強-後臺執行(pg_background)SQLView
- 集合框架-增強for框架
- 增強字串 (轉)字串
- PostgreSQL10.0preview效能增強-(多維分析)更快,更省記憶體hashedaggregationwithgroupingsetsSQLView記憶體Gse
- PostgreSQL10.0preview功能增強-自由定義統計資訊維度SQLView
- OpenCL 增強單work-item kernel效能策略
- TotalFinder for MacFinder增強工具Mac
- 增強的 COMMITMIT
- 影像增強(Image enhancement)
- PostgreSQL10.0preview安全增強-任意wal副本數,金融級高可用與可靠性並存需求SQLView
- PostgreSQL10.0preview效能增強-hash,nestloopjoin優化(聰明的優化器是這樣的)SQLViewOOP優化
- Oracle12c功能增強 新特性之管理功能的增強Oracle
- 小米盒子3增強版評測 頂級高配強勢迴歸
- 用歸納偏置來增強你的模型效能模型
- 微軟Xbox One更新 控制皮膚效能增強微軟
- PostgreSQL10.0preview功能增強-兩段式索引(約束欄位+附加欄位)SQLView索引
- PostgreSQL10.0preview功能增強-邏輯訂閱端控制引數解說SQLView
- 小米盒子3增強版評測 小米盒子3 增強版區別
- TotalFinder for Mac Finder增強工具Mac
- Mybatis-Plus增強包MyBatis
- Java排序理解增強版Java排序
- TotalFinder Mac(Finder增強工具)Mac
- Mac鍵盤增強工具Mac