PostgreSQL支援Hash索引,本節簡單介紹Hash索引的結構等相關資訊.

邏輯結構
可以把Hash Index理解為一個Hash Table,每個Hash bucket儲存根據Hash Function計算得到的對應的索引條目,為了節省空間,Hash索引條目只儲存Hash Code(即Hash Value) + TID而不儲存Hash Key(即索引鍵值),掃描索引後還必須讀取相應的資料錶行,因此Index Only Scan不適用於Hash Index.


testdb=# drop table if exists t_idx1;
DROP TABLE
testdb=# create table t_idx1(id int,c1 varchar(20));
CREATE TABLE
testdb=# create index idx_t_idx1_id on t_idx1 using hash(id);
CREATE INDEX
testdb=# insert into t_idx1 select generate_series(1,100000);
INSERT 0 100000
testdb=# analyze t_idx1;
ANALYZE
testdb=# explain verbose select * from t_idx1 where id = 1;
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 Index Scan using idx_t_idx1_id on public.t_idx1  (cost=0.00..8.02 rows=1 width=62)
   Output: id, c1
   Index Cond: (t_idx1.id = 1)
(3 rows)
testdb=# -- 不能實現Index Only Scan
testdb=# explain verbose select id from t_idx1 where id = 100;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Index Scan using idx_t_idx1_id on public.t_idx1  (cost=0.00..8.02 rows=1 width=4)
   Output: id
   Index Cond: (t_idx1.id = 100)
(3 rows)

而普通的B-Tree索引是可以Index Only Scan的:


testdb=# create table t_idx2(id int,c1 varchar(20));
CREATE TABLE
testdb=# insert into t_idx2 select generate_series(1,100000);
INSERT 0 100000
testdb=# create index idx_t_idx2_id on t_idx2 using btree(id);
CREATE INDEX
testdb=# analyze t_idx2;
ANALYZE
testdb=# explain verbose select id from t_idx2 where id = 100;
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Index Only Scan using idx_t_idx2_id on public.t_idx2  (cost=0.29..8.31 rows=1 width=4)
   Output: id
   Index Cond: (t_idx2.id = 100)
(3 rows)

物理結構
下圖是Hash Index的物理儲存結構:

有四種頁面,分別是Meta page,Bucket Page,Overflow page和Bitmap page.

頁面型別	說明
Meta page	page number zero, which contains information on what is inside the index.
Bucket pages	main pages of the index, which store data as «hash code — TID» pairs.
Overflow pages	structured the same way as bucket pages and used when one page is insufficient for a bucket
Bitmap pages	which keep track of overflow pages that are currently clear and can be reused for other buckets

使用pageinspect外掛可檢視index中的相關資訊


testdb=# select hash_page_type(get_raw_page('idx_t_idx1_id',0));
 hash_page_type 
----------------
 metapage
(1 row)
testdb=# select hash_page_type(get_raw_page('idx_t_idx1_id',1));
 hash_page_type 
----------------
 bucket
(1 row)
testdb=# \x
Expanded display is on.
testdb=# select * from hash_page_stats(get_raw_page('idx_t_idx1_id',1));
-[ RECORD 1 ]---+-----------
live_items      | 189
dead_items      | 0
page_size       | 8192
free_size       | 4368
hasho_prevblkno | 256
hasho_nextblkno | 4294967295
hasho_bucket    | 0
hasho_flag      | 2
hasho_page_id   | 65408
testdb=# select * from hash_page_stats(get_raw_page('idx_t_idx1_id',2));
-[ RECORD 1 ]---+-----------
live_items      | 201
dead_items      | 0
page_size       | 8192
free_size       | 4128
hasho_prevblkno | 257
hasho_nextblkno | 4294967295
hasho_bucket    | 1
hasho_flag      | 2
hasho_page_id   | 65408

參考資料
Indexes in PostgreSQL — 3 (Hash)
Hash Indexing README
hashfunc.c

PostgreSQL DBA(43) - Index(Hash)

相關文章