震精-PostgreSQL10.0preview效能增強-WARM提升一倍效能

德哥發表於2017-03-24

標籤

PostgreSQL , 10.0 , WARM , 寫放大 , 索引寫放大


背景

目前,PostgreSQL的MVCC是多版本來實現的,當更新資料時,產生新的版本。(社群正在著手增加基於回滾段的儲存引擎)

由於索引儲存的是KEY+CTID(行號),當tuple的新版本與舊版本不在同一個資料塊(BLOCK)的時候,索引也要隨之變化,當新版本在同一個塊裡面時,則發生HOT UPDATE,索引的值不需要更新,但是因為產生了一條新的記錄,所以也需要插入一條索引item,垃圾回收時,將其回收,因此產生了寫放大。

(HOT指,舊的tuple頭部,CHAIN指向新的TUPLE。)

但是HOT總不能覆蓋100%的更新,當tuple新版本不在同一個BLOCK時,即使索引的欄位值未發生變化,也需要更新索引,因為行號變化了。

這個問題在UBER的某篇文件中反映過

《為PostgreSQL討說法 – 淺析《UBER ENGINEERING SWITCHED FROM POSTGRES TO MYSQL》》

PostgreSQL 10.0會將這個問題解決掉,有兩個手段。

1. 增加間接索引的功能。

《PostgreSQL 10.0 preview 效能增強 – 間接索引(secondary index)》

2. 增加WARM的特性,也就是本文要說的。

索引放大的問題

使用pageinspect觀察page的變化。

postgres=# create extension pageinspect ;  
CREATE EXTENSION  

建立一個表,3列,每列一個索引。

postgres=# create table tbl(id int primary key, c1 int, c2 int);  
CREATE TABLE  
postgres=# create index idx1 on tbl (c1);  
^[[ACREATE INDEX  
postgres=# create index idx2 on tbl (c2);  
CREATE INDEX  

插入一條記錄

postgres=# insert into tbl values (1,1,1);  
INSERT 0 1  

更新一個欄位的值,其他欄位的值不變

postgres=# begin;  
BEGIN  
postgres=# update tbl set c1=2 where id=1 returning ctid,*;  
 ctid  | id | c1 | c2   
-------+----+----+----  
 (0,2) |  1 |  2 |  1  
(1 row)  
UPDATE 1  

ctid=0,2,因此發生了HOT,記錄沒有出現在其他PAGE。

在另一個會話,觀察三個索引的leaf page,可以看到,資料沒有變化的索引,也插入了一條ITEM

postgres=# select * from bt_page_items(`idx2`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,2) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
(2 rows)  
postgres=# select * from bt_page_items(`idx1`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2) |      16 | f     | f    | 02 00 00 00 00 00 00 00  
(2 rows)  
postgres=# select * from bt_page_items(`tbl_pkey`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,2) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
(2 rows)  

使用vacuum回收垃圾(或者等其自動回收),垃圾回收時,如果要刪除heap表中的dead tuple,首先要刪除索引對應的item。

postgres=# end;  
COMMIT  
  
postgres=# vacuum verbose tbl;  
INFO:  vacuuming "public.tbl"  
INFO:  scanned index "tbl_pkey" to remove 1 row versions  
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  scanned index "idx1" to remove 1 row versions  
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  scanned index "idx2" to remove 1 row versions  
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  "tbl": removed 1 row versions in 1 pages  
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  index "tbl_pkey" now contains 1 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  index "idx1" now contains 1 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  index "idx2" now contains 1 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU 0.00s/0.00u sec elapsed 0.00 sec.  
INFO:  "tbl": found 1 removable, 1 nonremovable row versions in 1 out of 1 pages  
DETAIL:  0 dead row versions cannot be removed yet.  
There were 0 unused item pointers.  
Skipped 0 pages due to buffer pins.  
0 pages are entirely empty.  
CPU 0.00s/0.00u sec elapsed 0.00 sec.  
VACUUM  
  
  
ostgres=# select * from bt_page_items(`idx1`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,2) |      16 | f     | f    | 02 00 00 00 00 00 00 00  
(1 row)  
  
postgres=# select * from bt_page_items(`idx2`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,2) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
(1 row)  
  
postgres=# select * from bt_page_items(`tbl_pkey`,1);  
 itemoffset | ctid  | itemlen | nulls | vars |          data             
------------+-------+---------+-------+------+-------------------------  
          1 | (0,2) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
(1 row)  

寫放大就是這樣產生的,所以10.0的兩個特性可以解決這樣的問題。

間接索引

之前講過,這裡就不重複了

《PostgreSQL 10.0 preview 效能增強 – 間接索引(secondary index)》

WARM

warm使用chain和recheck來解決寫放大的問題。

建表與索引如下

CREATE TABLE test (col1 int, col2 int, col3 int, col4 text);  
CREATE INDEX testindx_col1 ON test (col1);  
CREATE INDEX testindx_col2 ON test (col2);  
CREATE INDEX testindx_col3 ON test (col3);  
  
INSERT INTO test VALUES (1, 11, 111, `foo`);  

當前的索引條目

testindx_col1: (1)    ===> (0,1)  
testindx_col2: (11)   ===> (0,1)  
testindx_col3: (111)  ===> (0,1)  

WARM上場,更新col1=2,指向0,2,此時沒有變更的索引,不需要更新,依舊指向0,1,而發生變化的索引,也指向0,1,通過chain指向0,2。

Now if a transaction T1 UPDATEs the table such as, "UPDATE test SET col1 =  
2". This does not satisfy the HOT property since index column `col1` is  
being updated. But as per WARM algorithm, since only testindx_col1`s index  
key has changed, we insert a new index tuple only in testindx_col1. Say the  
new heap tuple has CTID (0,2). So testindx_col1 will have a new index tuple  
with key (2), but still pointing to CTID (0,1)  
  
testindx_col1: (1)    ==>  (0,1)  
               (2)    ===> (0,1)  
testindx_col2: (11)   ===> (0,1)  
testindx_col3: (111)  ===> (0,1)  
  
The heap at this point has two tuples, linked by CTID chain.  
(0,1)* ---> (0,2)  

不管怎麼更新,都通過chain來指向,所以不需要修改沒有發生變化的索引。

If T3 later updates col2 and T4 further updates col3,  
T3: UPDATE test SET col2 = 12;  
T4: UPDATE test SET col3 = 112;  
  
The heap will look like:  
(0,1)* ---> (0,2) ---> (0,3) ---> (0,4)  
  
And the index pointers will be:  
  
testindx_col1: (1)   ===> (0,1)  
                       (2)   ===> (0,1)  
testindx_col2: (11)  ===> (0,1)  
                       (12)  ===> (0,1)  
testindx_col3: (111) ===> (0,1)  
                       (112) ===> (0,1)  

通過FLAG表示當前行是否有chain。

當查詢到chain tuple時,順藤摸瓜,根據事務快照,判斷可見性。

為什麼需要recheck, 因為warm的引入發生值變化的索引,值與heap root lp中儲存的值不一樣了,因此需要recheck.

warm的引入,對有較多索引的表的更新,效能提升是非常明顯的。

pic

這個patch的討論,詳見郵件組,本文末尾URL。

PostgreSQL社群的作風非常嚴謹,一個patch可能在郵件組中討論幾個月甚至幾年,根據大家的意見反覆的修正,patch合併到master已經非常成熟,所以PostgreSQL的穩定性也是遠近聞名的。

參考

https://commitfest.postgresql.org/13/775/

https://www.postgresql.org/message-id/flat/CABOikdMNy6yowA+wTGK9RVd8iw+CzqHeQSGpW7Yka_4RSZ_LOQ@mail.gmail.com#CABOikdMNy6yowA+wTGK9RVd8iw+CzqHeQSGpW7Yka_4RSZ_LOQ@mail.gmail.com

https://www.postgresql.org/message-id/CABOikdMop5Rb_RnS2xFdAXMZGSqcJ-P-BY2ruMd%2BbuUkJ4iDPw@mail.gmail.com


相關文章