HybridDBforPostgreSQL列存表(AO表)的膨脹、垃圾檢查與空間收縮

德哥發表於2017-11-12

標籤

PostgreSQL , Greenplum , 垃圾檢測 , 膨脹 , 列存表 , gp_appendonly_compaction_threshold


背景

Greenplum支援行儲存(堆儲存)與AO儲存,堆儲存的垃圾回收和膨脹檢測方法請參考:

《如何檢測、清理Greenplum膨脹、垃圾 – 阿里雲HybridDB for PG最佳實踐》

對於AO儲存,雖然是appendonly,但實際上GP是支援DELETE和UPDATE的,被刪除或更新的行,通過BITMAP來標記。

AO儲存是塊級組織,當一個塊內的資料大部分都被刪除或更新掉時,掃描它浪費的成本實際上是很高的。

如何檢查AO表的膨脹,收縮AO表呢?

如何檢視錶的儲存結構

pg_class.relstorage表示這個物件是什麼儲存:

postgres=# select distinct relstorage from pg_class ;  
 relstorage   
------------  
 a  -- 行儲存AO表  
 h  -- heap堆表、索引  
 x  -- 外部表(external table)  
 v  -- 檢視  
 c  -- 列儲存AO表  
(5 rows)  

查詢當前資料庫有哪些AO表:

postgres=# select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in (`c`, `a`);  
 nspname  |      relname        
----------+-------------------  
 postgres | tbl_tag  
 postgres | tbl_pos_1_prt_p1  
 postgres | tbl_pos_1_prt_p2  
 postgres | tbl_pos_1_prt_p3  
 postgres | tbl_pos_1_prt_p4  
 postgres | tbl_pos_1_prt_p5  
 postgres | tbl_pos_1_prt_p6  
 postgres | tbl_pos_1_prt_p7  
 postgres | tbl_pos_1_prt_p8  
 postgres | tbl_pos_1_prt_p9  
 postgres | tbl_pos_1_prt_p10  
 postgres | tbl_pos  
 postgres | xx_czrk_qm_col  
 postgres | ao1  
(14 rows)  

查詢當前資料庫有哪些堆表:

select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in (`h`) and relkind=`r`;  

檢視AO表的膨脹率(有多少垃圾)

使用這個函式可以檢視AO表的膨脹率

postgres=# df+ gp_toolkit.__gp_aovisimap_compaction_info  
List of functions  
-[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
Schema              | gp_toolkit  
Name                | __gp_aovisimap_compaction_info  
Result data type    | SETOF record  
Argument data types | ao_oid oid, OUT content integer, OUT datafile integer, OUT compaction_possible boolean, OUT hidden_tupcount bigint, OUT total_tupcount bigint, OUT percent_hidden numeric  
Type                | normal  
Data access         | no sql  
Volatility          | volatile  
Owner               | dege.zzz  
Language            | plpgsql  
Source code         |   
                    | DECLARE  
                    |     hinfo_row RECORD;  
                    |     threshold float;  
                    | BEGIN  
                    |     EXECUTE `show gp_appendonly_compaction_threshold` INTO threshold;  
                    |     FOR hinfo_row IN SELECT gp_segment_id,  
                    |     gp_toolkit.__gp_aovisimap_hidden_typed(ao_oid)::gp_toolkit.__gp_aovisimap_hidden_t  
                    |     FROM gp_dist_random(`gp_id`) LOOP  
                    |         content := hinfo_row.gp_segment_id;  
                    |         datafile := (hinfo_row.__gp_aovisimap_hidden_typed).seg;  
                    |         hidden_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).hidden;  
                    |         total_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).total;  
                    |         compaction_possible := false;  
                    |         IF total_tupcount > 0 THEN  
                    |             percent_hidden := (100 * hidden_tupcount / total_tupcount::numeric)::numeric(5,2);  
                    |         ELSE  
                    |             percent_hidden := 0::numeric(5,2);  
                    |         END IF;  
                    |         IF percent_hidden > threshold THEN  
                    |             compaction_possible := true;  
                    |         END IF;  
                    |         RETURN NEXT;  
                    |     END LOOP;  
                    |     RAISE NOTICE `gp_appendonly_compaction_threshold = %`, threshold;  
                    |     RETURN;  
                    | END;  
                    |   
Description         |   

例子

postgres=# select * from gp_toolkit.__gp_aovisimap_compaction_info(`postgres.ao1`::regclass);  
NOTICE:  gp_appendonly_compaction_threshold = 10  
  
  
 content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden   
---------+----------+---------------------+-----------------+----------------+----------------  
       2 |        1 | f                   |              21 |         208283 |           0.01  
       9 |        1 | f                   |              25 |         208303 |           0.01  
      16 |        1 | f                   |              16 |         208352 |           0.01  
       4 |        1 | f                   |              23 |         208356 |           0.01  
       6 |        1 | f                   |              23 |         208386 |           0.01  
       3 |        1 | f                   |              14 |         208333 |           0.01  
       1 |        1 | f                   |              14 |         208329 |           0.01  
      14 |        1 | f                   |              12 |         208350 |           0.01  
      15 |        1 | f                   |              24 |         208346 |           0.01  
       7 |        1 | f                   |              22 |         208329 |           0.01  
       8 |        1 | f                   |              18 |         208334 |           0.01  
       0 |        1 | f                   |              21 |         208314 |           0.01  
      18 |        1 | f                   |              16 |         208417 |           0.01  
      11 |        1 | f                   |              24 |         208337 |           0.01  
      17 |        1 | f                   |              31 |         208380 |           0.01  
      12 |        1 | f                   |              12 |         208367 |           0.01  
      13 |        1 | f                   |              22 |         208365 |           0.01  
       5 |        1 | f                   |              22 |         208367 |           0.01  
      10 |        1 | f                   |              18 |         208347 |           0.01  
      20 |        1 | f                   |              17 |         208384 |           0.01  
      27 |        1 | f                   |              22 |         208348 |           0.01  
      19 |        1 | f                   |              31 |         208425 |           0.01  
      23 |        1 | f                   |              28 |         208344 |           0.01  
      26 |        1 | f                   |              14 |         208339 |           0.01  
      25 |        1 | f                   |              21 |         208386 |           0.01  
      24 |        1 | f                   |              14 |         208332 |           0.01  
      21 |        1 | f                   |              28 |         208360 |           0.01  
      33 |        1 | f                   |              18 |         208354 |           0.01  
      31 |        1 | f                   |              23 |         208335 |           0.01  
      22 |        1 | f                   |              17 |         208309 |           0.01  
      28 |        1 | f                   |              21 |         208314 |           0.01  
      29 |        1 | f                   |              23 |         208329 |           0.01  
      30 |        1 | f                   |              22 |         208342 |           0.01  
      35 |        1 | f                   |              18 |         208305 |           0.01  
      34 |        1 | f                   |              26 |         208363 |           0.01  
      32 |        1 | f                   |              25 |         208396 |           0.01  
      36 |        1 | f                   |              23 |         208323 |           0.01  
      38 |        1 | f                   |              22 |         208367 |           0.01  
      37 |        1 | f                   |              12 |         208334 |           0.01  
      39 |        1 | f                   |              12 |         208389 |           0.01  
      41 |        1 | f                   |              16 |         208373 |           0.01  
      42 |        1 | f                   |              30 |         208376 |           0.01  
      40 |        1 | f                   |              31 |         208397 |           0.01  
      43 |        1 | f                   |              16 |         208378 |           0.01  
      46 |        1 | f                   |              29 |         208371 |           0.01  
      47 |        1 | f                   |              16 |         208382 |           0.01  
      45 |        1 | f                   |              17 |         208368 |           0.01  
      44 |        1 | f                   |              29 |         208381 |           0.01  
(48 rows)  

解讀:

1、content:對應gp_configuration.content,表示greenplum每個節點的唯一編號。

2、datafile:這條記錄對應的這個表的其中一個資料檔案的編號,每個資料檔案假設1GB。

3、compaction_possible:這個資料檔案是否可以被收縮。(通過gp_appendonly_compaction_threshold引數和percent_hidden值判斷)。

4、hidden_tupcount:有多少條記錄已更新或刪除(不可見)。

5、total_tupcount:總共有多少條記錄(包括已更新或刪除的記錄)。

6、percent_hidden:不可見記錄的佔比。如果這個佔比大於gp_appendonly_compaction_threshold引數,那麼執行vacuum時,會收縮這個資料檔案。

檢查系統中膨脹率超過N的AO表

膨脹率超過千分之2的AO表:

select * from (  
  select t2.nspname, t1.relname, (gp_toolkit.__gp_aovisimap_compaction_info(t1.oid)).*   
  from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in (`c`, `a`)   
) t   
where t.percent_hidden > 0.2;  

返回結果

 nspname  |      relname      | content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden   
----------+-------------------+---------+----------+---------------------+-----------------+----------------+----------------  
 postgres | tbl_pos_1_prt_p1  |       1 |        1 | t                   |        20833382 |       20833382 |         100.00  
 postgres | tbl_pos_1_prt_p1  |       7 |        1 | t                   |        20833495 |       20833628 |         100.00  
 postgres | tbl_pos_1_prt_p1  |       5 |        1 | t                   |        20833628 |       20833495 |         100.00  
 postgres | tbl_pos_1_prt_p1  |       3 |        1 | t                   |        20833469 |       20833469 |         100.00  
....  

顯然膨脹了100%,驗證如下(1條記錄有效,佔用了15GB空間):

postgres=# select pg_size_pretty(pg_relation_size(`tbl_pos_1_prt_p1`));  
 pg_size_pretty   
----------------  
 15 GB  
(1 row)  
  
postgres=# select count(*) from tbl_pos_1_prt_p1;  
 count   
-------  
     1  
(1 row)  

vacuum可以直接收縮(因為膨脹率大於gp_appendonly_compaction_threshold引數):

postgres=# vacuum tbl_pos_1_prt_p1;  
VACUUM  
postgres=# select count(*) from tbl_pos_1_prt_p1;  
 count   
-------  
     1  
(1 row)  
  
postgres=# select pg_size_pretty(pg_relation_size(`tbl_pos_1_prt_p1`));  
 pg_size_pretty   
----------------  
 40 bytes  
(1 row)  

VACUUM後,只佔用40位元組。

收縮已膨脹的AO表

方法有三:

1、執行VACUUM。(當膨脹率大於gp_appendonly_compaction_threshold引數時),為共享鎖。

2、執行VACUUM FULL。(不管gp_appendonly_compaction_threshold引數的設定,都會回收垃圾空間。),為DDL鎖。

3、執行重分佈。(不管gp_appendonly_compaction_threshold引數,都會回收垃圾空間。),為DDL鎖。

alter table <table_name> set with (reorganize=true) distributed randomly;    
    
或    
    
alter table <table_name> set with (reorganize=true) distributed by (<column_names1>,<column_names2>....)    

set distribute可以回收索引的膨脹空間。set distribute 載入的鎖與DDL鎖類似,是排它鎖。建議在沒有業務的時候執行,不要堵塞業務。

同時set distribute只要分佈條件不變,就是在節點內完成的,不會涉及資料的重分佈。

建議的操作流程:

1 記錄下表的分佈列    
    
2 執行set distribute (REORGANIZE=true)    
    
如果是隨機分佈,則設定為隨機分佈    

參考

http://greenplum.org/docs/510/admin_guide/managing/maintain.html

If the ratio of hidden rows to total rows in a segment file is less than a threshold value (10, by default),   
the segment file is not compacted.   
  
The threshold value can be configured with the gp_appendonly_compaction_threshold server configuration parameter.   
  
VACUUM FULL ignores the value of gp_appendonly_compaction_threshold and rewrites the segment file regardless of the ratio.  
  
You can use the __gp_aovisimap_compaction_info() function in the the gp_toolkit schema to investigate   
the effectiveness of a VACUUM operation on append-optimized tables.  

《如何檢測、清理Greenplum膨脹、垃圾 – 阿里雲HybridDB for PG最佳實踐》


相關文章