Learn PostgreSQL Application Development and Administration in One Day - 8: PostgreSQL Administration

By 德哥 (Digoal), published 2017-04-12

Chapter Outline

1. Privilege System

1 Logical structure

2 Privilege system

3 Schema usage: important notes

4 Users

5 public

6 How to view and interpret an object's current privileges

2. Introduction to Indexes

1 What are indexes for?

2 Index types

3 Index merge (bitmap) scans

4 Table bloat check

5 Index bloat check

6 Index maintenance

3. System Configuration

1 Storage and filesystem planning

2 Network planning

3 CPU sizing

4 Kernel configuration

5 Resource limits

6 Firewall configuration

4. Database Initialization

1 initdb introduction

2 postgresql.conf parameter configuration

3 pg_hba.conf database firewall configuration

5. Controlling PostgreSQL

1 Environment variable configuration

2 Stopping the database

3 Restarting

4 Starting

5 Linux auto-start service

6. Database Backup and Restore

1 Logical backup

2 Logical restore

3 Configuring archiving and streaming replication

4 Creating a streaming-replication user

5 pg_hba.conf configuration for streaming replication

6 Remote physical base backup

7 Physical incremental backup

8 PITR (point-in-time recovery)

PITR example

7. Database Performance

1 Parameter optimization

2 Statistics views pg_stat*

3 TOP SQL

4 AWR

5 auto_explain

6 Reading explain output

8. Benchmarking

1 pgbench introduction

2 TPC-B test

3 Custom tests

9. Routine Maintenance

Chapter 6: PostgreSQL Administration

1. Privilege System

Logical structure

At the top is the instance (database cluster). An instance may contain multiple databases, each database may contain multiple schemas, and each schema may contain multiple objects.

Objects include tables, materialized views, operators, indexes, views, sequences, functions, and so on.


All privileges in the database are tied to roles (users). public is a special role that stands for everyone.

Superusers may operate on any object; an ordinary user can only operate on objects it created.

In addition, some objects grant default privileges to the public role, so as soon as they are created, everyone holds those default privileges.

Privilege system


Instance-level privileges are controlled by pg_hba.conf, for example:

# TYPE  DATABASE        USER            ADDRESS                 METHOD    
# "local" is for Unix domain socket connections only    
local   all             all                                     trust    
# IPv4 local connections:    
host    all             all             127.0.0.1/32            trust    
host    all             postgres        0.0.0.0/0               reject    
host    all             all             0.0.0.0/0               md5    

What this configuration means:

Any local (Unix-socket) user may connect to any database without a password.      
      
The postgres user may not connect to any database from any remote address.      
      
All other users may connect to any database from remote addresses with an md5 password.      

Database-level privileges cover connecting to a database and creating schemas inside it.

By default, once a database is created the public role may connect to it, i.e. anyone can connect.

By default, once a database is created, nobody except superusers and the owner may create schemas in it.

By default, a newly created database contains a schema named public, and ALL privileges on this schema are granted to the public role, i.e. anyone may create objects in it.
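These defaults can be revoked explicitly if they are too permissive. A minimal sketch, assuming a database `db1` and a role `app_user` (both placeholder names):

```sql
-- nobody may connect to db1 any more, except superusers and the owner
REVOKE CONNECT ON DATABASE db1 FROM public;
-- nobody may create objects in the public schema of the current database
REVOKE CREATE ON SCHEMA public FROM public;
-- grant back exactly what is needed
GRANT CONNECT ON DATABASE db1 TO app_user;
```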

Schema-level privileges cover looking up objects in a schema and creating objects in it.

By default, a newly created schema's privileges are not granted to the public role, so nobody except superusers and the schema owner may look up objects in the schema or create new objects in it.
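To open up a newly created schema to another role, grant USAGE (to look up objects) and, if needed, CREATE. A sketch with placeholder names:

```sql
CREATE SCHEMA s1;
GRANT USAGE ON SCHEMA s1 TO reader_role;          -- may reference objects in s1
GRANT USAGE, CREATE ON SCHEMA s1 TO writer_role;  -- may also create objects in s1
```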

Schema usage: important notes

According to the SQL standard, the owner of a schema always owns all objects within it.     
    
PostgreSQL allows schemas to contain objects owned by users other than the schema owner.     
    
This can happen only if the schema owner grants the CREATE privilege on his schema to someone else,     
or a superuser chooses to create objects in it.    
    
The owner of a schema is, by default, the owner of all objects within it, yet PostgreSQL also allows users to create objects in someone else's schema. An object can therefore effectively answer to two owners, and the schema owner has the privilege to drop objects in the schema.      
    
That both owners hold the drop privilege is, in my personal opinion, a bug.      
    
So never create your objects inside someone else's schema; it is dangerous.      

Object-level privileges differ by object type; for details see

http://www.postgresql.org/docs/9.5/static/sql-grant.html

Taking tables as an example, the available privileges are SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES | TRIGGER.

GRANT { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES | TRIGGER }    
    [, ...] | ALL [ PRIVILEGES ] }    
    ON { [ TABLE ] table_name [, ...]    
         | ALL TABLES IN SCHEMA schema_name [, ...] }    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { { SELECT | INSERT | UPDATE | REFERENCES } ( column_name [, ...] )    
    [, ...] | ALL [ PRIVILEGES ] ( column_name [, ...] ) }    
    ON [ TABLE ] table_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { { USAGE | SELECT | UPDATE }    
    [, ...] | ALL [ PRIVILEGES ] }    
    ON { SEQUENCE sequence_name [, ...]    
         | ALL SEQUENCES IN SCHEMA schema_name [, ...] }    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { { CREATE | CONNECT | TEMPORARY | TEMP } [, ...] | ALL [ PRIVILEGES ] }    
    ON DATABASE database_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { USAGE | ALL [ PRIVILEGES ] }    
    ON DOMAIN domain_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { USAGE | ALL [ PRIVILEGES ] }    
    ON FOREIGN DATA WRAPPER fdw_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { USAGE | ALL [ PRIVILEGES ] }    
    ON FOREIGN SERVER server_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { EXECUTE | ALL [ PRIVILEGES ] }    
    ON { FUNCTION function_name ( [ [ argmode ] [ arg_name ] arg_type [, ...] ] ) [, ...]    
         | ALL FUNCTIONS IN SCHEMA schema_name [, ...] }    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { USAGE | ALL [ PRIVILEGES ] }    
    ON LANGUAGE lang_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { { SELECT | UPDATE } [, ...] | ALL [ PRIVILEGES ] }    
    ON LARGE OBJECT loid [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { { CREATE | USAGE } [, ...] | ALL [ PRIVILEGES ] }    
    ON SCHEMA schema_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { CREATE | ALL [ PRIVILEGES ] }    
    ON TABLESPACE tablespace_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
GRANT { USAGE | ALL [ PRIVILEGES ] }    
    ON TYPE type_name [, ...]    
    TO role_specification [, ...] [ WITH GRANT OPTION ]    
    
where role_specification can be:    
    
    [ GROUP ] role_name    
  | PUBLIC    
  | CURRENT_USER    
  | SESSION_USER    
    
GRANT role_name [, ...] TO role_name [, ...] [ WITH ADMIN OPTION ]    

A quick note on the common grant options.

WITH GRANT OPTION (for object privileges) and WITH ADMIN OPTION (for role membership) mean that the grantee, having received the privilege, may in turn grant it to others; without the option, the grantee holds the privilege but cannot pass it on.
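A sketch of the two variants, with placeholder table and role names:

```sql
-- object privilege: r1 may SELECT from t1 and may grant that right to others
GRANT SELECT ON t1 TO r1 WITH GRANT OPTION;
-- role membership: r2 becomes a member of r1 and may grant r1 to others
GRANT r1 TO r2 WITH ADMIN OPTION;
```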

Users

In PostgreSQL, users and roles are the same concept (a user is simply a role with the LOGIN attribute).

public

The public role stands for everyone, i.e. all roles.

How to view and interpret an object's current privileges

Taking tables as an example:

select relname,relacl from pg_class where relkind='r';    

Or run:

SELECT n.nspname as "Schema",
  c.relname as "Name",
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'S' THEN 'sequence' WHEN 'f' THEN 'foreign table' END as "Type",
  pg_catalog.array_to_string(c.relacl, E'\n') AS "Access privileges",
  pg_catalog.array_to_string(ARRAY(
    SELECT attname || E':\n  ' || pg_catalog.array_to_string(attacl, E'\n  ')
    FROM pg_catalog.pg_attribute a
    WHERE attrelid = c.oid AND NOT attisdropped AND attacl IS NOT NULL
  ), E'\n') AS "Column privileges",
  pg_catalog.array_to_string(ARRAY(
    SELECT polname
    || CASE WHEN polcmd != '*' THEN
           E' (' || polcmd || E'):'
       ELSE E':'
       END
    || CASE WHEN polqual IS NOT NULL THEN
           E'\n  (u): ' || pg_catalog.pg_get_expr(polqual, polrelid)
       ELSE E''
       END
    || CASE WHEN polwithcheck IS NOT NULL THEN
           E'\n  (c): ' || pg_catalog.pg_get_expr(polwithcheck, polrelid)
       ELSE E''
       END
    || CASE WHEN polroles <> '{0}' THEN
           E'\n  to: ' || pg_catalog.array_to_string(
               ARRAY(
                   SELECT rolname
                   FROM pg_catalog.pg_roles
                   WHERE oid = ANY (polroles)
                   ORDER BY 1
               ), E', ')
       ELSE E''
       END
    FROM pg_catalog.pg_policy pol
    WHERE polrelid = c.oid), E'\n')
    AS "Policies"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'v', 'm', 'S', 'f')
  AND n.nspname !~ '^pg_' AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1, 2;

This produces a privilege listing like the following:

 Schema |      Name       |   Type   |       Access privileges        | Column privileges | Policies     
--------+-----------------+----------+--------------------------------+-------------------+----------    
 public | sbtest1         | table    | postgres=arwdDxt/postgres     +|                   |     
        |                 |          | digoal=a*r*w*d*D*x*t*/postgres |                   |     
 public | sbtest10        | table    | postgres=arwdDxt/postgres      |                   |     
 public | sbtest10_id_seq | sequence |                                |                   |     
 public | sbtest11        | table    | postgres=arwdDxt/postgres      |                   |     
 public | sbtest11_id_seq | sequence |                                |                   |     
 public | sbtest12        | table    | postgres=arwdDxt/postgres      |                   |     
 public | sbtest12_id_seq | sequence |                                |                   |     

Interpreting the Access privileges column:

rolename=xxx : rolename is the role the privileges were granted to.

=xxx : the privileges were granted to the public role, i.e. to everyone.

/yyyy : the role that granted these privileges.

The privilege letters have the following meanings:

rolename=xxxx -- privileges granted to a role    
        =xxxx -- privileges granted to PUBLIC    
    
            r -- SELECT ("read")    
            w -- UPDATE ("write")    
            a -- INSERT ("append")    
            d -- DELETE    
            D -- TRUNCATE    
            x -- REFERENCES    
            t -- TRIGGER    
            X -- EXECUTE    
            U -- USAGE    
            C -- CREATE    
            c -- CONNECT    
            T -- TEMPORARY    
      arwdDxt -- ALL PRIVILEGES (for tables, varies for other objects)    
            * -- grant option for preceding privilege    
    
        /yyyy -- role that granted this privilege    

Example

The grantor is the postgres user; SELECT on table sbtest2 is granted to the digoal user.

postgres=# grant select on sbtest2 to digoal;    
GRANT    
postgres=# \dp+ sbtest2    
                                  Access privileges    
 Schema |  Name   | Type  |     Access privileges     | Column privileges | Policies     
--------+---------+-------+---------------------------+-------------------+----------    
 public | sbtest2 | table | postgres=arwdDxt/postgres+|                   |     
        |         |       | digoal=r/postgres         |                   |     
(1 row)    

Always revoke against the privileges actually present: if a privilege still shows up in this listing, revoke it exactly as listed.

For example:

revoke select on sbtest2 from digoal;      

2. Introduction to Indexes

What are indexes for?

Speeding up tuple lookup    
    
Enforcing primary keys and unique constraints    
    
Sorting    

Index types

B-Tree: supports sorting, range queries, and exact matches; suits all data types; a single index entry must not exceed 1/3 of an index page.

hash: supports equality queries; suited to very long fields.

gin: an inverted index; supports equality, containment, overlap, and similar queries; suited to multi-valued types (arrays, full-text search, etc.) and queries over arbitrary column combinations.

gist: a generalized R-Tree-style index; supports containment, overlap, distance, and point-in-polygon queries; suited to geometric types, range types, full-text search, heterogeneous types, and more.

sp-gist: space-partitioned (non-balanced) trees; supports containment, overlap, distance, and point-in-polygon queries; suited to geometric types, range types, full-text search, heterogeneous types, and more.

brin: a block-range index; suited to columns whose values correlate well with the physical storage order, such as time-series data, IoT sensor data, or feed data. Supports range and equality queries.

rum: an extension index access method; supports full-text search, full-text search with an attached scalar type, and positional (phrase-aware) full-text search.
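The access method is chosen with the USING clause of CREATE INDEX. A sketch with placeholder table and column names (some gin/gist operator classes require extensions such as pg_trgm, and rum ships as an external extension):

```sql
CREATE INDEX idx_t_btree ON t USING btree (created_at);  -- range, sort, equality
CREATE INDEX idx_t_hash  ON t USING hash  (long_text);   -- equality only
CREATE INDEX idx_t_gin   ON t USING gin   (tags);        -- tags: an array column
CREATE INDEX idx_t_gist  ON t USING gist  (geom);        -- geom: a geometric column
CREATE INDEX idx_t_brin  ON t USING brin  (ts);          -- ts: physically ordered timestamps
```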

Index merge (bitmap) scans

When several indexes apply to one query, PostgreSQL can merge them in a single scan using BitmapAnd or BitmapOr.
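For example, with separate indexes on c1 and c2, an OR predicate can be answered by merging the two bitmaps. The plan below is a sketch (placeholder names; the planner only chooses a bitmap merge when the statistics favor it):

```sql
EXPLAIN SELECT * FROM t WHERE c1 = 1 OR c2 = 2;
-- Bitmap Heap Scan on t
--   Recheck Cond: ((c1 = 1) OR (c2 = 2))
--   ->  BitmapOr
--         ->  Bitmap Index Scan on t_c1_idx
--         ->  Bitmap Index Scan on t_c2_idx
```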


Table bloat check

SELECT    
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,    
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,    
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,    
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,    
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,    
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,    
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,    
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,    
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,    
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,    
  CASE WHEN relpages < otta THEN    
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END    
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)    
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END    
  END AS totalwastedbytes    
FROM (    
  SELECT    
    nn.nspname AS schemaname,    
    cc.relname AS tablename,    
    COALESCE(cc.reltuples,0) AS reltuples,    
    COALESCE(cc.relpages,0) AS relpages,    
    COALESCE(bs,0) AS bs,    
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-    
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,    
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,    
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols    
  FROM    
     pg_class cc    
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$    
  LEFT JOIN    
  (    
    SELECT    
      ma,bs,foo.nspname,foo.relname,    
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,    
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2    
    FROM (    
      SELECT    
        ns.nspname, tbl.relname, hdr, ma, bs,    
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,    
        MAX(coalesce(null_frac,0)) AS maxfracsum,    
        hdr+(    
          SELECT 1+count(*)/8    
          FROM pg_stats s2    
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname    
        ) AS nullhdr    
      FROM pg_attribute att     
      JOIN pg_class tbl ON att.attrelid = tbl.oid    
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace     
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname    
      AND s.tablename = tbl.relname    
      AND s.inherited=false    
      AND s.attname=att.attname,    
      (    
        SELECT    
          (SELECT current_setting($$block_size$$)::numeric) AS bs,    
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)    
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,    
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma    
        FROM (SELECT version() AS v) AS foo    
      ) AS constants    
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$    
      GROUP BY 1,2,3,4,5    
    ) AS foo    
  ) AS rs    
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname    
  LEFT JOIN pg_index i ON indrelid = cc.oid    
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid    
) AS sml order by wastedbytes desc limit 5    

Index bloat check (the same query, ordered by wasted index bytes)

SELECT    
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,    
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,    
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,    
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,    
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,    
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,    
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,    
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,    
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,    
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,    
  CASE WHEN relpages < otta THEN    
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END    
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)    
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END    
  END AS totalwastedbytes    
FROM (    
  SELECT    
    nn.nspname AS schemaname,    
    cc.relname AS tablename,    
    COALESCE(cc.reltuples,0) AS reltuples,    
    COALESCE(cc.relpages,0) AS relpages,    
    COALESCE(bs,0) AS bs,    
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-    
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,    
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,    
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols    
  FROM    
     pg_class cc    
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$    
  LEFT JOIN    
  (    
    SELECT    
      ma,bs,foo.nspname,foo.relname,    
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,    
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2    
    FROM (    
      SELECT    
        ns.nspname, tbl.relname, hdr, ma, bs,    
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,    
        MAX(coalesce(null_frac,0)) AS maxfracsum,    
        hdr+(    
          SELECT 1+count(*)/8    
          FROM pg_stats s2    
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname    
        ) AS nullhdr    
      FROM pg_attribute att     
      JOIN pg_class tbl ON att.attrelid = tbl.oid    
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace     
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname    
      AND s.tablename = tbl.relname    
      AND s.inherited=false    
      AND s.attname=att.attname,    
      (    
        SELECT    
          (SELECT current_setting($$block_size$$)::numeric) AS bs,    
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)    
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,    
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma    
        FROM (SELECT version() AS v) AS foo    
      ) AS constants    
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$    
      GROUP BY 1,2,3,4,5    
    ) AS foo    
  ) AS rs    
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname    
  LEFT JOIN pg_index i ON indrelid = cc.oid    
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid    
) AS sml order by wastedibytes desc limit 5    

Index maintenance

1. Create the new index

set maintenance_work_mem = '4GB';  -- larger values build indexes faster    
set lock_timeout = '1s';  -- avoid long lock waits    
    
create index [CONCURRENTLY] xxx on xxx using xxx (colxxx);  -- decide case by case whether to use CONCURRENTLY (does not block reads or writes)    

2. Drop the old index

set lock_timeout = '1s';  -- avoid long lock waits    
    
drop index oldindex;    

3. Primary keys and unique constraints are maintained the same way, with one extra step:

ALTER TABLE tbl ADD    
    [ CONSTRAINT constraint_name ]    
    { UNIQUE | PRIMARY KEY } USING INDEX index_name    
    [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]    
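Putting the steps together for a primary key: build a replacement index concurrently, then swap it in. A sketch with placeholder names (ADD CONSTRAINT ... USING INDEX renames the index to the constraint name):

```sql
SET maintenance_work_mem = '4GB';
SET lock_timeout = '1s';

CREATE UNIQUE INDEX CONCURRENTLY tbl_pkey_new ON tbl (id);  -- no read/write blocking

BEGIN;
ALTER TABLE tbl DROP CONSTRAINT tbl_pkey;                   -- drops the old PK and its index
ALTER TABLE tbl ADD CONSTRAINT tbl_pkey
    PRIMARY KEY USING INDEX tbl_pkey_new;
COMMIT;
```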

3. System Configuration

1. Storage and filesystem planning

Index disk: holds index tablespaces.

Estimate capacity, IOPS, and bandwidth. SSD recommended; use ext4 or xfs.

Data disk: holds data-file tablespaces.

Estimate capacity, IOPS, and bandwidth. SSD recommended; use ext4 or xfs.

WAL disk: holds the WAL directory.

Estimate capacity, IOPS, and bandwidth. SSD recommended; use ext4 or xfs.

Archive disk: holds the WAL archive directory.

Estimate capacity, IOPS, and bandwidth. Mechanical disks are fine; use ext4, xfs, or ZFS, optionally with filesystem compression enabled.

Backup disk: holds database backups.

Estimate capacity, IOPS, and bandwidth. Mechanical disks are fine; use ext4, xfs, or ZFS, optionally with filesystem compression enabled.

Log disk (pg_log): holds database audit and error logs.

Estimate capacity, IOPS, and bandwidth. Mechanical disks are fine; use ext4, xfs, or ZFS, optionally with filesystem compression enabled.

2. Network planning

Estimate bandwidth; plan subnets and firewalls.

Do not expose the database on the public Internet.

3. CPU sizing

Estimate CPU frequency and core count from the business workload; measuring with a realistic benchmark is recommended.

4. Kernel configuration

/etc/sysctl.conf

# add by digoal.zhou    
fs.aio-max-nr = 1048576    
fs.file-max = 76724600    
kernel.core_pattern= /data01/corefiles/core_%e_%u_%t_%s.%p             
# create /data01/corefiles in advance with mode 777; if it is a symlink, set the target directory to 777    
kernel.sem = 4096 2147483647 2147483646 512000        
# semaphores; check with ipcs -l or -u. Each group of 16 processes needs 17 semaphores.    
kernel.shmall = 107374182          
# upper limit on the combined size of all shared memory segments (suggestion: 80% of RAM)    
kernel.shmmax = 274877906944       
# maximum size of a single shared memory segment (suggestion: half of RAM); versions > 9.2 use far less shared memory    
kernel.shmmni = 819200             
# how many shared memory segments may exist in total; each PostgreSQL instance needs at least 2    
net.core.netdev_max_backlog = 10000    
net.core.rmem_default = 262144           
# The default setting of the socket receive buffer in bytes.    
net.core.rmem_max = 4194304              
# The maximum receive socket buffer size in bytes    
net.core.wmem_default = 262144           
# The default setting (in bytes) of the socket send buffer.    
net.core.wmem_max = 4194304              
# The maximum send socket buffer size in bytes.    
net.core.somaxconn = 4096    
net.ipv4.tcp_max_syn_backlog = 4096    
net.ipv4.tcp_keepalive_intvl = 20    
net.ipv4.tcp_keepalive_probes = 3    
net.ipv4.tcp_keepalive_time = 60    
net.ipv4.tcp_mem = 8388608 12582912 16777216    
net.ipv4.tcp_fin_timeout = 5    
net.ipv4.tcp_synack_retries = 2    
net.ipv4.tcp_syncookies = 1        
# enable SYN cookies: when the SYN backlog overflows, fall back to cookies; guards against mild SYN flood attacks    
net.ipv4.tcp_timestamps = 1        
# reduces TIME_WAIT buildup    
net.ipv4.tcp_tw_recycle = 0        
# 1 enables fast recycling of TIME-WAIT sockets, but connections may fail behind NAT; keep it off on servers    
net.ipv4.tcp_tw_reuse = 1          
# allow TIME-WAIT sockets to be reused for new TCP connections    
net.ipv4.tcp_max_tw_buckets = 262144    
net.ipv4.tcp_rmem = 8192 87380 16777216    
net.ipv4.tcp_wmem = 8192 65536 16777216    
net.nf_conntrack_max = 1200000    
net.netfilter.nf_conntrack_max = 1200000    
vm.dirty_background_bytes = 409600000           
# once system-wide dirty pages reach this size, the background flusher (pdflush or similar) starts writing out pages dirtied more than (dirty_expire_centisecs/100) seconds ago    
vm.dirty_expire_centisecs = 3000                 
# dirty pages older than this are flushed to disk; 3000 means 30 seconds    
vm.dirty_ratio = 95                              
# if background flushing is too slow and dirty pages exceed 95% of RAM, user processes that write to disk (fsync, fdatasync, ... calls) must flush dirty pages themselves    
# this effectively prevents user processes from flushing; very useful with multiple instances per host when CGroups cap per-instance IOPS    
vm.dirty_writeback_centisecs = 100                
# wakeup interval of the background flusher; 100 means 1 second    
vm.mmap_min_addr = 65536    
vm.overcommit_memory = 0         
# allow modest overcommit on allocation; setting 1 pretends memory is always sufficient, which can suit low-memory test machines    
vm.overcommit_ratio = 90         
# used to compute the allocatable memory limit when overcommit_memory = 2    
vm.swappiness = 0                
# avoid swapping    
vm.zone_reclaim_mode = 0         
# disable NUMA zone reclaim (or disable NUMA in the kernel boot options)    
net.ipv4.ip_local_port_range = 40000 65535        
# range of TCP/UDP ports assigned automatically to local connections    
fs.nr_open=20480000    
# per-process upper limit on open file handles    
    
# take care with the parameters below    
# vm.extra_free_kbytes = 4096000    
# vm.min_free_kbytes = 2097152    
# on machines with little memory, do not set the two values above    
# vm.nr_hugepages = 66536        
# when shared_buffers exceeds 64GB, huge pages are recommended; for the page size see Hugepagesize in /proc/meminfo    
# vm.lowmem_reserve_ratio = 1 1 1    
# recommended when RAM is larger than 64GB; otherwise keep the default 256 256 32    

sysctl -p

5. Resource limits

/etc/security/limits.conf

* soft    nofile  1024000    
* hard    nofile  1024000    
* soft    nproc   unlimited    
* hard    nproc   unlimited    
* soft    core    unlimited    
* hard    core    unlimited    
* soft    memlock unlimited    
* hard    memlock unlimited    

6. Firewall configuration

Example

# allow the private network ranges    
-A INPUT -s 192.168.0.0/16 -j ACCEPT    
-A INPUT -s 10.0.0.0/8 -j ACCEPT    
-A INPUT -s 172.16.0.0/16 -j ACCEPT    

4. Database Initialization

1. initdb introduction

initdb initializes a PostgreSQL database cluster.    
    
Usage:    
  initdb [OPTION]... [DATADIR]    
    
Options:    
  -A, --auth=METHOD         default authentication method for local connections    
      --auth-host=METHOD    default authentication method for local TCP/IP connections    
      --auth-local=METHOD   default authentication method for local-socket connections    
 [-D, --pgdata=]DATADIR     location for this database cluster    
  -E, --encoding=ENCODING   set default encoding for new databases    
      --locale=LOCALE       set default locale for new databases    
      --lc-collate=, --lc-ctype=, --lc-messages=LOCALE    
      --lc-monetary=, --lc-numeric=, --lc-time=LOCALE    
                            set default locale in the respective category for    
                            new databases (default taken from environment)    
      --no-locale           equivalent to --locale=C    
      --pwfile=FILE         read password for the new superuser from file    
  -T, --text-search-config=CFG    
                            default text search configuration    
  -U, --username=NAME       database superuser name    
  -W, --pwprompt            prompt for a password for the new superuser    
  -X, --xlogdir=XLOGDIR     location for the transaction log directory    
    
Less commonly used options:    
  -d, --debug               generate lots of debugging output    
  -k, --data-checksums      use data page checksums    
  -L DIRECTORY              where to find the input files    
  -n, --noclean             do not clean up after errors    
  -N, --nosync              do not wait for changes to be written safely to disk    
  -s, --show                show internal settings    
  -S, --sync-only           only sync data directory    
    
Other options:    
  -V, --version             output version information, then exit    
  -?, --help                show this help, then exit    
    
If the data directory is not specified, the environment variable PGDATA    
is used.    

Example

initdb -D $PGDATA -E UTF8 --locale=C -U postgres -X $PG_XLOG -W    

2. postgresql.conf parameter configuration

Example

listen_addresses = '0.0.0.0'      
port = 1921      
max_connections = 200      
tcp_keepalives_idle = 60      
tcp_keepalives_interval = 10      
tcp_keepalives_count = 6      
shared_buffers = 512MB      
maintenance_work_mem = 64MB      
dynamic_shared_memory_type = posix      
vacuum_cost_delay = 0      
bgwriter_delay = 10ms      
bgwriter_lru_maxpages = 1000      
bgwriter_lru_multiplier = 5.0      
bgwriter_flush_after = 0      
old_snapshot_threshold = -1      
wal_level = minimal      
synchronous_commit = off      
full_page_writes = on      
wal_buffers = 64MB      
wal_writer_delay = 10ms      
wal_writer_flush_after = 4MB      
checkpoint_timeout = 35min      
max_wal_size = 2GB      
min_wal_size = 80MB      
checkpoint_completion_target = 0.1      
checkpoint_flush_after = 0      
random_page_cost = 1.5      
log_destination = 'csvlog'      
logging_collector = on      
log_directory = 'pg_log'      
log_truncate_on_rotation = on      
log_checkpoints = on      
log_connections = on      
log_disconnections = on      
log_error_verbosity = verbose      
log_temp_files = 8192      
log_timezone = 'Asia/Hong_Kong'      
autovacuum = on      
log_autovacuum_min_duration = 0      
autovacuum_naptime = 20s      
autovacuum_vacuum_scale_factor = 0.05      
autovacuum_freeze_max_age = 1500000000      
autovacuum_multixact_freeze_max_age = 1600000000      
autovacuum_vacuum_cost_delay = 0      
vacuum_freeze_table_age = 1400000000      
vacuum_multixact_freeze_table_age = 1500000000      
datestyle = 'iso, mdy'      
timezone = 'Asia/Hong_Kong'      
lc_messages = 'C'      
lc_monetary = 'C'      
lc_numeric = 'C'      
lc_time = 'C'      
default_text_search_config = 'pg_catalog.english'      

3. pg_hba.conf database firewall configuration

Example

host all all 0.0.0.0/0 md5      

5. Controlling PostgreSQL

1. Environment variable configuration

/home/digoal/.bash_profile

export PGPORT=1921      
export PGDATA=/home/digoal/pgdata      
export LANG=en_US.utf8      
export PGHOME=/home/digoal/pgsql9.6      
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH      
export PATH=$PGHOME/bin:$PATH:.      
export MANPATH=$PGHOME/share/man:$MANPATH      
export PGHOST=$PGDATA      
export PGUSER=postgres      
export PGDATABASE=postgres    

2. Stopping the database

pg_ctl stop -m fast -D $PGDATA    

3. Restarting

pg_ctl restart -m fast -D $PGDATA    

4. Starting

pg_ctl start -D $PGDATA    

5. Linux auto-start service

vi /etc/rc.local    
    
su - digoal -c "pg_ctl start -D $PGDATA"    

6. Database Backup and Restore

1. Logical backup

pg_dump --help    
    
pg_dump dumps a database as a text file or to other formats.    
    
Usage:    
  pg_dump [OPTION]... [DBNAME]    
    
General options:    
  -f, --file=FILENAME          output file or directory name    
  -F, --format=c|d|t|p         output file format (custom, directory, tar,    
                               plain text (default))    
  -j, --jobs=NUM               use this many parallel jobs to dump    
  -v, --verbose                verbose mode    
  -V, --version                output version information, then exit    
  -Z, --compress=0-9           compression level for compressed formats    
  --lock-wait-timeout=TIMEOUT  fail after waiting TIMEOUT for a table lock    
  -?, --help                   show this help, then exit    
    
Options controlling the output content:    
  -a, --data-only              dump only the data, not the schema    
  -b, --blobs                  include large objects in dump    
  -c, --clean                  clean (drop) database objects before recreating    
  -C, --create                 include commands to create database in dump    
  -E, --encoding=ENCODING      dump the data in encoding ENCODING    
  -n, --schema=SCHEMA          dump the named schema(s) only    
  -N, --exclude-schema=SCHEMA  do NOT dump the named schema(s)    
  -o, --oids                   include OIDs in dump    
  -O, --no-owner               skip restoration of object ownership in    
                               plain-text format    
  -s, --schema-only            dump only the schema, no data    
  -S, --superuser=NAME         superuser user name to use in plain-text format    
  -t, --table=TABLE            dump the named table(s) only    
  -T, --exclude-table=TABLE    do NOT dump the named table(s)    
  -x, --no-privileges          do not dump privileges (grant/revoke)    
  --binary-upgrade             for use by upgrade utilities only    
  --column-inserts             dump data as INSERT commands with column names    
  --disable-dollar-quoting     disable dollar quoting, use SQL standard quoting    
  --disable-triggers           disable triggers during data-only restore    
  --enable-row-security        enable row security (dump only content user has    
                               access to)    
  --exclude-table-data=TABLE   do NOT dump data for the named table(s)    
  --if-exists                  use IF EXISTS when dropping objects    
  --inserts                    dump data as INSERT commands, rather than COPY    
  --no-security-labels         do not dump security label assignments    
  --no-synchronized-snapshots  do not use synchronized snapshots in parallel jobs    
  --no-tablespaces             do not dump tablespace assignments    
  --no-unlogged-table-data     do not dump unlogged table data    
  --quote-all-identifiers      quote all identifiers, even if not key words    
  --section=SECTION            dump named section (pre-data, data, or post-data)    
  --serializable-deferrable    wait until the dump can run without anomalies    
  --snapshot=SNAPSHOT          use given snapshot for the dump    
  --strict-names               require table and/or schema include patterns to    
                               match at least one entity each    
  --use-set-session-authorization    
                               use SET SESSION AUTHORIZATION commands instead of    
                               ALTER OWNER commands to set ownership    
    
Connection options:    
  -d, --dbname=DBNAME      database to dump    
  -h, --host=HOSTNAME      database server host or socket directory    
  -p, --port=PORT          database server port number    
  -U, --username=NAME      connect as specified database user    
  -w, --no-password        never prompt for password    
  -W, --password           force password prompt (should happen automatically)    
  --role=ROLENAME          do SET ROLE before dump    
    
If no database name is supplied, then the PGDATABASE environment    
variable value is used.    

See also: 《PostgreSQL 最佳實踐 – 線上邏輯備份與恢復介紹》 (PostgreSQL best practices: introduction to online logical backup and restore).

Example

Back up the digoal database. The generated DDL contains no tablespace clauses, so the corresponding tablespaces need not be created before restoring.

pg_dump -f ./digoal.dmp -F p -C -E UTF8 --no-tablespaces -h 127.0.0.1 -p 1999 -U postgres digoal        

2. Logical restore

pg_restore --help    
    
pg_restore restores a PostgreSQL database from an archive created by pg_dump.    
    
Usage:    
  pg_restore [OPTION]... [FILE]    
    
General options:    
  -d, --dbname=NAME        connect to database name    
  -f, --file=FILENAME      output file name    
  -F, --format=c|d|t       backup file format (should be automatic)    
  -l, --list               print summarized TOC of the archive    
  -v, --verbose            verbose mode    
  -V, --version            output version information, then exit    
  -?, --help               show this help, then exit    
    
Options controlling the restore:    
  -a, --data-only              restore only the data, no schema    
  -c, --clean                  clean (drop) database objects before recreating    
  -C, --create                 create the target database    
  -e, --exit-on-error          exit on error, default is to continue    
  -I, --index=NAME             restore named index    
  -j, --jobs=NUM               use this many parallel jobs to restore    
  -L, --use-list=FILENAME      use table of contents from this file for    
                               selecting/ordering output    
  -n, --schema=NAME            restore only objects in this schema    
  -O, --no-owner               skip restoration of object ownership    
  -P, --function=NAME(args)    restore named function    
  -s, --schema-only            restore only the schema, no data    
  -S, --superuser=NAME         superuser user name to use for disabling triggers    
  -t, --table=NAME             restore named relation (table, view, etc.)    
  -T, --trigger=NAME           restore named trigger    
  -x, --no-privileges          skip restoration of access privileges (grant/revoke)    
  -1, --single-transaction     restore as a single transaction    
  --disable-triggers           disable triggers during data-only restore    
  --enable-row-security        enable row security    
  --if-exists                  use IF EXISTS when dropping objects    
  --no-data-for-failed-tables  do not restore data of tables that could not be    
                               created    
  --no-security-labels         do not restore security labels    
  --no-tablespaces             do not restore tablespace assignments    
  --section=SECTION            restore named section (pre-data, data, or post-data)    
  --strict-names               require table and/or schema include patterns to    
                               match at least one entity each    
  --use-set-session-authorization    
                               use SET SESSION AUTHORIZATION commands instead of    
                               ALTER OWNER commands to set ownership    
    
Connection options:    
  -h, --host=HOSTNAME      database server host or socket directory    
  -p, --port=PORT          database server port number    
  -U, --username=NAME      connect as specified database user    
  -w, --no-password        never prompt for password    
  -W, --password           force password prompt (should happen automatically)    
  --role=ROLENAME          do SET ROLE before restore    
    
The options -I, -n, -P, -t, -T, and --section can be combined and specified    
multiple times to select multiple objects.    
    
If no input file name is supplied, then standard input is used.    

If the backup was taken in -F p (plain) format, the output is plain SQL text; just execute it:

psql postgres postgres -f ./digoal.dmp      

3. Configure archiving and streaming replication

To support physical backups, archiving must be enabled; enabling streaming replication is also recommended.

Configure the archive directory

# mkdir -p /disk1/digoal/arch    
# chown digoal:digoal /disk1/digoal/arch    

Configure streaming replication and archiving

vi postgresql.conf

wal_level = replica            # required for archiving and streaming replication    
    
archive_mode = on    
    
max_worker_processes = 16      
    
max_wal_senders = 8    
    
archive_command = 'test ! -f /disk1/digoal/arch/%f && cp %p /disk1/digoal/arch/%f'    
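The `test ! -f … && cp …` pattern copies each WAL segment at most once and exits non-zero if the file already exists, so PostgreSQL retries rather than overwrites. A minimal sketch of that semantics using throwaway directories (no server required; all paths are stand-ins for `%p` and the archive directory):

```shell
# Demonstrate the archive_command semantics on dummy files.
# $tmp/pg_xlog stands in for %p's directory, $tmp/arch for the archive.
tmp=$(mktemp -d)
mkdir -p "$tmp/pg_xlog" "$tmp/arch"
echo "wal-data" > "$tmp/pg_xlog/000000010000000000000001"

# Same shape as: test ! -f /disk1/digoal/arch/%f && cp %p /disk1/digoal/arch/%f
archive() {
  test ! -f "$tmp/arch/$1" && cp "$tmp/pg_xlog/$1" "$tmp/arch/$1"
}

first=failed
archive 000000010000000000000001 && first=ok         # first copy succeeds
second=overwrote
archive 000000010000000000000001 || second=refused   # second attempt fails, file is kept
```

A zero exit code tells PostgreSQL the segment is safely archived and may be recycled; a non-zero code makes it keep the segment and retry later.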

Restart the database

pg_ctl restart -m fast -D $PGDATA    

4. Create a streaming replication user

postgres=# create role rep replication login encrypted password 'rep123';    
CREATE ROLE    

5. Configure pg_hba.conf for streaming replication

vi $PGDATA/pg_hba.conf

host replication rep 0.0.0.0/0 md5    

pg_ctl reload -D $PGDATA

6. Remote physical base backup

pg_basebackup --help    
pg_basebackup takes a base backup of a running PostgreSQL server.    
    
Usage:    
  pg_basebackup [OPTION]...    
    
Options controlling the output:    
  -D, --pgdata=DIRECTORY receive base backup into directory    
  -F, --format=p|t       output format (plain (default), tar)    
  -r, --max-rate=RATE    maximum transfer rate to transfer data directory    
                         (in kB/s, or use suffix "k" or "M")    
  -R, --write-recovery-conf    
                         write recovery.conf after backup    
  -S, --slot=SLOTNAME    replication slot to use    
  -T, --tablespace-mapping=OLDDIR=NEWDIR    
                         relocate tablespace in OLDDIR to NEWDIR    
  -x, --xlog             include required WAL files in backup (fetch mode)    
  -X, --xlog-method=fetch|stream    
                         include required WAL files with specified method    
      --xlogdir=XLOGDIR  location for the transaction log directory    
  -z, --gzip             compress tar output    
  -Z, --compress=0-9     compress tar output with given compression level    
    
General options:    
  -c, --checkpoint=fast|spread    
                         set fast or spread checkpointing    
  -l, --label=LABEL      set backup label    
  -P, --progress         show progress information    
  -v, --verbose          output verbose messages    
  -V, --version          output version information, then exit    
  -?, --help             show this help, then exit    
    
Connection options:    
  -d, --dbname=CONNSTR   connection string    
  -h, --host=HOSTNAME    database server host or socket directory    
  -p, --port=PORT        database server port number    
  -s, --status-interval=INTERVAL    
                         time between status packets sent to server (in seconds)    
  -U, --username=NAME    connect as specified database user    
  -w, --no-password      never prompt for password    
  -W, --password         force password prompt (should happen automatically)    

Create the backup directory. Using the streaming replication protocol, the backup can be taken to a remote host.

# mkdir -p /disk1/digoal/backup    
# chown digoal:digoal /disk1/digoal/backup    

Example 1: back up over the streaming replication protocol, as the replication user

export PGPASSWORD=rep123    
    
pg_basebackup -D /disk1/digoal/backup -F t -z -h <database IP> -p <database port> -U rep     
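Before relying on a tar.gz base backup, it is worth checking that the archive is readable. A minimal sketch with a throwaway archive (the dummy directory and file names are stand-ins for the real backup under /disk1/digoal/backup):

```shell
# Build a dummy gzip'd tar and verify its integrity, as you would
# for the base backup produced by pg_basebackup -F t -z.
tmp=$(mktemp -d)
mkdir -p "$tmp/data"
echo "hello" > "$tmp/data/file"
tar -czf "$tmp/base.tar.gz" -C "$tmp" data

# gzip -t checks the compressed stream; tar -tzf lists the members.
gzip -t "$tmp/base.tar.gz" && verify=ok || verify=corrupt
members=$(tar -tzf "$tmp/base.tar.gz" | wc -l)
```

This only proves the archive is structurally intact; a full restore drill (section 8, PITR) is the real usability test.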

Example 2: back up with cp

The steps are:

select pg_start_backup('test');    
    
-- copy the data directory, tablespaces, and any other required files    
    
select pg_stop_backup();    

7. Physical incremental backup

There are two kinds of physical incremental backup.

One is block-level incremental backup of the data files, which requires 10.0 or a tool such as pg_rman or pg_probackup:

https://github.com/postgrespro/pg_probackup

https://github.com/ossc-db/pg_rman

The other is archive-based incremental backup:

simply copy the archived WAL files to backup storage.
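The archive-copy step can be sketched as a loop that only copies segments not yet present in backup storage. Throwaway directories stand in for /disk1/digoal/arch and the backup location:

```shell
# Incremental copy of archived WAL: skip files already in backup storage.
tmp=$(mktemp -d)
mkdir -p "$tmp/arch" "$tmp/backup"
touch "$tmp/arch/000000010000000000000001" "$tmp/arch/000000010000000000000002"
cp "$tmp/arch/000000010000000000000001" "$tmp/backup/"   # already backed up earlier

copied=0
for f in "$tmp"/arch/*; do
  dest="$tmp/backup/$(basename "$f")"
  if [ ! -f "$dest" ]; then       # copy only segments missing from backup storage
    cp "$f" "$dest"
    copied=$((copied+1))
  fi
done
total=$(ls "$tmp/backup" | wc -l)
```

In practice rsync achieves the same effect; also remember to prune archives older than your oldest base backup, or they accumulate forever.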

8. PITR (point-in-time recovery)

1. If recovering on a different host, first deploy the PostgreSQL software. The deployed version should match the one that produced the backed-up data files, with identical compile options.

How do you check the compile options? Run this on the source instance:

pg_config    

Also deploy every PostgreSQL extension used by the source instance, making sure the versions match.

2. Prepare the recovery directory; it must have enough free space.

3. Unpack the data files and the archives into their respective directories.

If there are tablespaces, symlink the tablespace directories into $PGDATA/pg_tblspc, or unpack them directly into $PGDATA/pg_tblspc.

4. Edit postgresql.conf

Adjust parameters such as the listen port, shared_buffers, and shared_preload_libraries, to avoid startup failures caused by port conflicts.

5. Create $PGDATA/recovery.conf and set the recovery target

The key setting is restore_command. (If you are building a streaming replica instead, configure primary_conninfo, e.g. 'host=localhost port=5432'.)

#restore_command = ''           # e.g. 'cp /mnt/server/archivedir/%f %p'    

Set the target to recover to: a point in time, an XID, or a target name. (If you are building a streaming replica, set recovery_target_timeline = 'latest'.)

#recovery_target_name = ''      # e.g. 'daily backup 2011-01-26'    
#    
#recovery_target_time = ''      # e.g. '2004-07-14 22:39:00 EST'    
#    
#recovery_target_xid = ''    
#    
#recovery_target_inclusive = true    

6. Start the recovered instance

pg_ctl start -D <recovered instance's $PGDATA>    
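Step 3's tablespace re-linking can be sketched as follows; the tablespace OID 16385 and all paths are hypothetical stand-ins:

```shell
# Link a restored tablespace directory back into $PGDATA/pg_tblspc.
tmp=$(mktemp -d)
mkdir -p "$tmp/pg_root/pg_tblspc" "$tmp/tbs1"   # stand-ins for $PGDATA and the tablespace dir
ln -s "$tmp/tbs1" "$tmp/pg_root/pg_tblspc/16385"

# At startup PostgreSQL resolves pg_tblspc/<oid> through the symlink.
target=$(readlink "$tmp/pg_root/pg_tblspc/16385")
```

The symlink target must be owned by the database OS user and have mode 700, like the rest of the data directory.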

PITR example

Data change operations (on the primary)

postgres=# select pg_create_restore_point('digoal');      
 pg_create_restore_point       
-------------------------      
 1D6/FB17EC08      
(1 row)      

First, create a table that records restore points (the XID, the time, and a description); it stands in for the pg_create_restore_point() system function.

postgres=> create table restore_point(id serial primary key, xid int8, crt_time timestamp default now(), point text);      
CREATE TABLE      

Create a function that mimics pg_create_restore_point() by inserting a restore point row:

postgres=> create or replace function create_restore_point(i_point text) returns void as $$      
declare       
begin      
  insert into restore_point(xid,point) values (txid_current(),i_point);      
end;      
$$ language plpgsql strict;      
CREATE FUNCTION      

Insert a restore point:

postgres=> select create_restore_point('digoal');      
 create_restore_point       
----------------------      
       
(1 row)      

Query the table:

postgres=> select * from restore_point;      
 id |  xid   |          crt_time          | point        
----+--------+----------------------------+--------      
  1 | 561426 | 2015-06-19 09:18:57.525475 | digoal      
(1 row)      
postgres=> select * from restore_point where point='digoal';      
 id |  xid   |          crt_time          | point        
----+--------+----------------------------+--------      
  1 | 561426 | 2015-06-19 09:18:57.525475 | digoal      
(1 row)      

Next, simulate something to recover from:

postgres=> create table test(id int,info text);      
CREATE TABLE      
postgres=> insert into test select generate_series(1,1000),md5(random()::text);      
INSERT 0 1000      

Record the current hash of the table contents, for comparison after recovery:

postgres=> select sum(hashtext(t.*::text)) from test t;      
     sum            
--------------      
 -69739904784      
(1 row)      

Next I will delete the data; before deleting, I create a restore point:

postgres=> select create_restore_point('before delete test');      
 create_restore_point       
----------------------      
       
(1 row)      
postgres=> delete from test;      
DELETE 1000      
postgres=> select * from restore_point where point='before delete test';      
 id |  xid   |          crt_time          |       point              
----+--------+----------------------------+--------------------      
  2 | 561574 | 2015-06-19 09:45:28.030295 | before delete test      
(1 row)      

I only need to recover to XID 561574. Next, simulate the recovery.

Generate some XLOG to trigger archiving:

postgres=> select pg_xlogfile_name(pg_current_xlog_location());      
     pg_xlogfile_name           
--------------------------      
 000000010000000200000041      
(1 row)      
postgres=> insert into test select generate_series(1,100000);      
INSERT 0 100000      
postgres=> insert into test select generate_series(1,100000);      
INSERT 0 100000      
postgres=> select pg_xlogfile_name(pg_current_xlog_location());      
     pg_xlogfile_name           
--------------------------      
 000000010000000200000042      
(1 row)      

Download the Alibaba Cloud RDS base backup and WAL archives to the local host.

Locally, install a PostgreSQL build whose compile-time options match the RDS instance (for example the block size). Ideally the installed modules should match too; none are used here, so it does not matter.

Look at pg_settings for the RDS preset options and version, to guide the local build:

postgres=> select name,setting,unit from pg_settings where category='Preset Options';      
         name          | setting | unit       
-----------------------+---------+------      
 block_size            | 8192    |        
 data_checksums        | on      |       
 integer_datetimes     | on      |       
 max_function_args     | 100     |       
 max_identifier_length | 63      |       
 max_index_keys        | 32      |       
 segment_size          | 131072  | 8kB      
 server_version        | 9.4.1   |       
 server_version_num    | 90401   |       
 wal_block_size        | 8192    |       
 wal_segment_size      | 2048    | 8kB      
(11 rows)      
postgres=> select version();      
                                                   version                                                          
--------------------------------------------------------------------------------------------------------------      
 PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit      
(1 row)      

Compile and install PostgreSQL 9.4.1 locally with the same options as RDS. RDS uses the defaults for all of these:

  --with-blocksize=BLOCKSIZE      
                          set table block size in kB [8]      
  --with-segsize=SEGSIZE  set table segment size in GB [1]      
  --with-wal-blocksize=BLOCKSIZE      
                          set WAL block size in kB [8]      
  --with-wal-segsize=SEGSIZE      
                          set WAL segment size in MB [16]      
      
      
# useradd digoal      
# su - digoal      
$ vi .bash_profile      
# add by digoal      
export PS1="$USER@`/bin/hostname -s`-> "      
export PGPORT=1931      
export PGDATA=/home/digoal/pg_root      
export LANG=en_US.utf8      
export PGHOME=/home/digoal/pgsql9.4.1      
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH      
export DATE=`date +"%Y%m%d%H%M"`      
export PATH=$PGHOME/bin:$PATH:.      
export MANPATH=$PGHOME/share/man:$MANPATH      
export PGHOST=$PGDATA      
export PGDATABASE=postgres      
alias rm='rm -i'      
alias ll='ls -lh'      
unalias vi      
      
# wget https://ftp.postgresql.org/pub/source/v9.4.1/postgresql-9.4.1.tar.bz2      
# tar -jxvf postgresql-9.4.1.tar.bz2      
# cd postgresql-9.4.1      
# ./configure --prefix=/home/digoal/pgsql9.4.1 --with-pgport=1931 --with-perl --with-python --with-tcl --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --enable-thread-safety --enable-debug      
# gmake world && gmake install-world      

Download the backup and archive files, then unpack them.

Choose the base backup taken just before the desired recovery point, plus all archive files produced after it.

total 453M      
-rw-r--r-- 1 root   root    17M Jun 19 10:23 000000010000000200000040.tar.gz      
-rw-r--r-- 1 root   root    17M Jun 19 10:23 000000010000000200000041.tar.gz      
-rw-r--r-- 1 root   root   404M Jun 19 10:23 hins668881_xtra_20150618232331.tar.gz      
      
# mkdir /home/digoal/pg_root      
# mv hins668881_xtra_20150618232331.tar.gz /home/digoal/pg_root      
# tar -zxvf hins668881_xtra_20150618232331.tar.gz      
[root@db-172-16-3-150 ~]# tar -xvf 000000010000000200000040.tar.gz       
000000010000000200000040      
[root@db-172-16-3-150 ~]# tar -xvf 000000010000000200000041.tar.gz       
000000010000000200000041      
[root@db-172-16-3-150 ~]# mv 000000010000000200000040 /home/digoal/      
[root@db-172-16-3-150 ~]# mv 000000010000000200000041 /home/digoal/      
[root@db-172-16-3-150 ~]# chown -R digoal:digoal /home/digoal      
[root@db-172-16-3-150 ~]# chmod -R 700 /home/digoal/pg_root      

Edit the configuration file, commenting out the RDS-specific parameters.

$ vi postgresql.conf      
    
port=1931      
      
# comment out the RDS-specific GUC parameters      
#rds_enable_proxy=on      
#rds_available_extensions='plpgsql,pg_stat_statements,btree_gin,btree_gist,chkpass,citext,cube,dblink,dict_int,earthdistance,hstore,intagg,intarray,isn,ltree,pgcrypto,pgrowlocks,pg_prewarm,pg_trgm,postgres_fdw,sslinfo,tablefunc,tsearch2,unaccent,postgis,postgis_topology,fuzzystrmatch,postgis_tiger_geocoder,plperl,pltcl,plv8,plls,plcoffee,"uuid-ossp"'      
#rds_enable_admin_user_as_super=on      

Configure recovery.conf

Enable standby mode and pause when the recovery target is reached; once you have confirmed the target, resume replay.

$ vi recovery.conf      
standby_mode = 'on'      
restore_command = 'cp /home/digoal/%f %p'                 
recovery_target_xid = '561574'  # the XID recorded in our restore_point table      
recovery_target_inclusive = true      
pause_at_recovery_target = true      

Start the database

$ pg_ctl start      

Check whether recovery stopped at the target XID

digoal@db-172-16-3-150-> psql -h 127.0.0.1 -p 1931      
psql (9.4.1)      
Type "help" for help.      
      
postgres=> \dt      
             List of relations      
 Schema |      Name       | Type  | Owner        
--------+-----------------+-------+--------      
 public | ha_health_check | table | aurora      
 public | login_log       | table | digoal      
 public | restore_point   | table | digoal      
 public | session         | table | digoal      
 public | tbl_small       | table | digoal      
 public | test            | table | digoal      
 public | userinfo        | table | digoal      
(7 rows)      

Check: the data is back to the state before the DELETE on the test table.

postgres=> select count(*) from test;      
 count       
-------      
  1000      
(1 row)      
postgres=> select sum(hashtext(t.*::text)) from test t;      
     sum            
--------------      
 -69739904784      
(1 row)      

7. Database performance

1. Parameter tuning

Assume 512 GB of RAM and SSD storage.

listen_addresses = '0.0.0.0'    
port = 1921    
max_connections = 5000    
unix_socket_directories = '.'  # make sure the unix socket lives in a secure directory    
tcp_keepalives_idle = 60       # avoid idle connections being dropped by network devices    
tcp_keepalives_interval = 10   # avoid idle connections being dropped by network devices    
tcp_keepalives_count = 10      # avoid idle connections being dropped by network devices    
shared_buffers = 128GB         # 1/4 of physical RAM    
maintenance_work_mem = 4GB     # speeds up index creation and VACUUM; peak usage is maintenance_work_mem * autovacuum_max_workers    
dynamic_shared_memory_type = posix    
vacuum_cost_delay = 0          # on SSD, set to 0 so vacuum is never throttled    
bgwriter_delay = 10ms          # flush shared buffer dirty pages quickly    
bgwriter_lru_maxpages = 1000       
bgwriter_lru_multiplier = 10.0    
bgwriter_flush_after = 0       # with good IO, smooth write-back is unnecessary; otherwise set > 0    
max_worker_processes = 128    
max_parallel_workers_per_gather = 0    
old_snapshot_threshold = -1    
backend_flush_after = 0        # with good IO, smooth write-back is unnecessary    
wal_level = replica    
synchronous_commit = off       # asynchronous commit    
full_page_writes = on          # can be off on COW filesystems, or on SSDs with aligned atomic BLOCK_SIZE writes, to improve performance and reduce WAL volume    
wal_buffers = 1GB    
wal_writer_delay = 10ms    
wal_writer_flush_after = 0     # with good IO, smooth write-back is unnecessary    
checkpoint_timeout = 30min     # avoid frequent checkpoints, which cause many full-page writes in the WAL    
max_wal_size = 256GB           # recommended: 2x shared_buffers    
min_wal_size = 32GB    
checkpoint_completion_target = 0.05    # with good disks, finish checkpoints quickly; recovery then reaches consistency quickly    
checkpoint_flush_after = 0             # with good IO, smooth write-back is unnecessary    
archive_mode = on    
archive_command = '/bin/date'          # change later, e.g. 'test ! -f /disk1/digoal/arch/%f && cp %p /disk1/digoal/arch/%f'    
max_wal_senders = 8    
random_page_cost = 1.3                 # with good IO, random and sequential access costs differ little    
parallel_tuple_cost = 0    
parallel_setup_cost = 0    
min_parallel_relation_size = 0    
effective_cache_size = 300GB           # roughly: physical RAM minus shared_buffers, maintenance_work_mem, work_mem and WAL buffers; the rest is OS cache    
force_parallel_mode = off    
log_destination = 'csvlog'    
logging_collector = on    
log_truncate_on_rotation = on    
log_checkpoints = on    
log_connections = on    
log_disconnections = on    
log_error_verbosity = verbose    
log_timezone = 'PRC'    
autovacuum = on    
log_autovacuum_min_duration = 0    
autovacuum_max_workers = 16             # with many cores and good IO this can be higher, but 16 * autovacuum work mem consumes a lot of memory, so RAM must allow it    
autovacuum_naptime = 20s    
autovacuum_vacuum_scale_factor = 0.05   # trigger vacuum when dead tuples exceed 5%    
vacuum_freeze_min_age = 50000000        # freeze rows whose age exceeds 50 million    
vacuum_multixact_freeze_min_age = 5000000       
autovacuum_freeze_max_age = 1600000000      # force a vacuum freeze when table age exceeds 1.6 billion (even if autovacuum is off)    
vacuum_freeze_table_age = 1200000000        # autovacuum performs a vacuum freeze when table age exceeds 1.2 billion    
vacuum_multixact_freeze_table_age = 1100000000        
datestyle = 'iso, mdy'    
timezone = 'PRC'    
lc_messages = 'C'    
lc_monetary = 'C'    
lc_numeric = 'C'    
lc_time = 'C'    
default_text_search_config = 'pg_catalog.english'    
shared_preload_libraries='pg_stat_statements'    

2. Performance views: pg_stat*

                     List of relations    
   Schema   |            Name             | Type |  Owner       
------------+-----------------------------+------+----------    
 pg_catalog | pg_stat_activity            | view | postgres    
 pg_catalog | pg_stat_all_indexes         | view | postgres    
 pg_catalog | pg_stat_all_tables          | view | postgres    
 pg_catalog | pg_stat_archiver            | view | postgres    
 pg_catalog | pg_stat_bgwriter            | view | postgres    
 pg_catalog | pg_stat_database            | view | postgres    
 pg_catalog | pg_stat_database_conflicts  | view | postgres    
 pg_catalog | pg_stat_progress_vacuum     | view | postgres    
 pg_catalog | pg_stat_replication         | view | postgres    
 pg_catalog | pg_stat_ssl                 | view | postgres    
 pg_catalog | pg_stat_sys_indexes         | view | postgres    
 pg_catalog | pg_stat_sys_tables          | view | postgres    
 pg_catalog | pg_stat_user_functions      | view | postgres    
 pg_catalog | pg_stat_user_indexes        | view | postgres    
 pg_catalog | pg_stat_user_tables         | view | postgres    
 pg_catalog | pg_stat_wal_receiver        | view | postgres    
 pg_catalog | pg_stat_xact_all_tables     | view | postgres    
 pg_catalog | pg_stat_xact_sys_tables     | view | postgres    
 pg_catalog | pg_stat_xact_user_functions | view | postgres    
 pg_catalog | pg_stat_xact_user_tables    | view | postgres    
 pg_catalog | pg_statio_all_indexes       | view | postgres    
 pg_catalog | pg_statio_all_sequences     | view | postgres    
 pg_catalog | pg_statio_all_tables        | view | postgres    
 pg_catalog | pg_statio_sys_indexes       | view | postgres    
 pg_catalog | pg_statio_sys_sequences     | view | postgres    
 pg_catalog | pg_statio_sys_tables        | view | postgres    
 pg_catalog | pg_statio_user_indexes      | view | postgres    
 pg_catalog | pg_statio_user_sequences    | view | postgres    
 pg_catalog | pg_statio_user_tables       | view | postgres    
 pg_catalog | pg_stats                    | view | postgres    
(30 rows)    

3. TOP SQL

The pg_stat_statements extension tracks the resource usage of SQL statements. Preload it in postgresql.conf (restart required), then create the extension:

shared_preload_libraries = 'pg_stat_statements'    
    
create extension pg_stat_statements;    

TOP 5 SQL by total execution time

select c.rolname,b.datname,a.total_time/a.calls per_call_time,a.* from pg_stat_statements a, pg_database b, pg_authid c where a.userid=c.oid and a.dbid=b.oid order by a.total_time desc limit 5;    

4. AWR

《PostgreSQL AWR report (for Alibaba Cloud ApsaraDB PgSQL)》

5. auto_explain

Logs the execution plan of any statement that runs longer than a configured threshold.

shared_preload_libraries = 'auto_explain'    
    
# configuration    
auto_explain.log_analyze = on                       
auto_explain.log_nested_statements = on             
auto_explain.log_buffers = on               
auto_explain.log_min_duration = '1s'          
auto_explain.log_timing = on                
auto_explain.log_verbose = on    

6. Reading explain output

explain syntax

pic

explain example

pic

explain plan nodes

pic

8. Benchmarking

pgbench introduction

pgbench --help    
    
pgbench is a benchmarking tool for PostgreSQL.    
    
Usage:    
  pgbench [OPTION]... [DBNAME]    
    
Initialization options:    
  -i, --initialize         invokes initialization mode    
  -F, --fillfactor=NUM     set fill factor    
  -n, --no-vacuum          do not run VACUUM after initialization    
  -q, --quiet              quiet logging (one message each 5 seconds)    
  -s, --scale=NUM          scaling factor    
  --foreign-keys           create foreign key constraints between tables    
  --index-tablespace=TABLESPACE    
                           create indexes in the specified tablespace    
  --tablespace=TABLESPACE  create tables in the specified tablespace    
  --unlogged-tables        create tables as unlogged tables    
    
Options to select what to run:    
  -b, --builtin=NAME[@W]   add builtin script NAME weighted at W (default: 1)    
                           (use "-b list" to list available scripts)    
  -f, --file=FILENAME[@W]  add script FILENAME weighted at W (default: 1)    
  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches    
                           (same as "-b simple-update")    
  -S, --select-only        perform SELECT-only transactions    
                           (same as "-b select-only")    
    
Benchmarking options:    
  -c, --client=NUM         number of concurrent database clients (default: 1)    
  -C, --connect            establish new connection for each transaction    
  -D, --define=VARNAME=VALUE    
                           define variable for use by custom script    
  -j, --jobs=NUM           number of threads (default: 1)    
  -l, --log                write transaction times to log file    
  -L, --latency-limit=NUM  count transactions lasting more than NUM ms as late    
  -M, --protocol=simple|extended|prepared    
                           protocol for submitting queries (default: simple)    
  -n, --no-vacuum          do not run VACUUM before tests    
  -P, --progress=NUM       show thread progress report every NUM seconds    
  -r, --report-latencies   report average latency per command    
  -R, --rate=NUM           target rate in transactions per second    
  -s, --scale=NUM          report this scale factor in output    
  -t, --transactions=NUM   number of transactions each client runs (default: 10)    
  -T, --time=NUM           duration of benchmark test in seconds    
  -v, --vacuum-all         vacuum all four standard tables before tests    
  --aggregate-interval=NUM aggregate data over NUM seconds    
  --progress-timestamp     use Unix epoch timestamps for progress    
  --sampling-rate=NUM      fraction of transactions to log (e.g., 0.01 for 1%)    
    
Common options:    
  -d, --debug              print debugging output    
  -h, --host=HOSTNAME      database server host or socket directory    
  -p, --port=PORT          database server port number    
  -U, --username=USERNAME  connect as specified database user    
  -V, --version            output version information, then exit    
  -?, --help               show this help, then exit    

TPC-B test

  \set aid random(1, 100000 * :scale)    
  \set bid random(1, 1 * :scale)    
  \set tid random(1, 10 * :scale)    
  \set delta random(-5000, 5000)    
  BEGIN;    
  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;    
  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;    
  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;    
  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;    
  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);    
  END;    

Initialize the data: 10 million rows.

export PGPASSWORD=<password>    
    
pgbench -i -s 100    

Run the test

export PGPASSWORD=<password>    
    
pgbench -M prepared -n -r -P 1 -c <connections> -j <threads> -h <host IP> -p <port> -U <user> -T <seconds> <dbname>    

Custom test

Create a test table

create table test(id int primary key, info text, crt_time timestamp);    

Create the script

vi test.sql    
    
\set id random(1,100000000)    
insert into test (id,info,crt_time) values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info, crt_time=excluded.crt_time;    

Run the benchmark

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 100    
progress: 1.0 s, 246533.1 tps, lat 0.128 ms stddev 0.536    
progress: 2.0 s, 254653.8 tps, lat 0.125 ms stddev 0.333    
progress: 3.0 s, 232433.6 tps, lat 0.137 ms stddev 0.812    
progress: 4.0 s, 258225.7 tps, lat 0.123 ms stddev 0.250    

9. Routine maintenance

1. AWR health reports

2. Garbage collection (vacuum)

3. Rebuilding bloated indexes

4. TOP SQL performance tuning

5. Error log analysis

6. Security risk assessment (SQL injection, brute-force attacks, periodic password rotation)

7. Backup usability verification

8. Purging old backups and archives

9. Purging historical data

10. Building real-time monitoring

11. HA deployment, monitoring, and drills

12. Disaster recovery deployment, monitoring, and drills

13. Audit system deployment

14. Business logic optimization

15. Kernel optimization

16. Version upgrades and bug fixes

17. Community engagement

