PostgreSQL批量、單步寫入-row,statement觸發器(中間表)、CTE幾種用法效能對比

德哥發表於2018-10-05

標籤

PostgreSQL , 觸發器 , row , statement , statement 觸發器中間表 , CTE , 效能對比


背景

資料庫寫入方式包括:

1、單條insert, autocommit

insert into xx values ();  

2、批量(單QUERY)

insert into xx values (),(),...();  

3、分組提交

begin;  
insert into xx values ();  
...  
insert into xx values ();  
end;  

INSERT協議包括extended, prepared, simple等。

當需要對每行寫入的資料進行邏輯處理時,可以使用以下三種方法:

postgres=# h create trigger  
Command:     CREATE TRIGGER  
Description: define a new trigger  
Syntax:  
CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] }  
    ON table_name  
    [ FROM referenced_table_name ]  
    [ NOT DEFERRABLE | [ DEFERRABLE ] [ INITIALLY IMMEDIATE | INITIALLY DEFERRED ] ]  
    [ REFERENCING { { OLD | NEW } TABLE [ AS ] transition_relation_name } [ ... ] ]  
    [ FOR [ EACH ] { ROW | STATEMENT } ]  
    [ WHEN ( condition ) ]  
    EXECUTE PROCEDURE function_name ( arguments )  
  
where event can be one of:  
  
    INSERT  
    UPDATE [ OF column_name [, ... ] ]  
    DELETE  
    TRUNCATE  
  
postgres=# h create rule  
Command:     CREATE RULE  
Description: define a new rewrite rule  
Syntax:  
CREATE [ OR REPLACE ] RULE name AS ON event  
    TO table_name [ WHERE condition ]  
    DO [ ALSO | INSTEAD ] { NOTHING | command | ( command ; command ... ) }  
  
where event can be one of:  
  
    SELECT | INSERT | UPDATE | DELETE  

1、行級觸發器

2、語句級觸發器

https://www.postgresql.org/docs/devel/static/plpgsql-trigger.html

《PostgreSQL 10.0 preview 功能增強 – 觸發器函式內建中間表》

3、CTE寫法

with tmp as (insert into xxx values xxx returning *),  
t1 as (邏輯處理1 from tmp where ...),  
...,  
tn as (邏輯處理n from tmp where ...)  
邏輯處理n+1 from tmp where ...;  

下面針對以上三種方法,分別介紹單步、批量插入時的效能差異,以便業務上針對性的做出選擇。

專案背景參考

《PostgreSQL 流式處理應用實踐 – 二手商品實時歸類(非同步訊息notify/listen、閱後即焚)》

DEMO

1、建表

create table test(id int, info text, c1 int , c2 int, crt_time timestamp);  

2、邏輯處理的結果寫入下面的表

create table t_result (like test);  

無觸發器時,批量寫入效能

postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 1118.051 ms (00:01.118)  

行級觸發器 – 多個觸發器的批量寫入效能

1、觸發器函式

create or replace function tg1() returns trigger as $$  
declare  
begin  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  return null;  
end;  
$$ language plpgsql strict;  

2、2個觸發器的效能

create trigger tg1 after insert on test for each row execute procedure tg1();  
create trigger tg2 after insert on test for each row execute procedure tg1();  
  
  
postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 4879.985 ms (00:04.880)  

3、4個觸發器的效能

create trigger tg3 after insert on test for each row execute procedure tg1();  
create trigger tg4 after insert on test for each row execute procedure tg1();  
  
postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 8776.416 ms (00:08.776)  

4、6個觸發器的效能

create trigger tg5 after insert on test for each row execute procedure tg1();  
create trigger tg6 after insert on test for each row execute procedure tg1();  
  
postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 12648.707 ms (00:12.649)  

5、將觸發器合併成1個,所有邏輯放到一個觸發器中,批量寫入效能

drop trigger tg1 ON test;  
drop trigger tg2 ON test;  
drop trigger tg3 ON test;  
drop trigger tg4 ON test;  
drop trigger tg5 ON test;  
drop trigger tg6 ON test;  
  
  
create or replace function tg1() returns trigger as $$  
declare  
begin  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  return null;  
end;  
$$ language plpgsql strict;  
  
  
create trigger tg1 after insert on test for each row execute procedure tg1();  
  
  
postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 5042.071 ms (00:05.042)  

語句級觸發器 – 1個觸發器,批量寫入,6個規則的批量寫入效能

https://www.postgresql.org/docs/devel/static/plpgsql-trigger.html

1、建立觸發器函式,使用中間表,邏輯處理放到中間表中

create or replace function tg1() returns trigger as $$  
declare  
begin  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  return null;  
end;  
$$ language plpgsql strict;  
  
drop trigger tg1 ON test;  
  
create trigger tg1 after insert on test REFERENCING NEW TABLE AS new_table for each STATEMENT execute procedure tg1();  

2、6個規則的效能

postgres=# insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000);  
INSERT 0 1000000  
Time: 1847.532 ms (00:01.848)  

CTE – 批量寫入,6個規則效能

with new_table as (insert into test select 1,`test`,random()*100,random()*100,now() from generate_series(1,1000000) returning *),  
t1 as (insert into t_result select * from new_table where c1<=1 and c2<=1),  
t2 as (insert into t_result select * from new_table where c1<=1 and c2<=1),  
t3 as (insert into t_result select * from new_table where c1<=1 and c2<=1),  
t4 as (insert into t_result select * from new_table where c1<=1 and c2<=1),  
t5 as (insert into t_result select * from new_table where c1<=1 and c2<=1)  
insert into t_result select * from new_table where c1<=1 and c2<=1;  
  
INSERT 0 222  
Time: 2833.217 ms (00:02.833)  

單條INSERT,“行級、語句級觸發器、CTE用法”6個規則,效能對比

1、測試指令碼

vi test.sql  
  
insert into test values (1,`test`,random()*100,random()*100,now());  
  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 28 -j 28 -T 120  

無觸發器

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 29783861  
latency average = 0.113 ms  
latency stddev = 0.329 ms  
tps = 248197.480858 (including connections establishing)  
tps = 248213.036221 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.113  insert into test values (1,`test`,random()*100,random()*100,now());  

2、ROW觸發器

create or replace function tg1() returns trigger as $$  
declare  
begin  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  if NEW.c1<=1 and NEW.c2<=1 then   
    insert into t_result values (NEW.*);  
  end if;  
  return null;  
end;  
$$ language plpgsql strict;  
  
  
create trigger tg1 after insert on test for each row execute procedure tg1();  
  
  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 18398837  
latency average = 0.183 ms  
latency stddev = 0.335 ms  
tps = 153322.919798 (including connections establishing)  
tps = 153332.364931 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.183  insert into test values (1,`test`,random()*100,random()*100,now());  

3、STATEMENT觸發器

create or replace function tg1() returns trigger as $$  
declare  
begin  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  insert into t_result select * from new_table where c1<=1 and c2<=1;  
  return null;  
end;  
$$ language plpgsql strict;  
  
drop trigger tg1 ON test;  
  
create trigger tg1 after insert on test REFERENCING NEW TABLE AS new_table for each STATEMENT execute procedure tg1();  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 16868185  
latency average = 0.199 ms  
latency stddev = 0.214 ms  
tps = 140567.379172 (including connections establishing)  
tps = 140576.126770 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.199  insert into test values (1,`test`,random()*100,random()*100,now());  

效能總結

模式 規則個數 語句觸發器 處理吞吐 行觸發器 處理吞吐 無規則 處理吞吐
單條寫入 單觸發器6個規則 14萬行/s 15.3萬行/s 24.8萬行/s
批量寫入100萬行 單觸發器6個規則 54.1萬行/s 19.8萬行/s 89.4萬行/s
批量寫入100萬行 2個觸發器2個規則 20.5萬行/s
批量寫入100萬行 4個觸發器4個規則 11.4萬行/s
批量寫入100萬行 6個觸發器6個規則 7.9萬行/s
CTE語法批量寫入100萬行 6個規則 35.3萬行/s (含6規則)

小結

對於需要邏輯處理的資料表,使用語句級觸發器,同時使用中間表的模式來進行邏輯處理,效能是比較均衡的。相比行級觸發器,效能好很多。

另外,每增加一個觸發器,效能會下降比較厲害。建議把處理邏輯放到一個觸發器裡面,而不要使用多個觸發器。

參考

《PostgreSQL 流式處理應用實踐 – 二手商品實時歸類(非同步訊息notify/listen、閱後即焚)》


相關文章