HTAP Database PostgreSQL: Scenarios and Performance Tests, Part 15 - (OLTP) IoT - Querying a Range of Time-Series Data

Posted by 德哥 on 2017-11-12

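Background

PostgreSQL is a database with a long history that can be traced back to 1973. It was originally designed by Michael Stonebraker, winner of the 2014 Turing Award and a pioneer of relational databases. PostgreSQL offers features, performance, architecture, and stability comparable to Oracle's.

The PostgreSQL community has many contributors from industries all over the world. PostgreSQL has shipped one major release per year for years and is known for its longevity and stability.

In October 2017 PostgreSQL released version 10, which brought many major features aimed at HTAP workloads that mix OLAP and OLTP:

"PostgreSQL 10 Features: the HTAP Database Most Popular with Developers"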

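1. Enhanced multi-core parallelism

2. FDW aggregate pushdown

3. Logical subscription

4. Partitioning

5. Finance-grade multiple replicas

6. Full-text search on json and jsonb

7. Features delivered as extensions, such as vectorized computation, JIT, SQL graph computation, SQL stream computation, distributed parallel computation, time-series processing, gene sequencing, chemical analysis, and image analysis.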

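PostgreSQL is used in a wide range of application scenarios.

PostgreSQL has grown very quickly in recent years. The database score trend on the well-known database ranking site dbranking shows its upward trajectory.

The same trend is visible at the community conference held every year by the PostgreSQL China community: more companies attend, more companies present, and the topics become richer. They span traditional enterprises, internet, healthcare, finance, state-owned enterprises, logistics, e-commerce, social networking, connected vehicles, sharing-economy services, cloud, gaming, public transport, aviation, railways, defense, training, and consulting services.

The articles in this series introduce PostgreSQL's application scenarios and the corresponding performance figures.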

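Environment

For how to deploy the environment, see:

"PostgreSQL 10 + PostGIS + Sharding (pg_pathman) + MySQL (fdw foreign tables) on ECS Deployment Guide (for New Users)"

Alibaba Cloud ECS: 56 cores, 224 GB RAM, 2 x 1.5 TB SSD cloud disks.

Operating system: CentOS 7.4 x64

Database version: PostgreSQL 10

PS: The CPU and I/O performance of ECS is somewhat lower than that of a physical machine. As a rough estimate, assume ECS delivers about half the performance, so for a physical host multiply the results measured here by 2.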
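The rule of thumb in the note amounts to a one-line calculation. A minimal sketch, assuming the author's factor of 2 and taking the b-tree throughput of 627 TPS from the results reported later in this article; the variable names are illustrative only:

```shell
#!/bin/sh
# Rough physical-host estimate from an ECS measurement.
# ecs_tps: b-tree TPS measured on ECS in this article (rounded).
# factor:  the author's rule-of-thumb multiplier, not a measured value.
ecs_tps=627
factor=2

physical_tps=$((ecs_tps * factor))
echo "estimated physical-host TPS: $physical_tps"
```

This is only an estimate; the real ratio depends on the hardware being compared.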

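Scenario - IoT - Querying a Range of Time-Series Data (OLTP)

1. Background

Time-series data, that is, data produced as time passes, appears in IoT, internet services, and business systems. Its values increase steadily along the time dimension or a sequence column.

A range query is a business requirement to look up data by a range of values.

For time-series data, PostgreSQL offers, in addition to the traditional b-tree index, a block-level index called BRIN, which is well suited to time-series data whose physical order correlates strongly with the indexed value. A similar kind of index exists on Oracle Exadata appliances. With PostgreSQL this advanced feature is available for free.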

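2. Design

100 million time-series records with sequentially increasing values; query an arbitrary range and return 50,000 records.

3. Prepare the test table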

create table t_range(  
  id int,  
  ts timestamp default clock_timestamp()  
);

4. Prepare test functions (optional)

5. Prepare the test data

insert into t_range(id) select generate_series(1,100000000);

6. Prepare the test script

1. Using a traditional b-tree index

The b-tree index occupies 2142 MB.

create index idx_t_range_id on t_range using btree (id);  
  
postgres=# \di+ idx_t_range_id  
                              List of relations  
 Schema |      Name      | Type  |  Owner   |  Table  |  Size   | Description  
--------+----------------+-------+----------+---------+---------+-------------  
 public | idx_t_range_id | index | postgres | t_range | 2142 MB |  
(1 row)

Single-query performance:

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t_range where id between 1 and 50000;  
                                                                QUERY PLAN  
-------------------------------------------------------------------------------------------------------------------------------------------  
 Index Scan using idx_t_range_id on public.t_range  (cost=0.57..1527.31 rows=53167 width=12) (actual time=0.013..9.938 rows=50000 loops=1)  
   Output: id, ts  
   Index Cond: ((t_range.id >= 1) AND (t_range.id <= 50000))  
   Buffers: shared hit=411  
 Planning time: 0.060 ms  
 Execution time: 14.320 ms  
(6 rows)

vi test.sql  
  
\set id random(1,90000000)  
\set mx :id+50000  
select * from t_range where id between :id and :mx;

2. Using a BRIN block-level index

The BRIN index occupies only 256 kB.

drop index idx_t_range_id;  
create index idx_t_range_id on t_range using brin (id) with (pages_per_range=64);  
postgres=# \di+ idx_t_range_id  
                              List of relations  
 Schema |      Name      | Type  |  Owner   |  Table  |  Size  | Description  
--------+----------------+-------+----------+---------+--------+-------------  
 public | idx_t_range_id | index | postgres | t_range | 256 kB |  
(1 row)
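The two index sizes reported in the listings can be compared directly. A small sketch of the arithmetic, using only the sizes printed in this article (2142 MB for the b-tree, 256 kB for the BRIN index); the variable names are illustrative:

```shell
#!/bin/sh
# Index sizes as printed by the two index listings in this article.
btree_mb=2142
brin_kb=256

# Convert the b-tree size to kB so both values share one unit.
btree_kb=$((btree_mb * 1024))
ratio=$((btree_kb / brin_kb))

echo "b-tree index is about ${ratio}x larger than the BRIN index"
```

On this table the b-tree index is therefore several thousand times larger than the BRIN index built with pages_per_range=64.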

Single-query performance:

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t_range where id between 1 and 50000;  
                                                          QUERY PLAN  
-------------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.t_range  (cost=43.31..52572.18 rows=38593 width=12) (actual time=1.497..9.807 rows=50000 loops=1)  
   Output: id, ts  
   Recheck Cond: ((t_range.id >= 1) AND (t_range.id <= 50000))  
   Rows Removed by Index Recheck: 9200  
   Heap Blocks: lossy=320  
   Buffers: shared hit=355  
   ->  Bitmap Index Scan on idx_t_range_id  (cost=0.00..33.66 rows=47360 width=0) (actual time=1.489..1.489 rows=3200 loops=1)  
         Index Cond: ((t_range.id >= 1) AND (t_range.id <= 50000))  
         Buffers: shared hit=35  
 Planning time: 0.036 ms  
 Execution time: 14.162 ms  
(11 rows)
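The numbers in the BRIN plan above are enough to work out how much of the table the scan touched. The following sketch derives the rows per heap page, the number of block ranges scanned, and an estimate of the total number of block ranges in the index. It assumes every heap page holds the same number of rows, which is an assumption made for illustration rather than something measured in the article:

```shell
#!/bin/sh
# Figures copied from the EXPLAIN output above.
rows_returned=50000        # actual rows
rows_removed=9200          # Rows Removed by Index Recheck
lossy_blocks=320           # Heap Blocks: lossy
pages_per_range=64         # from the CREATE INDEX statement
total_rows=100000000       # from the data-loading step

# Every row on a lossy block is rechecked, so visited = returned + removed.
rows_visited=$((rows_returned + rows_removed))
rows_per_page=$((rows_visited / lossy_blocks))
ranges_scanned=$((lossy_blocks / pages_per_range))
rows_per_range=$((rows_per_page * pages_per_range))

# Assumption: uniform row density across the whole table (ceiling division).
total_pages=$(( (total_rows + rows_per_page - 1) / rows_per_page ))
total_ranges=$(( (total_pages + pages_per_range - 1) / pages_per_range ))

echo "rows per heap page:     $rows_per_page"
echo "block ranges scanned:   $ranges_scanned"
echo "rows per block range:   $rows_per_range"
echo "estimated total ranges: $total_ranges"
```

Under these assumptions the query reads 5 block ranges out of roughly 8446, which is why an index that stores only one summary per range can stay so small while still narrowing the scan to a few hundred pages.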

Stress test

vi test.sql  
  
\set id random(1,90000000)  
\set mx :id+50000  
select * from t_range where id between :id and :mx;
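As an alternative to editing the file in vi, the script can be written non-interactively. This is a sketch, not the author's method; it assumes the two variable assignments are pgbench `\set` meta-commands, which begin with a backslash:

```shell
#!/bin/sh
# Write the pgbench script to test.sql in the current directory.
# The quoted 'EOF' delimiter stops the shell from interpreting
# backslashes or expanding anything inside the here-document.
cat > test.sql <<'EOF'
\set id random(1,90000000)
\set mx :id+50000
select * from t_range where id between :id and :mx;
EOF

echo "wrote $(grep -c '' test.sql) lines to test.sql"
```

Capping the random lower bound at 90,000,000 keeps the upper bound :mx inside the 100 million rows that were loaded.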

7. Test

Stress test

CONNECTS=16  
TIMES=300  
export PGHOST=$PGDATA  
export PGPORT=1999  
export PGUSER=postgres  
export PGPASSWORD=postgres  
export PGDATABASE=postgres  
  
pgbench -M prepared -n -r -f ./test.sql -P 5 -c $CONNECTS -j $CONNECTS -T $TIMES

8. Test results

1. b-tree index

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 16  
number of threads: 16  
duration: 300 s  
number of transactions actually processed: 188165  
latency average = 25.509 ms  
latency stddev = 4.625 ms  
tps = 627.166703 (including connections establishing)  
tps = 627.187145 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set id random(1,90000000)  
         0.000  \set mx :id+50000  
        25.507  select * from t_range where id between :id and :mx;

2. BRIN index

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 16  
number of threads: 16  
duration: 300 s  
number of transactions actually processed: 189889  
latency average = 25.278 ms  
latency stddev = 4.570 ms  
tps = 632.907768 (including connections establishing)  
tps = 632.927776 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set id random(1,90000000)  
         0.000  \set mx :id+50000  
        25.276  select * from t_range where id between :id and :mx;

TPS

1. b-tree index

627  
  
Equivalent to returning 31.35 million rows per second.

2. BRIN index

632  
  
Equivalent to returning 31.6 million rows per second.
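The rows-per-second figures follow from multiplying TPS by the number of rows each query returns. A short sketch of that arithmetic, using the rounded TPS values above and the 50,000 rows per query stated in the design section:

```shell
#!/bin/sh
# Rounded TPS values from the pgbench results in this article.
tps_btree=627
tps_brin=632
# The design targets 50,000 rows per query. Strictly, BETWEEN :id AND :id+50000
# is inclusive at both ends and returns 50,001 rows; the article rounds this.
rows_per_query=50000

rows_btree=$((tps_btree * rows_per_query))
rows_brin=$((tps_brin * rows_per_query))

echo "b-tree: $rows_btree rows per second"
echo "BRIN:   $rows_brin rows per second"
```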

Average response time

1. b-tree index

25.509 ms

2. BRIN index

25.278 ms
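These latencies are consistent with the TPS figures. With a fixed number of clients that each issue the next transaction as soon as the previous one completes, average latency is approximately the client count divided by TPS. A sketch of that check, using the 16 clients and the TPS values printed by pgbench above:

```shell
#!/bin/sh
# Client count and TPS (excluding connection establishment) from the runs above.
clients=16
tps_btree=627.187145
tps_brin=632.927776

# latency (ms) ~= clients * 1000 / tps; awk handles the floating-point math.
est_btree=$(awk -v c="$clients" -v tps="$tps_btree" 'BEGIN { printf "%.1f", c * 1000 / tps }')
est_brin=$(awk -v c="$clients" -v tps="$tps_brin" 'BEGIN { printf "%.1f", c * 1000 / tps }')

echo "b-tree estimated latency: $est_btree ms"
echo "BRIN estimated latency:   $est_brin ms"
```

The estimates agree with the reported averages of 25.509 ms and 25.278 ms to within rounding.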

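References

"PostgreSQL and Greenplum Application Casebook '如來神掌' - Table of Contents"

"Choosing a Database - 大象十八摸 - For Architects and Developers"

"Using pgbench in PostgreSQL to Test sysbench-Related Cases"

"The Mount Hua Sword Contest of the Database World: tpc.org"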

https://www.postgresql.org/docs/10/static/pgbench.html
