PostgreSQL Spatial Independent-Event Correlation Analysis, Part 2: Person-Vehicle Matching

Posted by 德哥 on 2017-10-28

Tags: SQL, events

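Background

Correlation analysis of independent events is an interesting problem. For example:

The "crossed paths" feature in the Tantan (探探) app, which finds people who crossed paths with you at different times and places.

Public-opinion analysis.

Finding the best-selling combinations of products.

Person-vehicle matching in security systems: spatio-temporal data is processed to match drivers, passengers, and vehicles.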

Person-Vehicle Matching

1. Create the table structures

People

create table u_pos (  
  id int8,  
  uid int8,  
  crt_time timestamp,  
  pos geometry  
);

Vehicles

create table c_pos (  
  id int8,  
  car_id int8,  
  crt_time timestamp,  
  pos geometry  
);

2. Generate test data.

Take Hangzhou as an example. Its longitude and latitude range is as follows:

118°21′-120°30′ E, 29°11′-30°33′ N, which converts to 118.35°-120.5° E, 29.183°-30.55° N.
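The conversion above is plain degrees-plus-minutes arithmetic. The following is a minimal Python sketch of it; the helper name `dm_to_decimal` is made up for illustration and is not part of the original scripts. It also derives the two spans (2.15 and 1.367) that the load scripts multiply `random()` by.

```python
def dm_to_decimal(degrees: int, minutes: float) -> float:
    """Convert degrees and arc-minutes to decimal degrees (hypothetical helper)."""
    return degrees + minutes / 60.0

# Hangzhou bounding box as quoted in the text
lon_min = dm_to_decimal(118, 21)   # 118.35
lon_max = dm_to_decimal(120, 30)   # 120.5
lat_min = dm_to_decimal(29, 11)    # about 29.183
lat_max = dm_to_decimal(30, 33)    # 30.55

# Widths of the box, used as the random() multipliers in the load scripts
lon_span = lon_max - lon_min       # about 2.15
lat_span = lat_max - lat_min       # about 1.367

print(round(lon_span, 3), round(lat_span, 3))
```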

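Assumed activity volume:

10 million people, 10 million vehicles.

People's trajectory points: 1 billion per day.

Vehicle trajectory points: 100 million per day.

2.1. Write the people's position data, partitioned by day and retained for one year.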

for ((i=1;i<=32;i++))  
do  
nohup psql -c "insert into u_pos select id, random()*10000000, '2017-10-01'::date + ((id*2.7648)||' ms')::interval, st_setsrid(st_makepoint(118.35+random()*2.15, 29.183+random()*1.367), 4326) from generate_series(1,31250000) t(id);" >/dev/null 2>&1 &  
done

Use a BRIN index, the index type most commonly used for time-series data.

create index idx_u_pos_1 on u_pos using brin(crt_time);

Create an index on person + time.

create index idx_u_pos_2 on u_pos using btree(uid, crt_time);

2.2. Write the vehicles' position data, partitioned by day and retained for one year.

for ((i=1;i<=32;i++))  
do  
nohup psql -c "insert into c_pos select id, random()*10000000, '2017-10-01'::date + ((id*27.648)||' ms')::interval, st_setsrid(st_makepoint(118.35+random()*2.15, 29.183+random()*1.367), 4326) from generate_series(1,3125000) t(id);" >/dev/null 2>&1 &  
done
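The constants in the two load loops are chosen so that 32 parallel sessions together produce one day's volume, and so that each session's timestamps spread evenly across exactly one day. A small Python check of that arithmetic follows; the function name `load_plan` is only illustrative.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in one day

def load_plan(sessions: int, rows_per_session: int, step_ms: float):
    """Return (total rows across all sessions, time span in ms of one session)."""
    total_rows = sessions * rows_per_session
    span_ms = rows_per_session * step_ms
    return total_rows, span_ms

# People: 32 sessions x 31,250,000 rows, one row every 2.7648 ms
people_rows, people_span = load_plan(32, 31_250_000, 2.7648)

# Vehicles: 32 sessions x 3,125,000 rows, one row every 27.648 ms
car_rows, car_span = load_plan(32, 3_125_000, 27.648)

print(people_rows, car_rows)                                # 1 billion and 100 million rows
print(people_span / MS_PER_DAY, car_span / MS_PER_DAY)      # both close to 1 day
```

Note that every session generates the same `id` sequence, so the timestamps of the 32 sessions overlap over the same day rather than following one another.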

Use a BRIN index, the index type most commonly used for time-series data.

create index idx_c_pos_1 on c_pos using brin(crt_time);

Create an index on vehicle + time.

create index idx_c_pos_2 on c_pos using btree(car_id, crt_time);

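3. Person-vehicle matching for a given time range

3.1. For a vehicle, take the N points captured while it was driving, returning time and position.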

select pos, crt_time from c_pos where car_id=? and crt_time between ? and ?;

Return the intersection of the people found near those N points within the corresponding time windows.

create or replace function merge_car_u(  
  v_car_id int8,       -- vehicle ID  
  s_time timestamp,    -- search range, start time  
  e_time timestamp,    -- search range, end time  
  ts_range interval,   -- per vehicle trajectory point: allowed time difference between the person's and the vehicle's appearance (window extended by this much before and after)  
  pos_range float8     -- per vehicle trajectory point: maximum distance between the person and the vehicle (in SRID units, i.e. degrees for 4326)  
) returns int8[] as $$  
declare  
  res int8[];  
  tmp int8[];  
  v_pos geometry;  
  v_crt_time timestamp;  
  i int := 0;  
begin  
  for v_pos, v_crt_time in select pos, crt_time from c_pos where car_id=v_car_id and crt_time between s_time and e_time  -- fetch the trajectory points  
  loop  
    select array_agg(uid) into tmp from u_pos where crt_time between v_crt_time-ts_range and v_crt_time+ts_range and (v_pos <-> pos) < pos_range;  -- IDs of the people near this point  
    if (i <> 0) then  
      select array_agg(unnest) into res from (select unnest(res) intersect select unnest(tmp)) t;  -- intersect with the result so far  
    else  
      res := tmp;  
    end if;  
    i := i+1;  
  end loop;  
  return res;  
end;  
$$ language plpgsql strict;
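The function keeps a running intersection: for each trajectory point it collects the IDs seen nearby within the time window, then intersects that set with the result so far. The same logic can be sketched on in-memory data in Python. This is only an illustration under simplifying assumptions: the name `match_ids` and the sample rows are invented, distance is plain Euclidean distance in degrees, and an empty set stands in for the NULL that the SQL version returns when nothing matches.

```python
from datetime import datetime, timedelta
from math import hypot

def match_ids(track, candidates, ts_range, pos_range):
    """track: [(x, y, time)] points of one vehicle.
    candidates: [(id, x, y, time)] points of all people.
    Returns the ids found near every track point."""
    result = None
    for x, y, t in track:
        # ids seen within the time window and distance of this track point
        nearby = {
            cid for cid, cx, cy, ct in candidates
            if abs(ct - t) <= ts_range and hypot(cx - x, cy - y) < pos_range
        }
        # first point seeds the result, later points narrow it down
        result = nearby if result is None else result & nearby
    return set() if result is None else result

t0 = datetime(2017, 10, 1, 1, 0, 0)
track = [(120.0, 30.0, t0), (120.01, 30.0, t0 + timedelta(minutes=1))]
people = [
    (1, 120.001, 30.001, t0 + timedelta(seconds=5)),   # near point 1
    (1, 120.011, 30.001, t0 + timedelta(seconds=62)),  # near point 2
    (2, 120.001, 30.0, t0 + timedelta(seconds=3)),     # near point 1 only
    (3, 120.011, 30.0, t0 + timedelta(minutes=10)),    # right place, wrong time
]
matched = match_ids(track, people, timedelta(seconds=10), 0.004)
print(matched)
```

Only person 1 appears near both track points, so only that ID survives the intersection.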

Example:

postgres=# select * from merge_car_u(1, '2017-10-01 01:00:00', '2017-10-01 04:00:00', '10 s', 0.004);  
            merge_car_u              
-----------------------------------  
 {5481974,5958009,3682524,1313466}  
(1 row)  
  
Time: 232.960 ms
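Because `pos` is stored with SRID 4326, the `<->` comparison works in degrees, so the `0.004` passed as `pos_range` is a distance in degrees, not meters. A rough Python conversion follows, assuming a spherical Earth with about 111.32 km per degree of latitude; the function name and the constant are illustrative approximations, not values from the original post.

```python
from math import cos, radians

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude

def degrees_to_meters(deg: float, latitude: float):
    """Approximate a degree offset as (north-south, east-west) meters."""
    north_south = deg * KM_PER_DEGREE_LAT * 1000
    # a degree of longitude shrinks with the cosine of the latitude
    east_west = north_south * cos(radians(latitude))
    return north_south, east_west

# 0.004 degrees at roughly Hangzhou's latitude (about 30 degrees north)
ns, ew = degrees_to_meters(0.004, 30.0)
print(round(ns), round(ew))
```

Under these assumptions the example's search radius is a few hundred meters, and somewhat shorter east-west than north-south.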

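3.2. For a person, take the N points captured while moving, returning time and position.

Return the intersection of the vehicles found near those N points within the corresponding time windows.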

create or replace function merge_u_car(  
  v_uid int8,              -- person ID  
  s_time timestamp,        -- search range, start time  
  e_time timestamp,        -- search range, end time  
  ts_range interval,       -- per person trajectory point: allowed time difference between the vehicle's and the person's appearance (window extended by this much before and after)  
  pos_range float8         -- per person trajectory point: maximum distance between the vehicle and the person (in SRID units, i.e. degrees for 4326)  
) returns int8[] as $$  
declare  
  res int8[];  
  tmp int8[];  
  v_pos geometry;  
  v_crt_time timestamp;  
  i int := 0;  
begin  
  for v_pos, v_crt_time in select pos, crt_time from u_pos where uid=v_uid and crt_time between s_time and e_time  -- fetch the trajectory points  
  loop  
    select array_agg(car_id) into tmp from c_pos where crt_time between v_crt_time-ts_range and v_crt_time+ts_range and (v_pos <-> pos) < pos_range;  -- IDs of the vehicles near this point  
    if (i <> 0) then  
      select array_agg(unnest) into res from (select unnest(res) intersect select unnest(tmp)) t;  -- intersect with the result so far  
    else  
      res := tmp;  
    end if;  
    i := i+1;  
  end loop;  
  return res;  
end;  
$$ language plpgsql strict;

Example:

postgres=# select * from merge_u_car(100, '2017-10-01 01:00:00', '2017-10-01 02:00:00', '100 s', 0.2);  
                                                merge_u_car                                                  
-----------------------------------------------------------------------------------------------------------  
 {6214562,6180159,4534165,7824219,6826437,3020910,1463798,2939986,5786345,7233751,2856178,1719127,7763683}  
(1 row)  
  
Time: 96.986 ms

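Summary

1. Ideas for optimizing storage and indexing.

Truncate the time + store rows in spatial order

For example

(YYYY-MM-DD HH24:MI), (geohash)

After reorganizing the storage this way, create a btree or BRIN index on the structure above.

When searching for records that appear near a given point at a given time, the search can run in parallel and touches relatively few data blocks, because the matching rows are stored densely together.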
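To make the (minute, geohash) ordering concrete, here is a minimal Python sketch that builds such a sort key. The geohash encoder is the standard bit-interleaving algorithm written out by hand, and `storage_key` is a hypothetical helper; inside the database, PostGIS's `ST_GeoHash` would presumably play this role.

```python
from datetime import datetime

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat: float, lon: float, precision: int = 8) -> str:
    """Standard geohash: interleave longitude and latitude bits, 5 bits per char."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

def storage_key(ts: datetime, lat: float, lon: float, precision: int = 6):
    """Sort key: timestamp truncated to the minute, then geohash (hypothetical helper)."""
    return ts.replace(second=0, microsecond=0), geohash(lat, lon, precision)

rows = [
    (datetime(2017, 10, 1, 1, 2, 33), 30.25, 120.16),
    (datetime(2017, 10, 1, 1, 1, 59), 29.90, 119.40),
    (datetime(2017, 10, 1, 1, 2, 5), 30.25, 120.17),
]
# Rows sharing a minute end up adjacent and ordered by geohash
ordered = sorted(rows, key=lambda r: storage_key(*r))
```

Sorting by this key before writing is what makes rows from the same minute and neighborhood land in the same or adjacent blocks, which is the property a BRIN index relies on.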

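2. Other requirement: filling in missing positions. In some situations the positions of vehicles or people are not captured, for example when passing through congested road sections or blind spots of the collection devices.

When there are gaps in the captured positions, use pgrouting together with road-network information to generate several candidate paths and fill in the points that were not captured. The time is estimated as well, yielding each point and the time at which it was passed.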
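The time-estimation step can be sketched independently of the routing itself. The Python example below assumes a candidate path has already been obtained as a list of vertices (for instance from the routing step) and assigns a timestamp to each vertex in proportion to the distance travelled, which amounts to assuming constant speed. The function name `fill_times` and the sample path are made up for illustration.

```python
from datetime import datetime
from math import hypot

def fill_times(path, t_start, t_end):
    """path: [(x, y)] from the last known point to the next known point.
    Returns [(x, y, time)], interpolating time by distance travelled
    (assumes constant speed along the path)."""
    segments = [hypot(x2 - x1, y2 - y1)
                for (x1, y1), (x2, y2) in zip(path, path[1:])]
    total = sum(segments)
    filled, travelled = [], 0.0
    for i, (x, y) in enumerate(path):
        if i > 0:
            travelled += segments[i - 1]
        fraction = travelled / total if total else 0.0
        filled.append((x, y, t_start + (t_end - t_start) * fraction))
    return filled

# A 7-minute gap bridged by a path with segments of length 3 and 4
t_start = datetime(2017, 10, 1, 1, 0, 0)
t_end = datetime(2017, 10, 1, 1, 7, 0)
filled = fill_times([(0, 0), (3, 0), (3, 4)], t_start, t_end)
print(filled[1][2])  # the middle vertex, 3/7 of the way through the gap
```

A real implementation would likely weight segments by road class or observed speeds rather than assume constant speed.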

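3. Other requirement: correcting abnormal positions.

4. Matching performance, with daily partitions: 10 million people, 10 million vehicles, 1 billion people's trajectory points per day, and 100 million vehicle trajectory points per day.

Matching responses at the millisecond level are achievable (the two examples above returned in about 233 ms and 97 ms).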

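References

《How Pan Jinlian Changed History: A PostgreSQL Public-Opinion Event Analysis Application》

《Why Beer and Diapers Go Best Together: Finding the Best Product Marketing Combinations with HybridDB/PostgreSQL》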
