Q-REG Paper Reading

Posted by name555difficult on 2023-10-04

Original URL: https://www.cnblogs.com/name555difficult/p/17742819.html

Q-REG

2023-09-27 arXiv preprint

Jin, S., Barath, D., Pollefeys, M., & Armeni, I. (2023). Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature.

paper: [2309.16023v1] Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature (arxiv.org)
code: waiting

Questions Raised

RANSAC-like estimation methods cope with the combinatorics of the problem by selecting random subsets of m correspondences (e.g., m=3 for rigid pose estimation). This allows them to progressively explore the \(\binom{n}{m}\) possible combinations, where n is the total number of matches.
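
To get a feel for the numbers, here is a small sketch that compares the count of 3-point minimal samples with the count of single-correspondence hypotheses. The value n = 1000 is a made-up example, not a figure from the paper.

```python
from math import comb


def ransac_subset_count(n: int, m: int = 3) -> int:
    """Number of distinct m-element minimal samples among n correspondences."""
    return comb(n, m)


def single_correspondence_count(n: int) -> int:
    """With a one-point solver, every correspondence is its own minimal sample."""
    return n


n_matches = 1000  # hypothetical number of putative matches
print(ransac_subset_count(n_matches))          # 166167000
print(single_correspondence_count(n_matches))  # 1000
```

With m = 1 the hypothesis space is small enough to enumerate completely, which is what makes an exhaustive search practical.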

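Put simply, RANSAC-style estimation is not differentiable, so it cannot be trained end-to-end. Other learning-based methods achieve end-to-end training by replacing hard correspondences with score-based soft correspondences (hard means a match is either true or false; soft means each match carries a weight, i.e. how well the point pair matches), but this makes the computation too expensive and introduces a lot of noise.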
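
The hard/soft distinction can be illustrated with a toy similarity matrix. This is only a sketch of the general idea: the softmax weighting and the `temperature` parameter are assumptions chosen for illustration, not the scheme of any particular method mentioned here.

```python
import numpy as np


def hard_matches(similarity):
    """Hard assignment: each source point picks exactly one target index."""
    return np.argmax(similarity, axis=1)


def soft_matches(similarity, target_points, temperature=0.1):
    """Soft assignment: each source point gets a score-weighted blend of all targets."""
    logits = similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Every target contributes to every source point, hence the extra cost and noise.
    return weights @ target_points, weights
```

The soft variant touches every source-target pair, while the hard variant keeps one discrete match per point.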

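The authors want end-to-end training with hard correspondences. How? Predict the transformation from a single correspondence. Then there are no random subsets; instead the method iterates over the whole correspondence set and keeps the best prediction.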

Contribution

Designs Q-REG, a point cloud registration method that estimates the pose from a single correspondence together with its local surface patches (fitted quadrics), intended as a replacement for RANSAC. According to the introduction, Q-REG is agnostic to the correspondence matching method, and it can quickly reject outliers by filtering degenerate solutions and assumption-inconsistent motions (rigid poses inconsistent with motion priors, e.g., to avoid unrealistically large scaling).
Makes Q-REG differentiable, so that it can be used for end-to-end training of both the correspondence matching method and the pose estimation method.
Sets new SOTA results.

Description

By employing higher-order geometric information, Q-REG performs an exhaustive search in place of RANSAC, improving both performance and run-time.

Q-REG pipeline

First Step: Correspondence Matching

Use any correspondence matcher (e.g., patch-based: PPFNet, PPF-FoldNet; fully convolutional: FCGF) to obtain feature-matching-based putative correspondences \(\{P, Q\}\in C\), which Q-REG then uses to estimate the transformation matrix.

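Q-REG is a single-correspondence method. Unlike RANSAC, which randomly picks three pairs of corresponding points \(\{p, q\}\) per iteration to predict the transformation matrix, Q-REG takes only one pair of corresponding points at a time to estimate the transform between \(P\) and \(Q\).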

Second Step: Q-REG

Used directly as a tool, Q-REG proceeds as follows:

Iterate over the correspondence set \(C\), taking one single correspondence \(\{p,\ q \}\) at a time;
Predict a transformation matrix from each single correspondence;
Select the best transformation model as the preliminary result. The pose quality metric is calculated as the cardinality of its support, i.e., the number of inliers;
Then perform local optimization following the method of [^ 1] (a local re-sampling and re-fitting of inlier correspondences based on their normals (coming from the fitted quadrics) and positions).
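
The first three steps can be sketched as a plain loop. This is a minimal illustration, assuming the per-correspondence poses have already been computed and are passed in as `(R, t)` pairs; the function names and the inlier threshold of 0.05 are hypothetical, and the local optimization step is left out.

```python
import numpy as np


def count_inliers(R, t, src, dst, threshold):
    """Support of a pose: number of correspondences with residual below threshold."""
    residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
    return int(np.sum(residuals < threshold))


def exhaustive_search(candidates, src, dst, threshold=0.05):
    """Score every single-correspondence pose and keep the one with largest support."""
    best_pose, best_support = None, -1
    for R, t in candidates:
        support = count_inliers(R, t, src, dst, threshold)
        if support > best_support:
            best_pose, best_support = (R, t), support
    return best_pose, best_support
```

Here `src` and `dst` are the matched points stacked row by row, so row i of `src` corresponds to row i of `dst`.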

When embedded in end-to-end training, only the first two steps are run, and the loss \(L_{pose}\) is built from the predicted results.

The rest of this post details how the transformation matrix is predicted from a single correspondence, and describes how \(L_{pose}\) is constructed.

1. Quadric Fitting based local patch

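For a single correspondence \(\{p, q\}\in C\), a local patch can be built around each point (Q-REG uses KNN with K=50), giving a pair of local patches, and an LRF (local reference frame) \(R_p, R_q \in SO(3)\) is computed for each of the two patches (i.e., the rotation matrix that takes points from the world coordinate system to the local reference frame). If these are predicted correctly, the two point clouds can be aligned with \(R=R_qR_p^T\). Q-REG therefore uses quadric fitting to estimate \(R_p,\ R_q\).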

As for the translation vector \(t\), the paper directly treats q and p as the centroids of the overlapping region of the two point clouds, so \(t=q-p\).
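
As a sanity check on \(R=R_qR_p^T\) and \(t=q-p\), here is a minimal sketch. It makes two assumptions that are not spelled out above: the columns of `R_p` and `R_q` are the LRF axes expressed in world coordinates, and the rotation is applied about the keypoint p before translating by t, so that p lands exactly on q. The function name is hypothetical.

```python
import numpy as np


def align_with_single_correspondence(points, p, q, R_p, R_q):
    """Rotate the source cloud about keypoint p, then translate by t = q - p."""
    R = R_q @ R_p.T   # takes the axes of p's LRF onto the axes of q's LRF
    t = q - p         # keypoint-to-keypoint translation
    return (points - p) @ R.T + p + t
```

Under these assumptions the keypoint p maps exactly to q, and every other point is carried along by the LRF-induced rotation.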

The paper fits a 3D quadric surface under the following constraint:

\[\hat{p}^TQ\hat{p}=0 \]

\(\hat{p}\): a 3D homogeneous point lying on the surface
\(Q\): the quadric parameters in matrix form:

\[Q = \begin{pmatrix}A&D&E&G\\D&B&F&H\\E&F&C&I\\G&H&I&J\end{pmatrix} \]

Ideally every point of the local patch would lie exactly on the surface, but of course that is not possible, so the surface has to be fitted.
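
To make the constraint concrete, the sketch below builds \(Q\) from the ten coefficients and evaluates the algebraic residual of a homogeneous point. The unit sphere is used as an example quadric; the function names are chosen here for illustration.

```python
import numpy as np


def quadric_matrix(A, B, C, D, E, F, G, H, I, J):
    """Symmetric 4x4 quadric matrix with the coefficient layout used above."""
    return np.array([
        [A, D, E, G],
        [D, B, F, H],
        [E, F, C, I],
        [G, H, I, J],
    ], dtype=float)


def algebraic_residual(Q, point):
    """Evaluate p_hat^T Q p_hat for a 3D point; zero means the point is on the surface."""
    p_hat = np.append(point, 1.0)  # homogeneous coordinates
    return float(p_hat @ Q @ p_hat)


# Unit sphere x^2 + y^2 + z^2 - 1 = 0
sphere = quadric_matrix(1, 1, 1, 0, 0, 0, 0, 0, 0, -1)
print(algebraic_residual(sphere, np.array([1.0, 0.0, 0.0])))  # 0.0
print(algebraic_residual(sphere, np.array([2.0, 0.0, 0.0])))  # 3.0
```

Fitting then means choosing the coefficients so that this residual is as small as possible over the neighbor points.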

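The authors then rewrite the formula above as a linear equation that is easier to apply, in which:

\(|\mathcal{N}|\) is the number of neighbors to which the quadric is fitted (the paper sets it to 50). In other words, the quadric fit does not use the points p and q of the single correspondence, i.e. the keypoints, but the other points of the local patch, i.e. the neighbor points.
\(d_i\) is the squared distance of the i-th neighbor point from the origin. (So does an implementation first need to normalize the local patch by taking coordinates relative to the keypoint?)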

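Solving this linear equation yields the coefficients of \(Q\).

Next, a translation is applied to the fitted quadric coefficient matrix \(Q\) so that the keypoint lies on the surface; that is, the coefficient \(J\) is adjusted so that \(\hat{p}^TQ\hat{p}=0\) holds for the keypoint.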
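
The fit and the \(J\) adjustment can be sketched as follows. The linear system used here is one known inhomogeneous least-squares formulation whose right-hand side is the squared distance \(d_i\); whether it matches the paper's exact system is an assumption, and both function names are hypothetical.

```python
import numpy as np


def fit_quadric(neighbors):
    """Least-squares quadric through an (N, 3) array of neighbor points."""
    x, y, z = neighbors[:, 0], neighbors[:, 1], neighbors[:, 2]
    design = np.column_stack([
        x * x + y * y - 2 * z * z,
        x * x + z * z - 2 * y * y,
        2 * x * y, 2 * x * z, 2 * y * z,
        2 * x, 2 * y, 2 * z,
        np.ones_like(x),
    ])
    d = x * x + y * y + z * z  # squared distance of each neighbor from the origin
    u, *_ = np.linalg.lstsq(design, d, rcond=None)
    # Move d to the left-hand side to recover the ten quadric coefficients.
    A = u[0] + u[1] - 1
    B = u[0] - 2 * u[1] - 1
    C = u[1] - 2 * u[0] - 1
    D, E, F, G, H, I, J = u[2:]
    return np.array([[A, D, E, G], [D, B, F, H], [E, F, C, I], [G, H, I, J]])


def shift_to_keypoint(Q, keypoint):
    """Adjust only J so that the keypoint satisfies the quadric constraint."""
    k = np.append(keypoint, 1.0)
    shifted = Q.copy()
    shifted[3, 3] -= k @ Q @ k
    return shifted
```

This particular normalization fixes the trace of the upper-left block, so it avoids the trivial all-zero solution without forcing \(J\) to a constant.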

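Finally, part of the quadric coefficient matrix \(Q\) is taken to form a matrix \(P\), and eigen-decomposition is applied to \(P\); the resulting eigenvector matrix \(V\) is used as the LRF \(R_p\) or \(R_q\).

Note: to preserve scale information, the eigenvectors are not normalized to unit length here.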
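
A rough sketch of this step is below. Two things are assumed here rather than taken from the text: the sub-block is taken to be the upper-left 3x3 part of \(Q\), and the scale information is carried by the eigenvalues returned alongside the (unit-length) eigenvectors that NumPy produces. Eigenvector sign ambiguity is not resolved beyond forcing a right-handed frame.

```python
import numpy as np


def lrf_from_quadric(Q):
    """Eigen-decompose the quadratic part of Q to get LRF axes and per-axis scales."""
    P = Q[:3, :3]                       # assumed sub-block: the quadratic terms
    eigenvalues, V = np.linalg.eigh(P)  # symmetric input, eigenvalues ascending
    if np.linalg.det(V) < 0:
        V[:, 0] = -V[:, 0]              # flip one axis so that V lies in SO(3)
    return eigenvalues, V
```

`np.linalg.eigh` is used because the block is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.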

2. Estimate rigid Transformation

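The rotation is \(R=R_qPR_p^T \in SO(3)\), where \(P\) is an unknown permutation matrix that determines which axis of p's LRF corresponds to which axis of q's LRF. Three cases are considered:

All three LRF axes have different norms (lengths), i.e. the scale along x, y and z all differ. It is enough to order the axes by length from largest to smallest and match them in that order. This rests on the assumption that the point clouds have no, or only negligible, anisotropic scaling, so the relative lengths of corresponding axes stay unchanged. The approach is scale-invariant, and unreliable matches can be filtered out when they imply unrealistic scaling. The rigid transformation can therefore be solved from a single correspondence.
Two of the three LRF axes have the same norm and the third differs, i.e. the scale is the same along two of the x, y, z directions. Intuitively, when the LRF axes are matched one-to-one, two pairs of axes cannot be matched unambiguously. At least two correspondences are therefore needed to cross-check each other and make sure \(P\) is predicted correctly.
All three LRF axes have the same norm, i.e. the scale is the same along x, y and z; the local patch, with the keypoint as origin, is then close to a sphere surface. By the same reasoning, at least three correspondences are needed.

So, to estimate the rigid transformation from a single correspondence, only those correspondences in \(C\) whose three LRF axes all have different norms are kept, with every difference in axis length greater than \(10^{-3}\). The rigid rotation matrix can then be computed with \(R=R_qPR_p^T\).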
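
The filtering and the sort-based axis matching can be sketched like this. The sketch follows the \(R=R_qR_p^T\) convention from the quadric-fitting section (source axes are mapped onto target axes), takes axis lengths as absolute eigenvalues, and treats LRF columns as axes; all three are assumptions, and the function names are hypothetical. Per-axis sign ambiguity is not handled beyond keeping the determinant positive.

```python
import numpy as np


def has_distinct_axes(lengths, eps=1e-3):
    """True if all three axis lengths differ pairwise by more than eps."""
    ordered = np.sort(np.abs(lengths))
    return bool(np.all(np.diff(ordered) > eps))


def rotation_from_sorted_axes(len_p, V_p, len_q, V_q):
    """Match axes by descending length; the implied permutation plays the role of P."""
    order_p = np.argsort(-np.abs(len_p))
    order_q = np.argsort(-np.abs(len_q))
    axes_p = V_p[:, order_p]
    axes_q = V_q[:, order_q].copy()
    R = axes_q @ axes_p.T
    if np.linalg.det(R) < 0:       # keep a proper rotation
        axes_q[:, -1] *= -1.0
        R = axes_q @ axes_p.T
    return R
```

Correspondences for which `has_distinct_axes` is false for either patch would simply be skipped before the exhaustive search.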

3. End-to-End Training Loss

\[\epsilon(T_{p,q}) = \sqrt{\frac{1}{|C|}\sum_{(p_i,q_i) \in C}{\|T_{p,q}p_i-q_i\|_2^2}} \]

\[L_{pose} = \sum_{(p,q)\in C}{\left(1-\frac{\min(\epsilon(T_{p,q}), \gamma)}{\gamma} -s\right)} \]

\(\gamma\) is a threshold and \(s\) is the score of the point correspondence predicted by the matching network

The \(L_{pose}\) described above can be combined with other widely used registration loss functions to enable end-to-end training from feature matching through to registration.
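
The arithmetic of the two formulas can be checked with a small NumPy sketch. It only evaluates the loss value as written above; actual training would need an autodiff framework, and the function names and the example value of `gamma` are placeholders.

```python
import numpy as np


def pose_error(R, t, src, dst):
    """Root-mean-square residual of pose (R, t) over all correspondences."""
    residuals = src @ R.T + t - dst
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))


def pose_loss(poses, scores, src, dst, gamma):
    """Sum of (1 - min(eps, gamma) / gamma - s) over per-correspondence poses."""
    total = 0.0
    for (R, t), s in zip(poses, scores):
        eps = pose_error(R, t, src, dst)
        total += 1.0 - min(eps, gamma) / gamma - s
    return total
```

Each entry of `poses` is the transformation estimated from one correspondence, and the matching entry of `scores` is that correspondence's predicted score.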

Experiments

dataset: 3DMatch, 3DLoMatch; KITTI; ModelNet, ModelLoNet
correspondence matcher: Predator, RegTR, GeoTr
metrics: RR (Registration Recall), RRE (Relative Rotation Error), RTE (Relative Translation Error)
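
For reference, the rotation and translation errors are usually computed as below. This sketch uses the common definitions (geodesic angle between rotations, Euclidean distance between translations); the recall function shows one common thresholded form, and its thresholds are deliberately left as arguments because the benchmarks define recall differently.

```python
import numpy as np


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between estimated and ground-truth rotations."""
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return float(np.linalg.norm(t_est - t_gt))


def registration_recall(rre_list, rte_list, rre_max, rte_max):
    """Fraction of pairs whose rotation and translation errors are both under threshold."""
    ok = [(r < rre_max) and (t < rte_max) for r, t in zip(rre_list, rte_list)]
    return sum(ok) / len(ok)
```

The clipping in `rotation_error_deg` guards against values slightly outside [-1, 1] caused by floating-point round-off.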

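Nothing much to add: with the matcher held fixed, Q-REG is SOTA across the board, and it also beats other estimators (ICP, PointDSC, ...). The ablation study also shows that every Q-REG component contributes a measurable gain: the quadric-fitting single-correspondence solver, local optimization, and use in end-to-end training.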

Run-time

[^ 1]: Karel Lebeda, Jiří Matas, and Ondřej Chum. Fixing the locally optimized RANSAC: full experimental evaluation. In British Machine Vision Conference. Citeseer, 2012.
