Zero-shot Learning Paper Reading (1): Learning to detect unseen object classes by between-class attribute transfer
The paper "Learning to detect unseen object classes by between-class attribute transfer" was the first to formulate the zero-shot learning problem, and it gives a solution based on object attributes.
Algorithm overview
Setup
$(x_1,l_1),\cdots,(x_n,l_n)$ are $n$ pairs of training samples $x$ and their class labels $l$. The labels take values in $K$ classes, denoted $Y=\{y_1,\cdots,y_K\}$. $Z=\{z_1,\cdots,z_L\}$ are the $L$ classes that appear in the test set. $Y$ and $Z$ are the seen and unseen classes respectively, and they are disjoint.
Goal
Learn a classifier $f:X\rightarrow Z$. In other words, use the training data $x$ and the seen class labels $l$ to learn how samples relate to the unseen class labels $Z$.
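Idea
Build a human-defined attribute layer A. This layer is high-level: it describes properties of the training samples such as color or stripes. The aim is to move from a classifier over low-level image features to a layer of high-level semantic attributes. This broadens what the classifier can recognize and makes it possible to cross class boundaries.
Based on this idea, the authors propose two methods, DAP and IAP.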
Details
DAP(Direct attribute prediction)
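As in the DAP graphical model in the paper, DAP inserts an attribute layer A between the samples and the class labels. $a$ is an $M$-dimensional attribute vector $(a_1,\cdots,a_M)$; each dimension represents one attribute and takes a value in $\{0,1\}$. Every label has an $M$-dimensional vector as its attribute vector (its prototype). Training on the attributes associated with the training set $X$ yields the attribute-layer parameters $\beta$, which in turn give $P(a \mid x)$.
For a test instance $x$, the output label is treated as the quantity to estimate: following the MAP idea, the class with the highest posterior probability is returned as the prediction.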
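The attribute-layer training described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: logistic regression trained by plain gradient descent stands in for whatever per-attribute classifier is actually used, every training sample is assumed to inherit the attribute vector of its class, and the class names, features, and hyperparameters are hypothetical toy values.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def train_attribute_classifier(xs, targets, lr=0.5, epochs=500):
    """Fit one logistic-regression classifier for a single attribute."""
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0  # these play the role of beta for this attribute
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(xs, targets):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def attribute_probs(x, classifiers):
    """Return [p(a_1 = 1 | x), ..., p(a_M = 1 | x)]."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in classifiers]

# Hypothetical seen classes, each with M = 2 binary attributes (prototypes).
seen_prototypes = {"y1": (1, 0), "y2": (0, 1)}
# Hypothetical 2-D features paired with seen-class labels.
train = [([2.0, -2.0], "y1"), ([1.5, -1.0], "y1"),
         ([-2.0, 2.0], "y2"), ([-1.0, 1.5], "y2")]

xs = [x for x, _ in train]
classifiers = []
for m in range(2):
    # Assumption: each sample's target for attribute m is its class prototype entry.
    targets = [seen_prototypes[label][m] for _, label in train]
    classifiers.append(train_attribute_classifier(xs, targets))

probs = attribute_probs([1.8, -1.5], classifiers)
print(probs)
```

One independent classifier per attribute matches the factorized form $\prod_m p(a_m \mid x)$ used in the derivation; the class labels only enter through the prototypes.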
For the principle behind MAP, see https://blog.csdn.net/River_J777/article/details/111500068
The posterior of $z$ is:
$$p(z \mid x)=\sum_{a \in\{0,1\}^{M}} p(z \mid a) p(a \mid x)$$
By Bayes' rule:
$$=\sum_{a \in\{0,1\}^{M}} \frac{p(a \mid z) p(z)}{p(a)} p(a \mid x)$$
By the paper's assumption that the attribute dimensions are conditionally independent (a rather strong assumption, and the main weakness of DAP):
$$=\sum_{a \in\{0,1\}^{M}} \frac{p(a \mid z) p(z)}{p(a)} \prod_{m=1}^{M} p\left(a_{m} \mid x\right)$$
Using the Iverson bracket $[[x]]$, which is 1 if the enclosed statement is true and 0 otherwise, we have $p(a \mid z)=\left[\left[a=a^{z}\right]\right]$, so:
$$=\sum_{a \in\{0,1\}^{M}} \frac{p(z)}{p(a)}\left[\left[a=a^{z}\right]\right] \prod_{m=1}^{M} p\left(a_{m} \mid x\right)$$
The Iverson bracket is nonzero only when $a=a^{z}$, so inside the sum $p(a)$ can be replaced by $p\left(a^{z}\right)$:
$$=\sum_{a \in\{0,1\}^{M}} \frac{p(z)}{p\left(a^{z}\right)}\left[\left[a=a^{z}\right]\right] \prod_{m=1}^{M} p\left(a_{m} \mid x\right)$$
Moving the constant factor out of the sum:
$$=\frac{p(z)}{p\left(a^{z}\right)} \sum_{a \in\{0,1\}^{M}}\left[\left[a=a^{z}\right]\right] \prod_{m=1}^{M} p\left(a_{m} \mid x\right)$$
Dropping the terms that are zero:
$$=\frac{p(z)}{p\left(a^{z}\right)} \prod_{m=1}^{M} p\left(a_{m}^{z} \mid x\right)$$
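The collapse of the sum over all $a \in \{0,1\}^M$ into a single product can be checked numerically. The sketch below compares the summed form with the closed form on made-up numbers; the function names, the factorial attribute prior, and all probability values are assumptions chosen for illustration only.

```python
from itertools import product
from math import isclose, prod

def p_a_given_x(a, attr_probs):
    """prod_m p(a_m | x), where attr_probs[m] = p(a_m = 1 | x)."""
    return prod(p if am == 1 else 1.0 - p for p, am in zip(attr_probs, a))

def posterior_by_sum(attr_probs, prototype, p_z, p_a):
    """Sum over every a in {0,1}^M, keeping the Iverson bracket explicit."""
    total = 0.0
    for a in product((0, 1), repeat=len(prototype)):
        iverson = 1.0 if a == tuple(prototype) else 0.0  # [[a = a^z]]
        total += p_z / p_a(a) * iverson * p_a_given_x(a, attr_probs)
    return total

def posterior_closed_form(attr_probs, prototype, p_z, p_a):
    """p(z) / p(a^z) * prod_m p(a_m^z | x)."""
    a_z = tuple(prototype)
    return p_z / p_a(a_z) * p_a_given_x(a_z, attr_probs)

# Hypothetical per-attribute priors p(a_m = 1); a factorial prior is assumed.
attr_priors = [0.6, 0.3, 0.5]

def factorial_prior(a):
    return prod(q if am == 1 else 1.0 - q for q, am in zip(attr_priors, a))

attr_probs = [0.9, 0.2, 0.7]  # hypothetical p(a_m = 1 | x)
prototype = (1, 0, 1)         # hypothetical a^z for one unseen class
by_sum = posterior_by_sum(attr_probs, prototype, 0.1, factorial_prior)
closed = posterior_closed_form(attr_probs, prototype, 0.1, factorial_prior)
print(isclose(by_sum, closed))
```

The summed form visits $2^M$ attribute vectors while the closed form needs only $M$ factors, which is why the deterministic $p(a \mid z)$ makes inference cheap.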
With the posterior of $z$ in hand, a test instance $x$ is scored against each unseen label $z_1,\cdots,z_L$, and the label with the highest score is chosen:
$$f(x)=\operatorname{argmax}_{l=1,\ldots,L} \frac{p\left(z_{l}\right)}{p\left(a^{z_{l}}\right)} \prod_{m=1}^{M} p\left(a_{m}^{z_{l}} \mid x\right)$$
Assuming equal class priors, so that $p\left(z_{l}\right)$ does not affect the argmax, and independent attributes, so that $p\left(a^{z_{l}}\right)=\prod_{m=1}^{M} p\left(a_{m}^{z_{l}}\right)$:
$$=\operatorname{argmax}_{l=1,\ldots,L} \frac{\prod_{m=1}^{M} p\left(a_{m}^{z_{l}} \mid x\right)}{\prod_{m=1}^{M} p\left(a_{m}^{z_{l}}\right)}$$
$$=\operatorname{argmax}_{l=1,\ldots,L} \prod_{m=1}^{M} \frac{p\left(a_{m}^{z_{l}} \mid x\right)}{p\left(a_{m}^{z_{l}}\right)}$$
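The decision rule above can be sketched in a few lines. This is only an illustration: the class names, attribute meanings, probabilities, and priors are hypothetical, and the product is evaluated as a sum of logs (with a small clamp `eps`) purely for numerical stability, which leaves the argmax unchanged.

```python
import math

def dap_predict(attr_probs, unseen_prototypes, attr_priors, eps=1e-12):
    """Return the unseen label maximizing prod_m p(a_m^z | x) / p(a_m^z)."""
    best_label, best_score = None, -math.inf
    for label, proto in unseen_prototypes.items():
        score = 0.0
        for p, q, am in zip(attr_probs, attr_priors, proto):
            p_x = p if am == 1 else 1.0 - p      # p(a_m^{z_l} | x)
            p_prior = q if am == 1 else 1.0 - q  # p(a_m^{z_l})
            score += math.log(max(p_x, eps)) - math.log(max(p_prior, eps))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical unseen classes with attributes (striped, four-legged, aquatic).
unseen_prototypes = {"zebra": (1, 1, 0), "whale": (0, 0, 1)}
attr_probs = [0.9, 0.8, 0.1]   # hypothetical p(a_m = 1 | x) from the attribute layer
attr_priors = [0.5, 0.5, 0.5]  # hypothetical p(a_m = 1)

predicted = dap_predict(attr_probs, unseen_prototypes, attr_priors)
print(predicted)  # zebra
```

No training image of either unseen class is needed here; the only class-specific input is the prototype vector.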
The output $f(x)$ is the predicted label for the input $x$.
IAP (Indirect attribute prediction)
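The difference from DAP lies in where the attribute layer sits. In DAP's PGM the attribute layer lies between the instance layer and the label layer (seen and unseen). IAP instead places the attribute layer between the seen-label layer and the unseen-label layer, and uses it to transfer the information in the seen class labels and instances to the unseen-label layer.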
The principle is similar to DAP. Here the attribute posterior is:
$$p\left(a_{m} \mid x\right)=\sum_{k=1}^{K} p\left(a_{m} \mid y_{k}\right) p\left(y_{k} \mid x\right)$$
With this posterior, the posterior of $z$ follows, and MAP is applied exactly as in DAP.
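The IAP attribute posterior can be sketched as below. It assumes a deterministic $p(a_m \mid y_k)=[[a_m=a_m^{y_k}]]$, mirroring the deterministic $p(a \mid z)$ used for DAP, so the sum reduces to adding up the class posteriors of the seen classes that have attribute $m$. The class posteriors $p(y_k \mid x)$ would come from an ordinary multi-class classifier over the seen classes; the values and names here are hypothetical.

```python
from math import isclose

def iap_attribute_probs(seen_class_probs, seen_prototypes):
    """Return [p(a_m = 1 | x)] = sum_k a_m^{y_k} * p(y_k | x) for each m."""
    num_attrs = len(next(iter(seen_prototypes.values())))
    return [
        sum(seen_class_probs[y] * seen_prototypes[y][m] for y in seen_class_probs)
        for m in range(num_attrs)
    ]

# Hypothetical seen-class posteriors p(y_k | x) for one test instance.
seen_class_probs = {"y1": 0.7, "y2": 0.2, "y3": 0.1}
# Hypothetical seen-class attribute prototypes with M = 2.
seen_prototypes = {"y1": (1, 0), "y2": (0, 1), "y3": (1, 1)}

iap_probs = iap_attribute_probs(seen_class_probs, seen_prototypes)
print(iap_probs)  # approximately [0.8, 0.3]
```

The resulting list has the same shape as the attribute probabilities in DAP, so the same argmax over unseen prototypes can be applied to it.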