Paper Reading Notes: Fully Convolutional Networks for Semantic Segmentation
This paper was a CVPR 2015 best paper candidate.
Paper: Fully Convolutional Networks for Semantic Segmentation
Please respect the original and credit it when reposting: http://blog.csdn.net/tangwei2014
1. Overview & Main Contributions
The paper proposes an end-to-end method for semantic segmentation, abbreviated FCN.
As shown in the figure below, the segmentation ground truth is used directly as the supervision signal: an end-to-end network is trained to make pixelwise predictions, outputting the label map directly.
2. Problems & Solutions
1) How to make pixelwise predictions?
Traditional classification networks subsample, so the output shrinks relative to the input; pixelwise prediction requires an output at the input's spatial resolution.
Solutions:
(1) Convert the final fully connected layers of classification networks such as AlexNet and VGG into convolutional layers.
For example, the first fully connected layer of VGG16 has a 25088×4096 weight matrix (25088 = 512×7×7). Reinterpreting it as 4096 convolution filters of shape 512×7×7, convolving over a larger input image (lower half of the figure) turns the node that used to output a single 4096-d feature vector (upper half of the figure) into one that outputs a coarse feature map.
The benefit is that a supervised pre-trained network can be reused directly: instead of training from scratch as earlier methods do, only fine-tuning is needed, which makes training efficient.
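The reinterpretation in (1) can be checked numerically. Below is a minimal NumPy sketch with scaled-down, hypothetical sizes (8 channels, a 3×3 window, 16 outputs, standing in for 512×7×7 → 4096): the fully connected weights applied to a flattened window give exactly the same answer as a convolution filter, and on a larger input the converted layer emits a coarse feature map instead of a single vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-ins for VGG16's fc6 (512x7x7 -> 4096):
C, K, OUT = 8, 3, 16                         # in-channels, kernel size, out-features
W = rng.standard_normal((OUT, C * K * K))    # the FC layer's weight matrix

def conv_valid(x, w):
    """'Valid' convolution (cross-correlation, as in deep learning).
    x: (C, H, W) input, w: (OUT, C, K, K) filters -> (OUT, H-K+1, W-K+1)."""
    out_c, _, k, _ = w.shape
    h = x.shape[1] - k + 1
    wd = x.shape[2] - k + 1
    out = np.zeros((out_c, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = x[:, i:i + k, j:j + k].ravel()
            out[:, i, j] = w.reshape(out_c, -1) @ patch
    return out

# 1) On a K x K input, the FC layer and the converted conv agree exactly.
x_small = rng.standard_normal((C, K, K))
fc_out = W @ x_small.ravel()                             # (16,)
conv_out = conv_valid(x_small, W.reshape(OUT, C, K, K))  # (16, 1, 1)
assert np.allclose(fc_out, conv_out[:, 0, 0])

# 2) On a larger input, the same weights yield a coarse feature map.
x_big = rng.standard_normal((C, 5, 5))
coarse = conv_valid(x_big, W.reshape(OUT, C, K, K))
print(coarse.shape)  # (16, 3, 3): one 16-d feature per 3x3 window
```

In the real network the same trick applies unchanged; only the sizes are larger, which is why a 500×500 input produces a whole grid of 4096-d features rather than one.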
(2) Add in-network upsampling layers.
The intermediate coarse feature maps are upsampled bilinearly with deconvolution (transposed convolution) layers; a deconvolution layer can be implemented simply by swapping the forward and backward passes of convolution.
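A minimal NumPy sketch of bilinear upsampling as a transposed convolution. The kernel formula is the standard bilinear initialization used for deconvolution layers; the 1-D scatter-add and crop offsets are my own simplified illustration, not the paper's exact layer.

```python
import numpy as np

def bilinear_kernel_1d(factor):
    """Bilinear interpolation weights for upsampling by `factor`.
    Kernel size 2*factor, the standard init for deconvolution layers."""
    size = 2 * factor
    center = (size - 1) / (2 * factor)
    i = np.arange(size)
    return 1 - np.abs(i / factor - center)

def upsample_1d(x, factor):
    """Transposed convolution with stride=factor and a bilinear kernel,
    cropped so the output is exactly len(x) * factor samples long."""
    ker = bilinear_kernel_1d(factor)
    size = len(ker)
    full = np.zeros(len(x) * factor + size - factor)
    for i, v in enumerate(x):
        full[i * factor:i * factor + size] += v * ker  # scatter-add
    crop = factor // 2
    return full[crop:crop + len(x) * factor]

print(bilinear_kernel_1d(2))    # [0.25 0.75 0.75 0.25]
out = upsample_1d(np.array([1.0, 1.0, 1.0, 1.0]), 2)
# interior of an upsampled constant signal stays 1.0 (edges taper off)
print(out)
```

A 2-D bilinear kernel is just the outer product of two such 1-D kernels. In FCN these weights only *initialize* the deconvolution layer; the filter is then learned end-to-end rather than kept fixed.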
2) How to refine the output and get better results?
With an upsampling stride of 32, a 3×500×500 input yields a 544×544 output; the edges are poor, and such a large stride limits the scale of detail in the upsampled output.
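The 544 figure is consistent with the transposed-convolution output-size formula. The specific layer sizes below (input padded by 100, VGG16's five 2×2 pools in ceil mode, fc6 as a 7×7 valid conv, then a stride-32 deconvolution with a 64-wide kernel) come from the released FCN Caffe configuration, not from the prose above, so treat this as a reconstruction:

```python
def tconv_out(n, kernel, stride):
    """Output length of a transposed convolution (no extra crop/padding)."""
    return (n - 1) * stride + kernel

# 500x500 input, padded by 100 on each side -> 700 effective
n = 700
for _ in range(5):        # VGG16's five 2x2 max-pools (ceil mode)
    n = -(-n // 2)        # ceil division
n = n - 7 + 1             # fc6 reinterpreted as a 7x7 'valid' convolution
print(n)                  # 16: side length of the coarse score map

print(tconv_out(n, kernel=64, stride=32))  # 544, as quoted above
```

The network then crops this 544×544 map back to the 500×500 input frame, which is where the poor borders come from.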
Solution:
Use skip layers: reduce the upsampling stride at shallower layers, fuse the resulting fine layers with the coarse layers from deeper in the network, and then upsample to produce the output. This combines local and global information, what the paper calls "combining what and where", and the gain is clear: FCN-32s scores 59.4, FCN-16s improves to 62.4, and FCN-8s to 62.7 (mean IU).
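The skip fusion can be sketched as a shape-level mock-up of the FCN-16s idea. Everything here is illustrative: nearest-neighbor repetition stands in for the learned 2× and 16× deconvolutions, and random matrices stand in for the 1×1 scoring convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 21                     # PASCAL VOC: 20 classes + background

def score_1x1(feat, w):
    """1x1 convolution: a per-pixel linear map from channels to class scores."""
    c, h, wd = feat.shape
    return (w @ feat.reshape(c, h * wd)).reshape(-1, h, wd)

def upsample(score, factor):
    """Nearest-neighbor stand-in for a learned bilinear deconvolution."""
    return score.repeat(factor, axis=1).repeat(factor, axis=2)

H = W_ = 224                                            # hypothetical input size
pool4 = rng.standard_normal((512, H // 16, W_ // 16))   # stride-16 features
conv7 = rng.standard_normal((4096, H // 32, W_ // 32))  # stride-32 features

w_pool4 = rng.standard_normal((NUM_CLASSES, 512)) * 0.01
w_conv7 = rng.standard_normal((NUM_CLASSES, 4096)) * 0.01

# FCN-16s: upsample the coarse stride-32 scores by 2x, add the stride-16
# scores from pool4, then upsample the fused map by 16x to input resolution.
fused = upsample(score_1x1(conv7, w_conv7), 2) + score_1x1(pool4, w_pool4)
out = upsample(fused, 16)
print(out.shape)   # (21, 224, 224): pixelwise class scores
```

FCN-8s repeats the same move once more, adding a pool3 score map at stride 8 before the final upsampling.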
3. Training Details
Initialize from a model pre-trained as AlexNet, VGG16, or GoogLeNet, and fine-tune from there; all layers are fine-tuned.
Training uses whole images rather than patchwise sampling; experiments show that training directly on full images is already effective and efficient.
The class-scoring convolution layer is zero-initialized; random initialization offers no advantage in either performance or convergence.
4. Results
State-of-the-art, of course. See for yourself: