Paper -- DenseNet: Densely Connected Convolutional Networks
Abstract:
DenseNet breaks away from the conventional strategies for improving network performance, namely making the network deeper (ResNet) or wider (Inception). Instead, it works from the perspective of features: through feature reuse and bypass connections it greatly reduces the number of parameters and alleviates the vanishing-gradient problem to some extent.
DenseNets have several compelling advantages:
Alleviate the vanishing-gradient problem
Strengthen feature propagation
Encourage feature reuse
Substantially reduce the number of parameters
From the figure we can draw the following conclusions:
a) Some features extracted by earlier layers may still be used directly by much deeper layers.
b) Even the transition layer uses features from all layers of the preceding dense block.
c) Layers in the 2nd and 3rd dense blocks make little use of the preceding transition layer's outputs, indicating that the transition layer produces many redundant features. This also supports DenseNet-BC, i.e. the necessity of compression.
d) Although the final classification layer uses information from many layers of the preceding dense block, it leans toward the last few feature maps, suggesting that some high-level features are only generated in the last few layers of the network.
Every preceding layer adds a shortcut connection to the current layer, so that any two layers of the network can "communicate" directly, as shown in the figure below:
Benefits:
From the feature point of view, each time a layer's features are reused they can be regarded as undergoing a new normalization; the experimental results show that even with BN removed, a deep DenseNet still converges well.
From the receptive-field point of view, shallow and deep receptive fields can be combined more freely, which makes the model more robust.
From the wide-network point of view, DenseNet can be seen as a genuinely wide network; during training its gradients are more stable than ResNet's, so it naturally converges faster (the experiments in the paper support this).
The input of each layer in DenseNet is the output of all preceding layers, so there is a connection between any two layers. In practice, however, feature-map sizes differ across stages of the network, which makes it inconvenient to concatenate arbitrary pairs of layers. Inspired by GoogLeNet, the paper proposes the Dense Block: within each block all layers keep dense connectivity, while there is no dense connectivity between blocks; instead, blocks are connected through transition layers.
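As a rough illustration (not the authors' reference implementation), the following PyTorch sketch shows how a dense block keeps dense connectivity internally while blocks are chained through transition layers. Here `make_layer` is a hypothetical factory for the layer function H; concrete forms of H are sketched under the Composite function and Bottleneck sections below.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Dense connectivity inside one block: layer i sees all earlier outputs."""
    def __init__(self, in_channels, growth_rate, num_layers, make_layer):
        super().__init__()
        # make_layer(in_ch, k) builds one layer H that outputs k feature maps.
        self.layers = nn.ModuleList(
            [make_layer(in_channels + i * growth_rate, growth_rate)
             for i in range(num_layers)])

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Concatenate everything produced so far along the channel axis.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

# Blocks are then chained as: block -> transition (down-sampling) -> block -> ...
# A transition layer is sketched in the Compression section below.
```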
DenseNets
ResNets [11] add a skip-connection that bypasses the non-linear transformations with an identity function
$\mathbf{x}_{\ell} = H_{\ell}(\mathbf{x}_{\ell-1}) + \mathbf{x}_{\ell-1}$
Dense connectivity
$\mathbf{x}_{\ell} = H_{\ell}([\mathbf{x}_{0}, \mathbf{x}_{1}, \ldots, \mathbf{x}_{\ell-1}])$
where $[\mathbf{x}_{0}, \mathbf{x}_{1}, \ldots, \mathbf{x}_{\ell-1}]$ refers to the concatenation of the feature-maps produced in layers $0, \ldots, \ell-1$.
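A minimal tensor-level sketch of the difference between the two formulas above (the shapes are made up for illustration): ResNet adds H(x) to x element-wise, so channel counts must match and stay fixed, whereas DenseNet concatenates along the channel dimension, so channels accumulate.

```python
import torch

# Hypothetical feature maps of shape (batch, channels, height, width).
x_prev = torch.randn(1, 32, 8, 8)   # x_{l-1}
h_out  = torch.randn(1, 32, 8, 8)   # H_l(x_{l-1})

resnet_out   = h_out + x_prev                     # identity shortcut: still 32 channels
densenet_out = torch.cat([x_prev, h_out], dim=1)  # concatenation: 64 channels
print(resnet_out.shape, densenet_out.shape)       # (1, 32, 8, 8) and (1, 64, 8, 8)
```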
Composite function
H is defined as a composite function of three consecutive operations: batch normalization (BN), followed by a rectified linear unit (ReLU) and a 3 × 3 convolution (Conv).
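A minimal sketch of this composite function in PyTorch (one possible `make_layer` for the block sketch above); the function and argument names are mine, not the paper's.

```python
import torch.nn as nn

def composite_function(in_channels, growth_rate):
    # H_l: BN -> ReLU -> 3x3 Conv, producing k = growth_rate feature maps.
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False),
    )
```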
Pooling layers: down-sampling changes the size of the feature maps, and concatenation is only possible when sizes match, so pooling is placed in the transition layers between dense blocks.
Growth rate: if each function H produces k feature-maps as output, we refer to the hyper-parameter k as the growth rate of the network.
Bottleneck layers: although each layer only outputs k feature maps, it still receives far more inputs. The usual remedy is dimensionality reduction: before the 3×3 convolution, a 1×1 convolution first reduces the number of input channels to 4k, i.e. an extra 1×1 convolution is added to the definition of H.
Although each layer only produces k output feature maps, it typically has many more inputs. It has been noted in [36, 11] that a 1×1 convolution can be introduced as bottleneck layer before each 3×3 convolution to reduce the number of input feature-maps
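A minimal sketch of the bottleneck version of H (the DenseNet-B style layer), using the 4k intermediate width mentioned above; the function and argument names are mine.

```python
import torch.nn as nn

def bottleneck_layer(in_channels, growth_rate):
    # BN -> ReLU -> 1x1 Conv (to 4k channels) -> BN -> ReLU -> 3x3 Conv (to k channels)
    inter_channels = 4 * growth_rate
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False),
        nn.BatchNorm2d(inter_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(inter_channels, growth_rate, kernel_size=3, padding=1, bias=False),
    )
```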
Why are there so many inputs? If each function H produces k feature-maps as output, it follows that the ℓ-th layer has k₀ + k × (ℓ − 1) input feature-maps, where k₀ is the number of channels in the input image.
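A quick numeric check of that formula (k₀ = 64 and k = 32 are example values chosen for illustration):

```python
k0, k = 64, 32                # example: input channels and growth rate
for l in range(1, 7):
    in_ch = k0 + k * (l - 1)  # input feature-maps seen by the l-th layer
    print(f"layer {l}: {in_ch} input channels")
# Layer 6 already sees 224 input channels, which is why the 1x1 bottleneck helps.
```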
Compression: to further improve model compactness, the number of feature-maps is reduced at the transition layers. If a dense block contains m feature-maps, the following transition layer generates ⌊θm⌋ output feature-maps, where 0 < θ ≤ 1 is referred to as the compression factor.
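A minimal sketch of a transition layer with compression (θ = 0.5 as in DenseNet-BC), following the BN-ReLU-Conv-Pool ordering used in common public implementations; the 2×2 average pooling also provides the down-sampling mentioned under Pooling layers above.

```python
import torch.nn as nn

def transition_layer(in_channels, theta=0.5):
    out_channels = int(theta * in_channels)  # floor(theta * m) output feature maps
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
        nn.AvgPool2d(kernel_size=2, stride=2),  # halves the spatial size between blocks
    )
```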
In short: simply adding more shortcut connections improves accuracy while keeping the amount of computation and the number of parameters low.