Caffe: Notes on the Layer Class

Posted by 查志強 on 2015-08-07

[Original: https://yufeigan.github.io/2014/12/09/Caffe%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B03-Layer%E7%9A%84%E7%9B%B8%E5%85%B3%E5%AD%A6%E4%B9%A0/]

Layer

Layer is the base class of all layers. Five kinds of Layers are derived from it:

  • data_layer
  • neuron_layer
  • loss_layer
  • common_layer
  • vision_layer

Each has a corresponding pair of [.hpp .cpp] files that declare and implement the class interfaces. The five Layer types are covered one by one below.

data_layer

First, look at the headers included by data_layer.hpp:

#include "boost/scoped_ptr.hpp"
#include "hdf5.h"
#include "leveldb/db.h"
#include "lmdb.h"
// the first four headers all relate to data storage formats
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_transformer.hpp"
#include "caffe/filler.hpp"
#include "caffe/internal_thread.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

It is easy to see that data_layer mainly pulls in data-related headers. The official documentation states that data layers are the entry point for data into Caffe, sit at the lowest level of the network, and support multiple formats. Among them there are five LayerTypes:

  • DATA
  • MEMORY_DATA
  • HDF5_DATA
  • HDF5_OUTPUT
  • IMAGE_DATA

There are actually two more, WINDOW_DATA and DUMMY_DATA, used for testing and as a reserved interface; they are set aside here for now.

DATA

template <typename Dtype>
class BaseDataLayer : public Layer<Dtype>
template <typename Dtype>
class BasePrefetchingDataLayer : public BaseDataLayer<Dtype>, public InternalThread
template <typename Dtype>
class DataLayer : public BasePrefetchingDataLayer<Dtype>

The type used for input in the LevelDB or LMDB data formats. Its parameters are source, batch_size, (rand_skip), (backend); the last two are optional.
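
As a hedged sketch of the configuration in the same prototxt style used below (the source path and batch size are placeholder values, not from the original post):

layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_lmdb" # placeholder path to a LevelDB/LMDB database
    backend: LMDB                             # optional; defaults to LEVELDB
    batch_size: 64
  }
}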

MEMORY_DATA

template <typename Dtype>
class MemoryDataLayer : public BaseDataLayer<Dtype>

This type reads data directly from memory. To use it, you must call MemoryDataLayer::Reset. Its parameters are batch_size, channels, height, and width.
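
A minimal sketch in the same prototxt style (all dimensions are placeholder values); the data itself is then pushed in from C++ via MemoryDataLayer::Reset:

layers {
  name: "data"
  type: MEMORY_DATA
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 32 # placeholder values; all four fields are required
    channels: 3
    height: 227
    width: 227
  }
}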

HDF5_DATA

template <typename Dtype>
class HDF5DataLayer : public Layer<Dtype>

The type for input in the HDF5 data format. Its parameters are source and batch_size.
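
A minimal sketch (the source path is a placeholder); note that for HDF5 the source is a text file listing the .h5 files to read:

layers {
  name: "data"
  type: HDF5_DATA
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5/train_file_list.txt" # placeholder: one .h5 path per line
    batch_size: 64
  }
}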

HDF5_OUTPUT

template <typename Dtype>
class HDF5OutputLayer : public Layer<Dtype>

The type for output in the HDF5 data format. Its parameter is file_name.
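
A minimal sketch (the file name is a placeholder); this layer consumes two bottoms and writes them to disk rather than producing a top:

layers {
  name: "output"
  type: HDF5_OUTPUT
  bottom: "data"
  bottom: "label"
  hdf5_output_param {
    file_name: "output.h5" # placeholder output path
  }
}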

IMAGE_DATA

template <typename Dtype>
class ImageDataLayer : public BasePrefetchingDataLayer<Dtype>

The type for input of image-format data. Its parameters are source, batch_size, (rand_skip), (shuffle), (new_height), (new_width); the ones in parentheses are optional.
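
A minimal sketch (paths and sizes are placeholders); the source here is a text file with one "image_path label" pair per line:

layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    source: "data/train_list.txt" # placeholder list file
    batch_size: 32
    shuffle: true    # optional: shuffle the image list each epoch
    new_height: 256  # optional: resize every image to 256x256 on load
    new_width: 256
  }
}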

neuron_layer

First, look at the headers included by neuron_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

neuron_layer is likewise a layer that operates on data. It implements a large number of activation functions, mostly element-wise operations in which bottom and top have the same size.
Caffe implements many activation functions, with both CPU and GPU versions. Their common parent class is NeuronLayer:

template <typename Dtype>
class NeuronLayer : public Layer<Dtype>

There is nothing here that needs deep study for now, but the typical parameter format is worth noting (using ReLU as an example):

layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}

loss_layer

Loss layers compute the network error. The headers included by loss_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

Note that it includes neuron_layers.hpp, presumably because its functions are needed to compute the loss. As a rule, the loss sits in the last layer of the network. Caffe implements a large number of loss functions, and their common parent class is LossLayer:

template <typename Dtype>
class LossLayer : public Layer<Dtype>
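
As a hedged sketch (not from the original post): a typical classification network ends with a softmax loss layer, which in the same prototxt style takes the predictions and the ground-truth labels as its two bottoms:

layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"    # predictions (the blob names here are assumptions)
  bottom: "label"  # ground-truth labels from the data layer
  top: "loss"
}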

common_layer

First, the headers included by common_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

It uses the data_layers.hpp, loss_layers.hpp, and neuron_layers.hpp mentioned above, which indicates that more complex operations begin at this level.
These layers mainly act as the glue connecting vision_layers.
Nine types of common_layer are declared, some with GPU implementations:

  • InnerProductLayer
  • SplitLayer
  • FlattenLayer
  • ConcatLayer
  • SilenceLayer
  • (Elementwise Operations) These operate element by element across blobs (the per-element activation functions themselves live in neuron_layer):
    • EltwiseLayer
    • SoftmaxLayer
    • ArgMaxLayer
    • MVNLayer

InnerProductLayer

Often used as the fully connected layer. The configuration format is:

layers {
  name: "fc8"
  type: INNER_PRODUCT
  blobs_lr: 1     # learning rate multiplier for the filters
  blobs_lr: 2     # learning rate multiplier for the biases
  weight_decay: 1 # weight decay multiplier for the filters
  weight_decay: 0 # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

SplitLayer

Used when one input must feed multiple outputs (at the blob level).
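
A minimal sketch (blob names are assumptions): the single bottom is duplicated into two tops that downstream layers can consume independently:

layers {
  name: "split"
  type: SPLIT
  bottom: "data"
  top: "data_1"
  top: "data_2"
}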

FlattenLayer

Reshapes an n * c * h * w blob into vector form, n * (c * h * w) * 1 * 1.
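
A minimal sketch (blob names are assumptions); for example, a 64 * 256 * 6 * 6 bottom comes out as 64 * 9216 * 1 * 1:

layers {
  name: "flatten"
  type: FLATTEN
  bottom: "pool5"
  top: "flat"
}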

ConcatLayer

Used when multiple inputs feed into one output.

layers {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
}

SilenceLayer

Takes one or more input blobs and produces no output; it is typically used to silence blobs that would otherwise be treated as unused outputs of the network.

(Elementwise Operations)

EltwiseLayer, SoftmaxLayer, ArgMaxLayer, MVNLayer.
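
As a hedged sketch (not from the original post), an EltwiseLayer that adds two blobs element-wise looks like this; operation can also be PROD or MAX:

layers {
  name: "eltwise_sum"
  type: ELTWISE
  bottom: "branch1" # the two bottoms must have identical shapes
  bottom: "branch2"
  top: "sum"
  eltwise_param {
    operation: SUM
  }
}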

vision_layer

Its header includes all of the files above, which means it contains the most complex operations.

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/common_layers.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

It mainly implements the Convolution and Pooling operations, through the following classes:

template <typename Dtype>
class ConvolutionLayer : public Layer<Dtype>
template <typename Dtype>
class Im2colLayer : public Layer<Dtype>
template <typename Dtype>
class LRNLayer : public Layer<Dtype>
template <typename Dtype>
class PoolingLayer : public Layer<Dtype>

ConvolutionLayer

The most commonly used convolution operation. The configuration format is as follows:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1     # learning rate multiplier for the filters
  blobs_lr: 2     # learning rate multiplier for the biases
  weight_decay: 1 # weight decay multiplier for the filters
  weight_decay: 0 # weight decay multiplier for the biases
  convolution_param {
    num_output: 96  # learn 96 filters
    kernel_size: 11 # each filter is 11x11
    stride: 4       # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}

Im2colLayer

Similar to im2col in MATLAB, i.e. the image-to-column transformation: image patches are unrolled into columns so that convolution can then be computed conveniently as a matrix multiplication.

LRNLayer

Short for local response normalization layer; it is described in detail in the AlexNet paper, ImageNet Classification with Deep Convolutional Neural Networks (Krizhevsky, Sutskever, and Hinton).
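
A minimal sketch in the same prototxt style; the hyper-parameter values below are the commonly used AlexNet-style defaults, not taken from the original post:

layers {
  name: "norm1"
  type: LRN
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5 # number of adjacent channels to normalize over
    alpha: 0.0001 # scaling parameter
    beta: 0.75    # exponent
  }
}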

PoolingLayer

The Pooling operation. Format:

layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}
