[Code-Oriented] Learning Deep Learning (3): Convolutional Neural Networks (CNN)
==========================================================================================
I've been reading a lot about Deep Learning lately, plenty of blogs and papers,
but to be honest I've been neglecting actual implementation: my machine isn't great, and I'm not yet able to write a toolbox from scratch.
So far I've only followed Andrew Ng's UFLDL tutorial and filled in code for its existing exercise framework (that code is on GitHub).
Later I found a MATLAB Deep Learning toolbox whose code is quite simple, which makes it a good vehicle for studying the algorithms.
Another plus is that a MATLAB implementation skips a lot of data-structure plumbing, so the algorithmic ideas come through very clearly.
So I'd like to walk through this toolbox's code to consolidate what I've learned, and to lay the groundwork for the next step of putting it into practice.
(This post only reads the algorithms from the code's point of view; for the concrete theory you still need the papers.
I'll give the names of some relevant papers along the way. The aim is to lay out the flow of the algorithm, not to dig into its derivations and formulas.)
==========================================================================================
Code used: DeepLearnToolbox (download link: click to open). Thanks to the author of this toolbox.
==========================================================================================
Today it is CNN's turn. CNNs are a bit tangled to explain, so you may want to read up on convolution and pooling (subsampling) first, and also this post by tornadomeet.
Below is that classic figure (the LeNet-style architecture diagram):
======================================================================================================
Open \tests\test_example_CNN.m and take a look:
cnn.layers = {
    struct('type', 'i')                                     % input layer
    struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5)   % convolution layer
    struct('type', 's', 'scale', 2)                         % subsampling layer
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5)  % convolution layer
    struct('type', 's', 'scale', 2)                         % subsampling layer
};
cnn = cnnsetup(cnn, train_x, train_y);        % here!!!
opts.alpha = 1;
opts.batchsize = 50;
opts.numepochs = 1;
cnn = cnntrain(cnn, train_x, train_y, opts);  % here!!!
This time things look a bit more involved. First the layers: there are three types, 'i' for input, 'c' for convolution and 's' for subsampling.
For a 'c' layer, outputmaps is the number of feature maps the convolution produces; in the classic figure above, for example, the first convolution layer produces six feature maps.
For a 'c' layer, kernelsize is simply the size of the patch used for the convolution.
For an 's' layer, scale means pooling is done over scale*scale regions (a quick walk-through of the resulting map sizes is sketched right below).
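Before reading cnnsetup, it helps to trace how the map size evolves through the example layers above; this is just size bookkeeping for the MNIST case (28x28 inputs), not toolbox code:
% Map-size walk-through for the example architecture above (28x28 MNIST inputs assumed).
mapsize = [28 28];             % 'i'  input layer
mapsize = mapsize - 5 + 1;     % 'c'  kernelsize 5  -> 24x24, 6 maps
mapsize = mapsize / 2;         % 's'  scale 2       -> 12x12, 6 maps
mapsize = mapsize - 5 + 1;     % 'c'  kernelsize 5  ->  8x8, 12 maps
mapsize = mapsize / 2;         % 's'  scale 2       ->  4x4, 12 maps
fvnum = prod(mapsize) * 12     % 4*4*12 = 192, the length of the final feature vector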
After that it's the usual routine, cnnsetup() and cnntrain(). Let's look at the code.
\CNN\cnnsetup.m
function net = cnnsetup(net, x, y)
    inputmaps = 1;
    mapsize = size(squeeze(x(:, :, 1)));
    % pay particular attention to how these loops are parameterized
    % numel(net.layers) is the number of layers
    for l = 1 : numel(net.layers)   % layer
        if strcmp(net.layers{l}.type, 's')
            mapsize = mapsize / net.layers{l}.scale;
            % mapsize at a subsampling layer: it starts as the size of each input image
            % (28*28 in the MNIST example), becomes 24*24 after the first convolution,
            % and dividing by scale here gives the post-pooling size, 12*12
            assert(all(floor(mapsize)==mapsize), ['Layer ' num2str(l) ' size must be integer. Actual: ' num2str(mapsize)]);
            for j = 1 : inputmaps   % inputmaps is the number of feature maps in the previous layer; it starts at 1 and is updated layer by layer
                net.layers{l}.b{j} = 0;
            end
        end
        if strcmp(net.layers{l}.type, 'c')
            mapsize = mapsize - net.layers{l}.kernelsize + 1;
            % for this mapsize formula, see the explanation under the convolution figure in UFLDL
            fan_out = net.layers{l}.outputmaps * net.layers{l}.kernelsize ^ 2;
            % fan-out of this layer: (number of output feature maps) * (size of the convolution patch)
            for j = 1 : net.layers{l}.outputmaps   % output map
                fan_in = inputmaps * net.layers{l}.kernelsize ^ 2;
                % for each output feature map, how many parameters connect it to the previous layer
                for i = 1 : inputmaps   % input map
                    net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * sqrt(6 / (fan_in + fan_out));
                end
                net.layers{l}.b{j} = 0;
            end
            inputmaps = net.layers{l}.outputmaps;
        end
    end
    % 'onum' is the number of labels, that's why it is calculated using size(y, 1). If you have 20 labels, the output of the network will be 20 neurons.
    % 'fvnum' is the number of output neurons at the last layer, the layer just before the output layer.
    % 'ffb' is the biases of the output neurons.
    % 'ffW' is the weights between the last layer and the output neurons. Note that the last layer is fully connected to the output layer, that's why the size of the weights is (onum * fvnum)
    fvnum = prod(mapsize) * inputmaps;
    onum = size(y, 1);
    % this sets up the final (fully connected output) layer of the network
    net.ffb = zeros(onum, 1);
    net.ffW = (rand(onum, fvnum) - 0.5) * 2 * sqrt(6 / (onum + fvnum));
end
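Plugging in the numbers for the MNIST example gives the size of this final fully connected layer; a small sanity check, not toolbox code (it assumes the 28x28, 10-class setup of test_example_CNN):
% Output-layer dimensions for the example network: 4x4 final maps, 12 of them, 10 labels.
fvnum = 4 * 4 * 12;                         % 192 features going into the output layer
onum  = 10;                                 % one output neuron per label
r     = sqrt(6 / (onum + fvnum));           % ~0.17, range of the fan-scaled uniform init
ffW   = (rand(onum, fvnum) - 0.5) * 2 * r;  % uniform in [-r, r], as in cnnsetup
ffb   = zeros(onum, 1);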
\CNN\cnntrain.m
cnntrain follows the same rhythm as nntrain:
net = cnnff(net, batch_x);
net = cnnbp(net, batch_y);
net = cnnapplygrads(net, opts);
cnntrain computes the gradients with back propagation. Let's look at these three functions one by one:
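For orientation, the loop around those three calls in cnntrain looks roughly like this; a sketch from memory rather than a verbatim copy of the toolbox file (the running-loss field rL in particular is only illustrative):
% Rough shape of cnntrain (sketch; see \CNN\cnntrain.m for the exact code).
numbatches = size(train_x, 3) / opts.batchsize;
net.rL = [];                                          % running loss, illustrative
for epoch = 1 : opts.numepochs
    kk = randperm(size(train_x, 3));                  % shuffle the samples
    for b = 1 : numbatches
        idx = kk((b - 1) * opts.batchsize + 1 : b * opts.batchsize);
        batch_x = train_x(:, :, idx);
        batch_y = train_y(:, idx);
        net = cnnff(net, batch_x);                    % forward pass
        net = cnnbp(net, batch_y);                    % gradients via backprop
        net = cnnapplygrads(net, opts);               % gradient-descent update
        net.rL(end + 1) = net.L;                      % record the batch loss
    end
end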
cnnff.m
function net = cnnff(net, x)
    n = numel(net.layers);
    net.layers{1}.a{1} = x;
    inputmaps = 1;
    for l = 2 : n   % for each layer
        if strcmp(net.layers{l}.type, 'c')
            % !!below can probably be handled by insane matrix operations
            for j = 1 : net.layers{l}.outputmaps   % for each output map
                % create temp output map
                z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
                for i = 1 : inputmaps   % for each input map
                    % convolve with corresponding kernel and add to temp output map
                    % (as in UFLDL: convolve each input feature map, then sum the results)
                    z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
                end
                % add bias, pass through nonlinearity
                net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j});
            end
            % set number of input maps to this layer's number of output maps
            inputmaps = net.layers{l}.outputmaps;
        elseif strcmp(net.layers{l}.type, 's')
            % downsample
            for j = 1 : inputmaps
                % slightly roundabout: convolve with an averaging patch, then read the result out
                % with a stride of scale; together this is mean pooling
                z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');   % !! replace with variable
                net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
            end
        end
    end
    % concatenate all end layer feature maps into one vector per sample, for convenience later
    net.fv = [];
    for j = 1 : numel(net.layers{n}.a)
        sa = size(net.layers{n}.a{j});
        net.fv = [net.fv; reshape(net.layers{n}.a{j}, sa(1) * sa(2), sa(3))];
    end
    % final layer of perceptrons: the recognition result for the data
    net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
end
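The roundabout pooling in the 's' branch (convolve with an averaging kernel, then read out every scale-th element) is easy to verify on a toy map; a minimal standalone sketch, not toolbox code:
% Mean pooling with scale 2 via convolution + strided read, on a single 4x4 map.
a = magic(4);                                      % toy 4x4 feature map
scale = 2;
z = convn(a, ones(scale) / scale^2, 'valid');      % every 2x2 mean, computed at stride 1
pooled = z(1:scale:end, 1:scale:end);              % keep every scale-th entry -> 2x2 result
% pooled(1,1) equals mean(mean(a(1:2,1:2))), and likewise for the other blocks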
cnnbp.m
This one made me groan a little: the code is somewhat tangled, so I had to go back to the references, and 《Notes on Convolutional Neural Networks》 helps the most. One thing to keep straight: unlike that note, this toolbox applies no sigmoid activation in the subsampling (pooling) layers, it simply pools. As a consequence the subsampling layers here have no parameters of their own and no gradients are computed for them, whereas the note does compute them.
Also, this toolbox has no combinations of feature maps, i.e. the connection table shown in tornadomeet's blog post:
For the details, go and read the paper above.
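Concretely, the two delta-propagation rules that the backward loop below implements can be written as follows; this is just my transcription of the code (with the toolbox's simplification that the pooling layers carry no sigmoid), so treat it as a summary rather than the paper's own notation:
\[ \delta_l^{j} = a_l^{j} \odot \bigl(1 - a_l^{j}\bigr) \odot \tfrac{1}{s^2}\,\operatorname{up}_s\!\bigl(\delta_{l+1}^{j}\bigr) \qquad \text{('c' layer followed by an 's' layer with scale } s\text{)} \]
\[ \delta_l^{i} = \sum_{j} \delta_{l+1}^{j} \,*_{\text{full}}\, \operatorname{rot180}\!\bigl(k_{l+1}^{ij}\bigr) \qquad \text{('s' layer followed by a 'c' layer)} \]
where up_s replicates each entry of the delta map over an s-by-s block (the expand() call in the code).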
Now let's look at the code:
function net = cnnbp(net, y)
    n = numel(net.layers);
    % error
    net.e = net.o - y;
    % loss function (mean squared error over the batch)
    net.L = 1/2 * sum(net.e(:) .^ 2) / size(net.e, 2);

    %% backprop deltas
    % work backwards from the output error, much like the bp of an ordinary neural network
    net.od = net.e .* (net.o .* (1 - net.o));   % output delta
    net.fvd = (net.ffW' * net.od);              % feature vector delta
    if strcmp(net.layers{n}.type, 'c')          % only conv layers have a sigm nonlinearity
        net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
    end

    % reshape feature vector deltas into output-map style (cf. the bp of an ordinary nn)
    sa = size(net.layers{n}.a{1});
    fvnum = sa(1) * sa(2);
    for j = 1 : numel(net.layers{n}.a)
        net.layers{n}.d{j} = reshape(net.fvd(((j - 1) * fvnum + 1) : j * fvnum, :), sa(1), sa(2), sa(3));
    end

    % propagate the deltas backwards through the layers
    % see Notes on Convolutional Neural Networks for this part; the computation is a bit involved,
    % and remember that here the subsampling layers have no sigmoid and no parameters of their own
    for l = (n - 1) : -1 : 1
        if strcmp(net.layers{l}.type, 'c')
            for j = 1 : numel(net.layers{l}.a)
                net.layers{l}.d{j} = net.layers{l}.a{j} .* (1 - net.layers{l}.a{j}) .* (expand(net.layers{l + 1}.d{j}, [net.layers{l + 1}.scale net.layers{l + 1}.scale 1]) / net.layers{l + 1}.scale ^ 2);
            end
        elseif strcmp(net.layers{l}.type, 's')
            for i = 1 : numel(net.layers{l}.a)
                z = zeros(size(net.layers{l}.a{1}));
                for j = 1 : numel(net.layers{l + 1}.a)
                    z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
                end
                net.layers{l}.d{i} = z;
            end
        end
    end

    %% calc gradients
    % see the paper; note that gradients are only computed for the 'c' layers, because only they have parameters
    for l = 2 : n
        if strcmp(net.layers{l}.type, 'c')
            for j = 1 : numel(net.layers{l}.a)
                for i = 1 : numel(net.layers{l - 1}.a)
                    net.layers{l}.dk{i}{j} = convn(flipall(net.layers{l - 1}.a{i}), net.layers{l}.d{j}, 'valid') / size(net.layers{l}.d{j}, 3);
                end
                net.layers{l}.db{j} = sum(net.layers{l}.d{j}(:)) / size(net.layers{l}.d{j}, 3);
            end
        end
    end
    % gradients of the final perceptron (fully connected output) layer
    net.dffW = net.od * (net.fv)' / size(net.od, 2);
    net.dffb = mean(net.od, 2);

    function X = rot180(X)
        X = flipdim(flipdim(X, 1), 2);
    end
end
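The expand() helper used above comes from the toolbox's util folder; for a single 2-D map, dividing its output by scale^2 should match a Kronecker upsampling, which makes the 'c'-layer delta rule easy to sanity-check (standalone sketch, kron used here only for illustration):
% Upsample a 2x2 pooling-layer delta back to the 4x4 conv-layer grid (scale = 2).
d_next = [1 2; 3 4];                        % delta of the following 's' layer (one map)
scale  = 2;
up = kron(d_next, ones(scale)) / scale^2;   % each delta spread evenly over its 2x2 block
% up should equal expand(d_next, [scale scale]) / scale^2 for a 2-D map;
% the conv layer's delta is then a .* (1 - a) .* up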
cnnapplygrads.m
This part is easy: the gradients are already computed, so we just apply the gradient-descent update to each parameter in turn.
function net = cnnapplygrads(net, opts)
    % update the convolution kernels and biases, then the output layer, by plain gradient descent
    for l = 2 : numel(net.layers)
        if strcmp(net.layers{l}.type, 'c')
            for j = 1 : numel(net.layers{l}.a)
                for ii = 1 : numel(net.layers{l - 1}.a)
                    net.layers{l}.k{ii}{j} = net.layers{l}.k{ii}{j} - opts.alpha * net.layers{l}.dk{ii}{j};
                end
                net.layers{l}.b{j} = net.layers{l}.b{j} - opts.alpha * net.layers{l}.db{j};
            end
        end
    end
    net.ffW = net.ffW - opts.alpha * net.dffW;
    net.ffb = net.ffb - opts.alpha * net.dffb;
end
cnntest.m
function [er, bad] = cnntest(net, x, y)
    % feedforward
    net = cnnff(net, x);
    [~, h] = max(net.o);
    [~, a] = max(y);
    bad = find(h ~= a);
    er = numel(bad) / size(y, 2);
end
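Putting it all together, a full run of the example looks roughly like this; a sketch following tests\test_example_CNN.m (the mnist_uint8 data file ships with the toolbox; the exact preprocessing in the script may differ slightly):
% End-to-end sketch of the MNIST example (variable names as in test_example_CNN.m).
load mnist_uint8;
train_x = double(reshape(train_x', 28, 28, 60000)) / 255;
test_x  = double(reshape(test_x',  28, 28, 10000)) / 255;
train_y = double(train_y');
test_y  = double(test_y');

cnn = cnnsetup(cnn, train_x, train_y);         % cnn.layers defined as at the top of this post
cnn = cnntrain(cnn, train_x, train_y, opts);   % opts.alpha / batchsize / numepochs as above
[er, bad] = cnntest(cnn, test_x, test_y);      % er = error rate, bad = misclassified indices
fprintf('error rate: %.2f%%\n', er * 100);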
And that's it: after a single cnnff pass, net.o holds the network's predictions.
Summary
just code !
This is a model from 1989! Lately it has also been combined with RBMs, and a CNN got the best result on ImageNet (this one, I believe?):
Alex Krizhevsky. ImageNet Classification with Deep Convolutional Neural Networks. Video and Slides, 2012.
http://www.cs.utoronto.ca/~rsalakhu/papers/dbm.pdf
References:
[Deep learning: 38 (A brief introduction to Stacked CNNs)]
[UFLDL]
[Notes on Convolutional Neural Networks]
[Convolutional Neural Networks (LeNet)] (this one is from the Theano deep learning tutorials)