Computer Vision: Recognizing Handwritten Digits with a CNN (11)

Posted by Kervin_Chan on 2018-06-08

Original URL: https://juejin.im/post/5b178d7af265da6e2d32bf88

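I. Loading the MNIST Data

TensorFlow ships a script that downloads and imports the MNIST dataset automatically. It creates a directory named 'MNIST_data' to store the data. The code in this article uses the TensorFlow 1.x API (tf.placeholder, tf.Session).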

import tensorflow as tf
import numpy as np
import random 
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)
# For an introduction to one_hot, see: https://blog.csdn.net/lanhaier0591/article/details/78702558

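Here, mnist is a lightweight class. It stores the training, validation, and test datasets as NumPy arrays, and it provides a function for fetching minibatches during iteration, which we use later.

Source: http://www.tensorfly.cn/tfdoc/tutorials/mnist_download.html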
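To make the one_hot=True option concrete, here is a minimal pure-Python sketch of one-hot encoding. The helper name one_hot is hypothetical and is not part of TensorFlow or the MNIST loader; it only illustrates the label format that the loader returns for each digit.

```python
def one_hot(label, num_classes=10):
    """Return a list of num_classes floats with 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# The digit 3 becomes a vector whose fourth entry is 1.0.
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Each label therefore has 10 entries, which is why the label placeholder later in the article has a second dimension of 10.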

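II. Inputs and Placeholders

In the TensorFlow tutorial, the placeholder_inputs() function creates two tf.placeholder ops that define the shape of the inputs to the graph, including the batch_size; the actual training examples are fed into the graph later. The code below creates the two placeholders directly and uses None for the batch dimension, so a batch of any size can be fed.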

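imageInput = tf.placeholder(tf.float32,[None,784])
# training images: each row is a flattened 28x28 image
labeInput = tf.placeholder(tf.float32,[None,10])
# training labels: each row is a one-hot vector over the 10 digits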
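The 784 in the image placeholder comes from flattening a 28x28 image row by row. The sketch below shows that flattening in plain Python; flatten_image is a hypothetical helper for illustration, and the all-zero image is made-up data, not an MNIST sample.

```python
def flatten_image(image):
    """Flatten a 2-D list of pixel rows into one list, row by row."""
    return [pixel for row in image for pixel in row]

image = [[0.0] * 28 for _ in range(28)]  # made-up 28x28 image, all zeros
image[2][5] = 1.0                        # mark one pixel to follow its position
flat = flatten_image(image)

print(len(flat))        # 784
print(flat.index(1.0))  # 61, i.e. 2 * 28 + 5
```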

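III. Building a Multilayer Convolutional Network

1. Weight initialization

reshape(tensor, shape, name=None)

Parameters

tensor: the tensor to reshape

shape: the target shape; a single -1 entry is inferred from the total number of elements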

imageInputReshape = tf.reshape(imageInput,[-1,28,28,1])
# reshape from 2-D [N, 784] to 4-D [N, 28, 28, 1]
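The -1 in the reshape call asks for that dimension to be computed from the element count. As a rough sketch of that rule (infer_shape is a hypothetical helper, not a TensorFlow function), assuming a batch of 500 flattened images:

```python
def infer_shape(total_elements, shape):
    """Resolve a single -1 entry in `shape` so the element count matches."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension can be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match element count")
        return list(shape)
    if known == 0 or total_elements % known != 0:
        raise ValueError("element count is not divisible by the known dimensions")
    return [total_elements // known if dim == -1 else dim for dim in shape]

# 500 images of 784 pixels each, reshaped to [batch, height, width, channels]
print(infer_shape(500 * 784, [-1, 28, 28, 1]))  # [500, 28, 28, 1]
```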

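tf.truncated_normal(shape, mean, stddev)

Parameters

shape: the dimensions of the tensor to generate

mean: the mean of the distribution

stddev: the standard deviation of the distribution. Values more than two standard deviations from the mean are dropped and re-drawn.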

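w0 = tf.Variable(tf.truncated_normal([5,5,1,32],stddev = 0.1))
# convolution kernel: 5x5 window, 1 input channel, 32 output channels,
# initialized from a truncated normal distribution with standard deviation 0.1
b0 = tf.Variable(tf.constant(0.1,shape=[32]))
# bias: a 1-D tensor with 32 elements, all set to 0.1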
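The following standard-library sketch illustrates what a truncated normal initializer does: it draws normal samples and re-draws any value that lands more than two standard deviations from the mean. The function below is an illustration written for this article, not TensorFlow's implementation, and the seed value is an arbitrary assumption.

```python
import random

def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n normal samples, re-drawing any sample that falls more than
    two standard deviations from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            samples.append(value)
    return samples

# Same number of values as the 5x5x1x32 kernel w0, with stddev 0.1
weights = truncated_normal(5 * 5 * 1 * 32, stddev=0.1, seed=0)
print(len(weights))                         # 800
print(all(abs(w) <= 0.2 for w in weights))  # True
```

Keeping the initial weights within a small range around zero avoids starting training with a few very large weights.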

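2. Activation function + convolution

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

Parameters

input: the input images to convolve. It must be a 4-D Tensor of shape [batch, in_height, in_width, in_channels], meaning [number of images in a training batch, image height, image width, number of image channels], and its type must be float32 or float64.

filter: the convolution kernel of the CNN. It must be a Tensor of shape [filter_height, filter_width, in_channels, out_channels], meaning [kernel height, kernel width, number of image channels, number of kernels], and its type must match that of input. Note that the third dimension, in_channels, must equal the fourth dimension of input.

strides: the stride along each dimension of the input, a 1-D vector of length 4

padding: a string, either "SAME" or "VALID"; this value selects the padding scheme of the convolution

use_cudnn_on_gpu: a bool that controls whether cuDNN acceleration is used; defaults to true

Output:

The result is a Tensor, the so-called feature map, whose shape is again of the form [batch, height, width, channels].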
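The padding argument determines the height and width of the feature map. The sketch below computes the output size along one spatial dimension for both modes, using the output-size formulas given in the TensorFlow documentation; conv_output_size is a hypothetical helper name.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output size of a convolution along one spatial dimension."""
    if padding == "SAME":
        # Input is zero-padded so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 5x5 kernel with stride 1 on a 28x28 image
print(conv_output_size(28, 5, 1, "SAME"))   # 28
print(conv_output_size(28, 5, 1, "VALID"))  # 24
```

With "SAME" padding and stride 1, as used in this article, the 28x28 input keeps its spatial size after the convolution.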

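tf.nn.max_pool(value, ksize, strides, padding, name=None)

Parameters

value: the input to pool. A pooling layer usually follows a convolutional layer, so the input is usually a feature map, which again has the shape [batch, in_height, in_width, in_channels].

ksize: the size of the pooling window, a 4-element vector, usually [1, height, width, 1]. We do not want to pool over batch or channels, so those two dimensions are set to 1. PS: the four values of strides in tf.nn.conv2d follow the same layout, [1, stride_height, stride_width, 1].

strides: the step of the window along each dimension, also a 4-element vector.

padding: the padding mode; as with convolution, only "SAME" and "VALID" are allowed.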

layer1 = tf.nn.relu(tf.nn.conv2d(imageInputReshape,w0,strides=[1,1,1,1],padding='SAME')+b0)
# layer1: convolution followed by the ReLU activation
# imageInputReshape: [M, 28, 28, 1]   w0: [5, 5, 1, 32]
# layer1: [M, 28, 28, 32]

3. Pooling

layer1_pool = tf.nn.max_pool(layer1,ksize=[1,4,4,1],strides=[1,4,4,1],padding='SAME')
# max pooling greatly reduces the amount of data: [M, 28, 28, 32] => [M, 7, 7, 32]
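To show what the pooling step computes, here is a small pure-Python sketch of non-overlapping max pooling on a single 2-D channel. It assumes the window size equals the stride and that the image dimensions are divisible by it, which holds for the 4x4 window on a 28x28 map used above; max_pool_2d is a hypothetical helper and the 4x4 input is made-up data.

```python
def max_pool_2d(image, size):
    """Non-overlapping max pooling with a size x size window and stride = size.
    Assumes both image dimensions are divisible by size."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols, size)
        ]
        for r in range(0, rows, size)
    ]

image = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 8, 1, 0],
    [7, 6, 3, 2],
]
print(max_pool_2d(image, 2))  # [[4, 8], [9, 3]]
```

Applied to a 28x28 map with a window of 4, the same function returns a 7x7 map, matching the shapes in the comment above.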

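4. Activation function + multiply-add (fully connected layer)

# layer 2: fully connected layer (matrix multiply-add followed by the ReLU activation)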
w1 = tf.Variable(tf.truncated_normal([7*7*32,1024],stddev=0.1))
b1 = tf.Variable(tf.constant(0.1,shape=[1024]))
h_reshape = tf.reshape(layer1_pool,[-1,7*7*32])
h1 = tf.nn.relu(tf.matmul(h_reshape,w1)+b1)
# [N, 7*7*32] x [7*7*32, 1024] = [N, 1024]
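The fully connected layer is a matrix multiply-add followed by ReLU. The toy sketch below performs the same computation for a single input vector in plain Python; dense_relu and the small weights are illustrative assumptions, far smaller than the 1568x1024 matrix in the article.

```python
def dense_relu(x, weights, bias):
    """Compute relu(x @ weights + bias) for one input vector x.
    weights holds one row per input feature and one column per output unit."""
    outputs = []
    for j in range(len(bias)):
        total = bias[j]
        for i in range(len(x)):
            total += x[i] * weights[i][j]
        outputs.append(max(0.0, total))  # ReLU clips negative sums to zero
    return outputs

x = [1.0, 2.0]
weights = [[1.0, -1.0, 0.5],
           [0.5, -1.0, 0.0]]
bias = [0.0, 0.5, -1.0]
print(dense_relu(x, weights, bias))  # [2.0, 0.0, 0.0]
```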

5. Output layer

Finally, we add a softmax layer that turns the 1024 features into probabilities over the 10 digit classes.

w2 = tf.Variable(tf.truncated_normal([1024,10],stddev=0.1))
b2 = tf.Variable(tf.constant(0.1,shape=[10]))
pred = tf.nn.softmax(tf.matmul(h1,w2)+b2)
# [N, 1024] x [1024, 10] = [N, 10]
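Softmax turns a vector of scores into probabilities that sum to 1. A minimal sketch in plain Python follows; subtracting the maximum before exponentiating is a common stability trick added here as an assumption, and the input scores are made up.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # approximately 1.0
```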

6. Loss function

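# Cross-entropy loss: -sum(label * log(pred)) over the classes of each example,
# averaged over the batch. This replaces a Python double loop over
# 500 examples x 10 classes, so the batch size is no longer hard-coded.
loss0 = labeInput*tf.log(pred)
loss = -tf.reduce_mean(tf.reduce_sum(loss0,axis=1))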
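The loss in this section is the cross-entropy between the one-hot labels and the softmax output, averaged over the batch. The sketch below computes the same quantity in plain Python on two made-up examples; cross_entropy is a hypothetical helper name.

```python
import math

def cross_entropy(labels, preds):
    """Mean over examples of -sum(label * log(pred)) over classes."""
    total = 0.0
    for label_row, pred_row in zip(labels, preds):
        for y, p in zip(label_row, pred_row):
            total -= y * math.log(p)
    return total / len(labels)

labels = [[1.0, 0.0], [0.0, 1.0]]    # one-hot labels (made-up)
preds = [[0.5, 0.5], [0.25, 0.75]]   # predicted probabilities (made-up)
loss = cross_entropy(labels, preds)
print(loss)
```

Because the labels are one-hot, only the log-probability of the correct class contributes for each example, so the loss falls as the correct class receives more probability.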

7. Training and evaluating the model

train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
# minimize the loss with gradient descent (learning rate 0.01)
# Build the accuracy ops once, outside the loop, so the graph does not grow every step
correct = tf.equal(tf.argmax(pred,1),tf.argmax(labeInput,1))
acc_float = tf.reduce_mean(tf.cast(correct,tf.float32))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        # one training step on a minibatch of 500 examples
        images,labels = mnist.train.next_batch(500)
        sess.run(train,feed_dict={imageInput:images,labeInput:labels})
        # accuracy on the full test set after this step
        acc_result = sess.run(acc_float,feed_dict={imageInput:mnist.test.images,labeInput:mnist.test.labels})
        print(acc_result)
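The accuracy printed above is the fraction of test images whose highest-probability class matches the one-hot label. The same computation in plain Python, on three made-up predictions (argmax and accuracy are hypothetical helper names):

```python
def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(preds, labels):
    """Fraction of rows where the predicted class matches the one-hot label."""
    correct = sum(1 for p, y in zip(preds, labels) if argmax(p) == argmax(y))
    return correct / len(labels)

preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
labels = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
print(accuracy(preds, labels))  # 2 of the 3 predictions are correct
```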