TensorFlow Study Notes (8): A Recurrent Neural Network (RNN) on MNIST Data

Posted by 丹追兵 on 2017-02-14

Original URL: https://flycode.co/archives/99117

Preface

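The input data for this article is MNIST, short for Modified National Institute of Standards and Technology. It is a dataset of scanned handwritten digits, each paired with a label, collected by that institute and then modified so that machine learning algorithms can read it easily. The dataset is available from the website of the renowned Professor Yann LeCun.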

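Other articles in this series have already followed the official TensorFlow tutorials to model the MNIST dataset with softmax regression and with a CNN. For completeness, this article applies an RNN model to the MNIST data; the specific RNN used is an LSTM.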
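
The code in this article feeds each 28×28 image to the network as a sequence of 28 time steps, one 28-pixel row per step (`n_steps = 28`, `n_input = 28`). As a rough NumPy-only illustration of that reshaping, assuming a random array stands in for a real MNIST batch and a small batch size chosen only for the example:

```python
import numpy as np

n_steps, n_input = 28, 28   # 28 rows per image, 28 pixels per row
batch = 4                   # illustrative batch size (an assumption, not from the article)

# Stand-in for a batch of flattened MNIST images (random values, not real digits).
flat = np.random.rand(batch, n_steps * n_input).astype(np.float32)

# (batch, 784) -> (batch, n_steps, n_input): each image becomes 28 rows.
images = flat.reshape((batch, n_steps, n_input))

# (batch, n_steps, n_input) -> list of n_steps arrays of shape (batch, n_input),
# one array per time step, which is the layout an unrolled RNN loop consumes.
steps = [images[:, t, :] for t in range(n_steps)]

print(len(steps), steps[0].shape)
```

Each element of `steps` holds the same row index from every image in the batch, so the network reads all images top to bottom in lockstep.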

For the theory behind RNNs and LSTMs, see this article.
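
As a rough companion to that theory, a single LSTM step can be sketched in plain NumPy. The gate equations and the parameter names (`ix`, `im`, `ib`, and so on) mirror the hand-written cell in the code section of this article; the helper names `lstm_step` and `init_params`, the small hidden size, and the random inputs are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step for a batch.

    x_t: (batch, n_input); h_prev and c_prev: (batch, n_hidden).
    p: dict of weights and biases keyed like the TensorFlow variables.
    """
    input_gate = sigmoid(np.dot(x_t, p["ix"]) + np.dot(h_prev, p["im"]) + p["ib"])
    forget_gate = sigmoid(np.dot(x_t, p["fx"]) + np.dot(h_prev, p["fm"]) + p["fb"])
    update = np.tanh(np.dot(x_t, p["cx"]) + np.dot(h_prev, p["cm"]) + p["cb"])
    # New cell state: keep part of the old state, add part of the candidate update.
    c = forget_gate * c_prev + input_gate * update
    output_gate = sigmoid(np.dot(x_t, p["ox"]) + np.dot(h_prev, p["om"]) + p["ob"])
    h = output_gate * np.tanh(c)
    return h, c

def init_params(n_input, n_hidden, rng):
    """Small random weights and zero biases for the four gate groups (illustrative)."""
    p = {}
    for gate in "ifco":
        p[gate + "x"] = rng.normal(0.0, 0.1, size=(n_input, n_hidden))
        p[gate + "m"] = rng.normal(0.0, 0.1, size=(n_hidden, n_hidden))
        p[gate + "b"] = np.zeros((1, n_hidden))
    return p

rng = np.random.RandomState(0)
n_input, n_hidden, batch = 28, 8, 2
params = init_params(n_input, n_hidden, rng)
h = np.zeros((batch, n_hidden))
c = np.zeros((batch, n_hidden))
sequence = rng.rand(28, batch, n_input)  # 28 time steps of random stand-in input
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print(h.shape)
```

The final `h` plays the role of `outputs[-1]` in the TensorFlow code: it is the only per-sequence summary that the classifier layer sees.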

Code

# coding: utf-8
# @author: 陳水平
# @date：2017-02-14
# 

# In[1]:

import tensorflow as tf
import numpy as np


# In[2]:

sess = tf.InteractiveSession()


# In[3]:

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('mnist/', one_hot=True)


# In[4]:

learning_rate = 0.001
batch_size = 128

n_input = 28
n_steps = 28
n_hidden = 128
n_classes = 10

x = tf.placeholder(tf.float32, [None, n_steps, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])


# In[5]:

def RNN(x, weight, biases):
    # Variant built on TensorFlow's BasicLSTMCell. The definition in In[6]
    # reuses the name RNN, so that later definition is the one called in In[7].
    # x shape: (batch_size, n_steps, n_input)
    # desired shape: list of n_steps with element shape (batch_size, n_input)
    x = tf.unstack(x, n_steps, 1)
    outputs = list()
    lstm = tf.nn.rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Zero initial (cell state, hidden state), one row per example in the batch
    state = lstm.zero_state(batch_size, tf.float32)
    with tf.variable_scope("myrnn2") as scope:
        for i in range(n_steps):
            if i > 0:
                # Share the cell's weights across all time steps
                scope.reuse_variables()
            output, state = lstm(x[i], state)
            outputs.append(output)
    # Classify from the output of the last time step
    final = tf.matmul(outputs[-1], weight) + biases
    return final


# In[6]:

def RNN(x, n_steps, n_input, n_hidden, n_classes):
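    # Parameters:
    # (tf.truncated_normal's positional arguments are mean and stddev, so the
    # weights below are drawn with mean -0.1 and standard deviation 0.1.)
    # Input gate: input, previous output, and bias
    ix = tf.Variable(tf.truncated_normal([n_input, n_hidden], -0.1, 0.1))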
    im = tf.Variable(tf.truncated_normal([n_hidden, n_hidden], -0.1, 0.1))
    ib = tf.Variable(tf.zeros([1, n_hidden]))
    # Forget gate: input, previous output, and bias
    fx = tf.Variable(tf.truncated_normal([n_input, n_hidden], -0.1, 0.1))
    fm = tf.Variable(tf.truncated_normal([n_hidden, n_hidden], -0.1, 0.1))
    fb = tf.Variable(tf.zeros([1, n_hidden]))
    # Memory cell: input, state, and bias
    cx = tf.Variable(tf.truncated_normal([n_input, n_hidden], -0.1, 0.1))
    cm = tf.Variable(tf.truncated_normal([n_hidden, n_hidden], -0.1, 0.1))
    cb = tf.Variable(tf.zeros([1, n_hidden]))
    # Output gate: input, previous output, and bias
    ox = tf.Variable(tf.truncated_normal([n_input, n_hidden], -0.1, 0.1))
    om = tf.Variable(tf.truncated_normal([n_hidden, n_hidden], -0.1, 0.1))
    ob = tf.Variable(tf.zeros([1, n_hidden]))
    # Classifier weights and biases
    w = tf.Variable(tf.truncated_normal([n_hidden, n_classes]))
    b = tf.Variable(tf.zeros([n_classes]))

    # Definition of the cell computation
    def lstm_cell(i, o, state):
        input_gate = tf.sigmoid(tf.matmul(i, ix) + tf.matmul(o, im) + ib)
        forget_gate = tf.sigmoid(tf.matmul(i, fx) + tf.matmul(o, fm) + fb)
        update = tf.tanh(tf.matmul(i, cx) + tf.matmul(o, cm) + cb)
        state = forget_gate * state + input_gate * update
        output_gate = tf.sigmoid(tf.matmul(i, ox) +  tf.matmul(o, om) + ob)
        return output_gate * tf.tanh(state), state
    
    # Unrolled LSTM loop
    outputs = list()
    # Initial cell state and output. Both are created with batch_size rows,
    # so every batch fed to the graph is expected to have batch_size examples.
    state = tf.Variable(tf.zeros([batch_size, n_hidden]))
    output = tf.Variable(tf.zeros([batch_size, n_hidden]))
    
    # x shape: (batch_size, n_steps, n_input)
    # desired shape: list of n_steps with element shape (batch_size, n_input)
    x = tf.unstack(x, n_steps, 1)
    for i in x:
        output, state = lstm_cell(i, output, state)
        outputs.append(output)
    # Classify from the output of the last time step
    logits = tf.matmul(outputs[-1], w) + b
    return logits


# In[7]:

pred = RNN(x, n_steps, n_input, n_hidden, n_classes)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()


# In[8]:

# Launch the graph
sess.run(init)
for step in range(20000):
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    # Each flattened 784-pixel image becomes n_steps rows of n_input pixels
    batch_x = batch_x.reshape((batch_size, n_steps, n_input))
    sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})

    if step % 50 == 0:
        # Report loss and accuracy on the current training minibatch
        acc, loss = sess.run([accuracy, cost], feed_dict={x: batch_x, y: batch_y})
        print("Iter {}, Minibatch Loss= {:.6f}, Training Accuracy= {:.5f}".format(step, loss, acc))
print("Optimization Finished!")


# In[9]:

# Calculate accuracy for 128 mnist test images
test_len = batch_size
test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
test_label = mnist.test.labels[:test_len]
print("Testing Accuracy: {}".format(sess.run(accuracy, feed_dict={x: test_data, y: test_label})))
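
The test cell above scores only the first 128 test images, apparently because the graph creates its initial state with `batch_size` rows and so expects batches of exactly that size. One possible way to cover more of the 10,000 MNIST test images is to average the accuracy over consecutive full batches. The sketch below is an assumption, not part of the original code: `evaluate_in_batches` and `eval_fn` are hypothetical names, and a fake scorer stands in for the TensorFlow session so the example runs on its own.

```python
import numpy as np

def evaluate_in_batches(eval_fn, images, labels, batch_size):
    """Average eval_fn over consecutive full batches; a trailing partial batch is skipped."""
    n_full = len(images) // batch_size
    if n_full == 0:
        raise ValueError("need at least batch_size examples")
    scores = []
    for k in range(n_full):
        start = k * batch_size
        end = start + batch_size
        scores.append(eval_fn(images[start:end], labels[start:end]))
    return float(np.mean(scores))

# Stand-in data and scorer so the sketch runs without TensorFlow.
rng = np.random.RandomState(0)
fake_images = rng.rand(300, 28, 28)
fake_labels = np.eye(10)[rng.randint(0, 10, size=300)]

def fake_eval(batch_x, batch_y):
    # Pretend model that always predicts class 0:
    # returns the fraction of labels in the batch that are class 0.
    return np.mean(np.argmax(batch_y, axis=1) == 0)

result = evaluate_in_batches(fake_eval, fake_images, fake_labels, batch_size=128)
print(result)
```

With the session from this article, `eval_fn` could be something like `lambda bx, by: sess.run(accuracy, feed_dict={x: bx, y: by})`, applied to `mnist.test.images` reshaped to `(-1, n_steps, n_input)`. With a batch size of 128, this covers 78 full batches (9,984 images) and skips the last 16.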

The output is as follows:

Iter 0, Minibatch Loss= 2.540429, Training Accuracy= 0.07812
Iter 50, Minibatch Loss= 2.423611, Training Accuracy= 0.06250
Iter 100, Minibatch Loss= 2.318830, Training Accuracy= 0.13281
Iter 150, Minibatch Loss= 2.276640, Training Accuracy= 0.13281
Iter 200, Minibatch Loss= 2.276727, Training Accuracy= 0.12500
Iter 250, Minibatch Loss= 2.267064, Training Accuracy= 0.16406
Iter 300, Minibatch Loss= 2.234139, Training Accuracy= 0.19531
Iter 350, Minibatch Loss= 2.295060, Training Accuracy= 0.12500
Iter 400, Minibatch Loss= 2.261856, Training Accuracy= 0.16406
Iter 450, Minibatch Loss= 2.220284, Training Accuracy= 0.17969
Iter 500, Minibatch Loss= 2.276015, Training Accuracy= 0.13281
Iter 550, Minibatch Loss= 2.220499, Training Accuracy= 0.14062
Iter 600, Minibatch Loss= 2.219574, Training Accuracy= 0.11719
Iter 650, Minibatch Loss= 2.189177, Training Accuracy= 0.25781
Iter 700, Minibatch Loss= 2.195167, Training Accuracy= 0.19531
Iter 750, Minibatch Loss= 2.226459, Training Accuracy= 0.18750
Iter 800, Minibatch Loss= 2.148620, Training Accuracy= 0.23438
Iter 850, Minibatch Loss= 2.122925, Training Accuracy= 0.21875
Iter 900, Minibatch Loss= 2.065122, Training Accuracy= 0.24219
...
Iter 19350, Minibatch Loss= 0.001304, Training Accuracy= 1.00000
Iter 19400, Minibatch Loss= 0.000144, Training Accuracy= 1.00000
Iter 19450, Minibatch Loss= 0.000907, Training Accuracy= 1.00000
Iter 19500, Minibatch Loss= 0.002555, Training Accuracy= 1.00000
Iter 19550, Minibatch Loss= 0.002018, Training Accuracy= 1.00000
Iter 19600, Minibatch Loss= 0.000853, Training Accuracy= 1.00000
Iter 19650, Minibatch Loss= 0.001035, Training Accuracy= 1.00000
Iter 19700, Minibatch Loss= 0.007034, Training Accuracy= 0.99219
Iter 19750, Minibatch Loss= 0.000608, Training Accuracy= 1.00000
Iter 19800, Minibatch Loss= 0.002913, Training Accuracy= 1.00000
Iter 19850, Minibatch Loss= 0.003484, Training Accuracy= 1.00000
Iter 19900, Minibatch Loss= 0.005693, Training Accuracy= 1.00000
Iter 19950, Minibatch Loss= 0.001904, Training Accuracy= 1.00000
Optimization Finished!

Testing Accuracy: 0.992188
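
Two features of this log can be checked with a little arithmetic. The early loss values sit near 2.3, close to ln(10) ≈ 2.3026, which is the softmax cross-entropy of a uniform guess over 10 classes. And since each accuracy is measured on 128 examples, it moves in steps of 1/128, so 0.99219 corresponds to 127 of 128 correct. The NumPy sketch below reproduces both numbers; the function names and the tiny label set are assumptions for illustration only.

```python
import numpy as np

def softmax_cross_entropy(logits, onehot_labels):
    """Mean softmax cross-entropy over a batch, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(onehot_labels * log_probs).sum(axis=1).mean())

def accuracy(logits, onehot_labels):
    """Fraction of rows whose largest logit matches the one-hot label."""
    return float(np.mean(np.argmax(logits, axis=1) == np.argmax(onehot_labels, axis=1)))

n_classes = 10
labels = np.eye(n_classes)[[3, 1, 4, 1]]      # four illustrative one-hot labels
uniform_logits = np.zeros((4, n_classes))     # equal score for every class

print(softmax_cross_entropy(uniform_logits, labels))  # ln(10), about 2.3026
print(127 / 128.0)                                    # 0.9921875
```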
