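Deep Learning (Andrew Ng), Course 4, Week 4, Programming Assignment 2: Neural Style Transfer
Course notes
The algorithm treats an image as content plus style; once both are fixed, the image is determined. The main idea behind generating an image is therefore to update it iteratively under two cost functions, a content cost and a style cost.
Style transfer breaks down into three steps:
- Build the content cost function J_{content}(C,G)
- Build the style cost function J_{style}(S,G)
- Combine them with weights into the total cost function J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G).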
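A CNN is a neural network that processes an input image. It typically contains convolutional, pooling and fully connected layers, each of which operates on pixel-level values. The image enters the network as a matrix, the output of every layer is still a matrix, and mapping that matrix back to an image shows what the layer has done to the picture.
In such a network the first few (shallow) layers tend to detect low-level features such as edges and simple structures, while the later (deep) layers detect high-level features such as specific object classes.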
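Content cost function
We want the "generated" image G to have content similar to the input image C. Which layer's output should represent the content of an image? The assignment uses a layer in the middle of the network, neither too shallow nor too deep, which gives good results. (After completing the exercise, feel free to come back and try different layers to see how the result changes.)
Using the pretrained VGG network, set image C as the input and forward-propagate it to obtain the hidden layer activations a^{(C)}; doing the same with G gives a^{(G)}. The content cost, consistent with compute_content_cost in the program below, is
J_{content}(C,G) = \frac{1}{4 n_H n_W n_C} \sum_{\text{all entries}} (a^{(C)} - a^{(G)})^2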
What you should remember:
- The content cost takes a hidden layer activation of the neural network, and measures how different a^{(C)} and a^{(G)} are.
- When we minimize the content cost later, this will help make sure G has similar content as C.
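As a quick sanity check of the content cost, here is a minimal NumPy sketch that mirrors the formula used by compute_content_cost in the program further down. The function name content_cost_np and the tiny random activations are made up for illustration; in the assignment the activations come from a VGG hidden layer.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost for two activation arrays of shape (1, n_H, n_W, n_C).

    Hypothetical NumPy counterpart of compute_content_cost, for illustration only.
    """
    _, n_H, n_W, n_C = a_G.shape
    # Sum of squared differences over every entry, with the 1/(4*n_H*n_W*n_C) normalizer
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

rng = np.random.default_rng(0)
a_C = rng.normal(size=(1, 4, 4, 3))  # stand-in for the content activations
a_G = rng.normal(size=(1, 4, 4, 3))  # stand-in for the generated-image activations

print(content_cost_np(a_C, a_C))  # identical activations give a cost of exactly 0
print(content_cost_np(a_C, a_G))  # different activations give a positive cost
```

The cost is zero only when the two activation arrays agree everywhere, which is why driving it down pushes G toward the content of C.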
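Style cost function
The content representation above uses the output of the chosen layer directly. The style representation, by contrast, is a "Gram matrix" (also called a correlation matrix) computed from that output, as shown in the figure below:
To compute the Gram matrix, first unroll the activation volume into a 2-D matrix, then multiply that matrix by its own transpose.
In linear algebra, the Gram matrix describes the correlation between the vectors of a set. For vectors v_1, ..., v_n its entries are
G_{ij} = v_i^T v_j
The diagonal entries are the inner product of each vector with itself; the off-diagonal entries are inner products between pairs of different vectors. Their magnitude reflects how correlated the two vectors are: the larger the value, the stronger the correlation.
In a neural network, the vectors of the unrolled matrix are the outputs of the different filters of one layer. The diagonal entry G_{ii} therefore measures how strongly filter i is activated, and the off-diagonal entry G_{ij} measures how strongly filters i and j are activated together.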
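The Gram matrix computation can be tried on a small hand-made example. The sketch below is NumPy rather than TensorFlow, and gram_matrix_np and the 3x3 matrix A are hypothetical names and values chosen for illustration: rows play the role of unrolled filter outputs, and row 2 is deliberately twice row 1 while row 3 is orthogonal to both.

```python
import numpy as np

def gram_matrix_np(A):
    """Gram matrix of A, where A has shape (n_C, n_H*n_W): one row per filter."""
    return A @ A.T

A = np.array([[1.0, 2.0, 0.0],   # filter 1
              [2.0, 4.0, 0.0],   # filter 2 = 2 * filter 1, so strongly correlated
              [0.0, 0.0, 3.0]])  # filter 3 is orthogonal to the other two

G = gram_matrix_np(A)
# G == [[ 5, 10, 0],
#       [10, 20, 0],
#       [ 0,  0, 9]]
print(G)
```

The diagonal (5, 20, 9) is each row's squared length, the entry 10 reflects the correlated pair, and the zeros show that filter 3 never fires together with the others.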
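With the Gram matrix in hand, the style cost of a single layer l is defined as follows (this matches compute_layer_style_cost in the program below), where G^{(S)} and G^{(G)} are the Gram matrices of the style image and the generated image:
J_{style}^{[l]}(S,G) = \frac{1}{4 n_C^2 (n_H n_W)^2} \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} (G^{(S)}_{ij} - G^{(G)}_{ij})^2
The overall style cost is the weighted sum over the selected layers, with one coefficient \lambda^{[l]} per layer:
J_{style}(S,G) = \sum_l \lambda^{[l]} J_{style}^{[l]}(S,G)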
What you should remember:
- The style of an image can be represented using the Gram matrix of a hidden layer’s activations. However, we get even better results combining this representation from multiple different layers. This is in contrast to the content representation, where usually using just a single hidden layer is sufficient.
- Minimizing the style cost will cause the image G to follow the style of the image S.
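To see why a Gram-based cost captures style rather than layout, here is a rough NumPy sketch of the per-layer style cost, using the same normalizer as compute_layer_style_cost further down. The name layer_style_cost_np and the random activations are assumptions made for illustration. The last check shuffles the spatial positions of the activations: the Gram matrix sums over positions, so the cost stays at (numerically) zero.

```python
import numpy as np

def layer_style_cost_np(a_S, a_G):
    """Per-layer style cost for activations of shape (1, n_H, n_W, n_C).

    Hypothetical NumPy counterpart of compute_layer_style_cost.
    """
    _, n_H, n_W, n_C = a_G.shape
    # Unroll to (n_C, n_H*n_W): one row per filter
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    GS = S @ S.T  # Gram matrix of the style activations
    GG = G @ G.T  # Gram matrix of the generated activations
    return np.sum((GS - GG) ** 2) / (4.0 * n_C**2 * (n_H * n_W) ** 2)

rng = np.random.default_rng(0)
a_S = rng.normal(size=(1, 4, 4, 3))
a_G = rng.normal(size=(1, 4, 4, 3))

# Same activations, but with the 16 spatial positions put in a random order
perm = rng.permutation(16)
a_shuffled = a_S.reshape(16, 3)[perm].reshape(1, 4, 4, 3)

print(layer_style_cost_np(a_S, a_G))         # positive: different filter statistics
print(layer_style_cost_np(a_S, a_shuffled))  # ~0: same statistics, different layout
```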
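Total cost function
Finally, the content cost and the style cost are combined with weights to give the total cost:
J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)
With the total cost in hand, the parameters updated at each iteration are the pixels of the input noise image. It is as if the network looks at two pictures, extracts their features (the outputs of layer [l]), finds where they differ (the total cost), and corrects the image pixel by pixel until the desired result emerges. In the program below the update is carried out by an Adam optimizer that minimizes J; exactly how the pixels get updated is something I have yet to look into.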
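One way to build intuition for "the pixels are the parameters" is a toy version with no network at all. The sketch below is an assumption-laden stand-in, not the assignment's method: the "cost" is just a squared pixel difference to a target image C, the gradient is written out by hand, and plain gradient descent replaces Adam. In the real program TensorFlow derives the gradient of J with respect to the input image automatically. All names here (loss_and_grad, learning_rate, the 8x8 images) are hypothetical.

```python
import numpy as np

def loss_and_grad(G, C):
    """Toy cost sum((G - C)^2) / (4 * n) and its gradient with respect to the pixels of G."""
    n = C.size
    diff = G - C
    loss = np.sum(diff ** 2) / (4 * n)
    grad = diff / (2 * n)  # derivative of the loss above with respect to each pixel
    return loss, grad

rng = np.random.default_rng(0)
C = rng.uniform(0.0, 1.0, size=(8, 8, 3))      # stand-in "target" image
G = rng.uniform(-1.0, 1.0, size=C.shape)       # start from a noise image

learning_rate = 50.0  # large only because the toy gradient is scaled by 1/(2n)
losses = []
for step in range(100):
    loss, grad = loss_and_grad(G, C)
    losses.append(loss)
    G -= learning_rate * grad  # the image itself is what gets updated

print(losses[0], losses[-1])  # the cost shrinks as G moves toward C
```

The network weights never appear as trainable parameters here, and the same holds in the assignment: VGG is fixed and only the image changes.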
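PyCharm version of the program
Training with TensorFlow
# NOTE: this script targets the TensorFlow 1.x session API (tf.InteractiveSession, tf.train.AdamOptimizer).
# scipy.misc.imread, used below, was removed in SciPy 1.2, so an older SciPy is needed to run it unchanged.
import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow as tf
import datetime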

# GRADED FUNCTION: compute_content_cost
def compute_content_cost(a_C, a_G):
    """
    Computes the content cost
    Arguments:
    a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image C
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image G
    Returns:
    J_content -- scalar that you compute using equation 1 above.
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    # a_C and a_G have the same shape. In this script a_G is a TensorFlow tensor, while a_C is the
    # NumPy array returned by sess.run, which has no get_shape(), so the shape is read from a_G.
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    # Reshape a_C and a_G (≈2 lines)
    a_C_unrolled = tf.reshape(a_C, [n_H * n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [n_H * n_W, n_C])
    # compute the cost with tensorflow (≈1 line)
    J_content = tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled))) / (4 * n_H * n_W * n_C)
    ### END CODE HERE ###
    return J_content

# GRADED FUNCTION: gram_matrix
def gram_matrix(A):
    """
    Argument:
    A -- matrix of shape (n_C, n_H*n_W)
    Returns:
    GA -- Gram matrix of A, of shape (n_C, n_C)
    """
    ### START CODE HERE ### (≈1 line)
    # Matrix product A * A^T; the transpose_a / transpose_b flags say whether each operand is transposed first
    GA = tf.matmul(A, A, transpose_a=False, transpose_b=True)
    ### END CODE HERE ###
    return GA

# GRADED FUNCTION: compute_layer_style_cost
def compute_layer_style_cost(a_S, a_G):
    """
    Arguments:
    a_S -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image S
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image G
    Returns:
    J_style_layer -- tensor representing a scalar value, style cost defined above by equation (2)
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    # Reshape the images to have them of shape (n_H*n_W, n_C) (≈2 lines)
    a_S = tf.reshape(a_S, [n_W * n_H, n_C])
    a_G = tf.reshape(a_G, [n_W * n_H, n_C])
    # Computing gram_matrices for both images S and G (≈2 lines)
    # gram_matrix expects shape (n_C, n_H*n_W), hence the transpose
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))
    # GS = gram_matrix(a_S)
    # GG = gram_matrix(a_G)
    # Computing the loss (≈1 line)
    # The normalizer is computed with Python integers: tf.square on the int32 value n_C*n_H*n_W
    # can overflow for large layers (e.g. 64*300*400 squared exceeds the int32 range).
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4.0 * n_C**2 * (n_H * n_W)**2)
    ### END CODE HERE ###
    return J_style_layer

def compute_style_cost(model, STYLE_LAYERS):
    """
    Computes the overall style cost from several chosen layers
    Arguments:
    model -- our tensorflow model
    STYLE_LAYERS -- A python list containing:
    - the names of the layers we would like to extract style from
    - a coefficient for each of them
    Returns:
    J_style -- tensor representing a scalar value, style cost defined above by equation (2)
    """
    # initialize the overall style cost
    J_style = 0
    for layer_name, coeff in STYLE_LAYERS:
        # Select the output tensor of the currently selected layer
        out = model[layer_name]
        # Set a_S to be the hidden layer activation from the layer we have selected, by running the session on out
        a_S = sess.run(out)
        # Set a_G to be the hidden layer activation from same layer. Here, a_G references model[layer_name]
        # and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
        # when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
        a_G = out
        # Compute style_cost for the current layer
        J_style_layer = compute_layer_style_cost(a_S, a_G)
        # Add coeff * J_style_layer of this layer to overall style cost
        J_style += coeff * J_style_layer
    return J_style

# GRADED FUNCTION: total_cost
def total_cost(J_content, J_style, alpha=10, beta=40):
    """
    Computes the total cost function
    Arguments:
    J_content -- content cost coded above
    J_style -- style cost coded above
    alpha -- hyperparameter weighting the importance of the content cost
    beta -- hyperparameter weighting the importance of the style cost
    Returns:
    J -- total cost as defined by the formula above.
    """
    ### START CODE HERE ### (≈1 line)
    J = alpha * J_content + beta * J_style
    ### END CODE HERE ###
    return J

def model_nn(sess, input_image, num_iterations=200):
    # Uses model, train_step, J, J_content and J_style defined at module level in the main block below.
    # Initialize global variables (you need to run the session on the initializer)
    ### START CODE HERE ### (1 line)
    sess.run(tf.global_variables_initializer())
    ### END CODE HERE ###
    # Run the noisy input image (initial generated image) through the model. Use assign().
    ### START CODE HERE ### (1 line)
    sess.run(model['input'].assign(input_image))
    ### END CODE HERE ###
    for i in range(num_iterations):
        # Run the session on the train_step to minimize the total cost
        ### START CODE HERE ### (1 line)
        sess.run(train_step)
        ### END CODE HERE ###
        # Compute the generated image by running the session on the current model['input']
        ### START CODE HERE ### (1 line)
        generated_image = sess.run(model['input'])
        ### END CODE HERE ###
        # Print every 20 iteration.
        if i % 20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))
            # save current generated image in the "out1/3/" directory
            save_image("out1/3/" + str(i) + ".png", generated_image)
    # save last generated image
    save_image('out1/3/generated_image.jpg', generated_image)
    return generated_image

if __name__ == '__main__':
    starttime = datetime.datetime.now()
    ###############################################
    # Reset the graph
    tf.reset_default_graph()
    # Start interactive session
    sess = tf.InteractiveSession()
    content_image = scipy.misc.imread("input/y.jpg")
    content_image = reshape_and_normalize_image(content_image)
    style_image = scipy.misc.imread("images/sky.jpg")
    style_image = reshape_and_normalize_image(style_image)
    generated_image = generate_noise_image(content_image)
    plt.imshow(generated_image[0])
    plt.show()
    model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
    # (layer name, coefficient) pairs: the layers whose Gram matrices define the style cost, and their weights
    STYLE_LAYERS = [
        ('conv1_1', 0.2),
        ('conv2_1', 0.2),
        ('conv3_1', 0.2),
        ('conv4_1', 0.2),
        ('conv5_1', 0.2)]
    # Assign the content image to be the input of the VGG model.
    sess.run(model['input'].assign(content_image))
    # Select the output tensor of layer conv4_2
    out = model['conv4_2']
    # Set a_C to be the hidden layer activation from the layer we have selected
    a_C = sess.run(out)
    # Set a_G to be the hidden layer activation from same layer. Here, a_G references model['conv4_2']
    # and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
    # when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
    a_G = out
    # Compute the content cost
    J_content = compute_content_cost(a_C, a_G)
    # Assign the input of the model to be the "style" image
    sess.run(model['input'].assign(style_image))
    # Compute the style cost
    J_style = compute_style_cost(model, STYLE_LAYERS)
    ### START CODE HERE ### (1 line)
    J = total_cost(J_content=J_content, J_style=J_style)
    ### END CODE HERE ###
    # define optimizer (1 line)
    optimizer = tf.train.AdamOptimizer(2.0)
    # define train_step (1 line)
    train_step = optimizer.minimize(J)
    model_nn(sess, generated_image)
    #################################################
    endtime = datetime.datetime.now()
    print("the running time :" + str((endtime - starttime).seconds))
    print("END!")
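Results
The initial noise image, 400*300. Through learning, the network reshapes this image into what we want, which is almost frightening:
Content image (400*300):
Style image (400*300):
Generated image (400*300), after 200 iterations, by which point the result has stabilized: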