(2) Introduction to Neural Networks: Logistic Regression (Classification)

Posted by chen_h on 2019-02-16

Original URL: https://flycode.co/archives/80555

Neural networks

Author: chen_h
WeChat & QQ: 862251340
WeChat official account: coderpai
My blog: click here

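This tutorial is a translation of the neural network tutorial written by Peter Roelants, who has authorized the translation. The original text is here.

The tutorial is an introduction to neural networks in five parts. The full content is available at the following links.

Logistic regression (classification)

This part of the tutorial covers:

The logistic classification model

In the previous tutorial we presented a very simple model with a single input and a single output. In this tutorial we build a binary classification model whose input consists of two variables. In statistics this model is known as logistic regression. The network structure can be described as follows: two inputs x1 and x2 are weighted by w1 and w2 and fed into a single logistic output unit y.

We start by importing the packages the tutorial uses.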

import numpy as np 
import matplotlib.pyplot as plt 
from matplotlib.colors import colorConverter, ListedColormap
from matplotlib import cm

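Defining the class distributions

In this tutorial the target class t is generated from two independent distributions: samples with t=1 are shown in blue and samples with t=0 in red. The input X is an N*2 matrix and the target t is an N*1 vector. The figure below gives a more intuitive picture.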

# Define and generate the samples
nb_of_samples_per_class = 20  # The number of samples in each class
red_mean = [-1,0]  # The mean of the red class
blue_mean = [1,0]  # The mean of the blue class
std_dev = 1.2  # standard deviation of both classes
# Generate samples from both classes
x_red = np.random.randn(nb_of_samples_per_class, 2) * std_dev + red_mean
x_blue = np.random.randn(nb_of_samples_per_class, 2) * std_dev + blue_mean

# Merge samples in set of input variables x, and corresponding set of output variables t
X = np.vstack((x_red, x_blue))
t = np.vstack((np.zeros((nb_of_samples_per_class,1)), np.ones((nb_of_samples_per_class,1))))

# Plot both classes on the x1, x2 plane
plt.plot(x_red[:,0], x_red[:,1], 'ro', label='class red')
plt.plot(x_blue[:,0], x_blue[:,1], 'bo', label='class blue')
plt.grid()
plt.legend(loc=2)
plt.xlabel('$x_1$', fontsize=15)
plt.ylabel('$x_2$', fontsize=15)
plt.axis([-4, 4, -4, 4])
plt.title('red vs. blue classes in the input space')
plt.show()
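The sampling code above draws different points on every run, so the figures will not match exactly from one run to the next. A minimal sketch of a reproducible variant follows, assuming NumPy's `Generator` API is available. The seed value 42 is an arbitrary choice for illustration and is not part of the original tutorial.

```python
import numpy as np

# Assumption: a fixed seed (42 is arbitrary) so every run draws the same samples
rng = np.random.default_rng(42)

nb_of_samples_per_class = 20  # same values as in the tutorial
red_mean = [-1, 0]
blue_mean = [1, 0]
std_dev = 1.2

# Draw standard normal samples, then scale and shift them per class
x_red = rng.standard_normal((nb_of_samples_per_class, 2)) * std_dev + red_mean
x_blue = rng.standard_normal((nb_of_samples_per_class, 2)) * std_dev + blue_mean

# Stack inputs into an N*2 matrix and targets into an N*1 vector
X = np.vstack((x_red, x_blue))
t = np.vstack((np.zeros((nb_of_samples_per_class, 1)),
               np.ones((nb_of_samples_per_class, 1))))

print(X.shape, t.shape)  # (40, 2) (40, 1)
```

Re-creating the generator with the same seed reproduces the same `x_red` and `x_blue`, which makes it easier to compare plots across runs.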

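The logistic function and the cross-entropy loss function

The logistic function

The goal of our network is to predict the target t from the input x. Suppose the input is x = [x1, x2] and the weights are w = [w1, w2]. The probability P(t = 1|x, w) that the target is t = 1 is then the output y of the network, i.e. y = σ(x∗wT), where σ denotes the logistic function, defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

If the logistic function and its derivative are not yet clear to you, see this tutorial, which describes them in detail.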
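As a quick illustration of the logistic function and its derivative mentioned above, the sketch below compares the well-known analytic derivative σ(z)(1 − σ(z)) with a central finite difference. The helper name `logistic_derivative` and the step size `eps` are assumptions made for this example, not part of the original tutorial.

```python
import numpy as np

def logistic(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1 / (1 + np.exp(-z))

def logistic_derivative(z):
    """Analytic derivative of the logistic function: sigma(z) * (1 - sigma(z))."""
    y = logistic(z)
    return y * (1 - y)

# Compare the analytic derivative with a central finite difference
z = np.linspace(-5, 5, 11)
eps = 1e-6  # hypothetical step size chosen for this check
numeric = (logistic(z + eps) - logistic(z - eps)) / (2 * eps)
max_abs_diff = np.max(np.abs(numeric - logistic_derivative(z)))

print(logistic(0.0))  # 0.5
print(max_abs_diff)   # should be very small
```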

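The cross-entropy loss function

To optimize the loss for this classification problem we use the cross-entropy error function. For each training sample i it is defined as:

$$\xi(t_i, y_i) = -t_i \log(y_i) - (1 - t_i) \log(1 - y_i)$$

To compute the cross-entropy error over the whole training set, we simply sum the values of the individual samples:

$$\xi(t, y) = -\sum_{i=1}^{N} \left[ t_i \log(y_i) + (1 - t_i) \log(1 - y_i) \right]$$

A more detailed introduction to the cross-entropy error function can be found in this tutorial.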
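To make the cross-entropy error concrete, here is a small worked example. The three target/output pairs are made-up values for illustration only. It shows that a confident wrong prediction is penalized much more heavily than a confident correct one.

```python
import numpy as np

# Made-up targets and network outputs, for illustration only
t = np.array([1.0, 1.0, 0.0])
y = np.array([0.9, 0.1, 0.1])

# Cross-entropy error of each sample: -t*log(y) - (1-t)*log(1-y)
per_sample = -(t * np.log(y) + (1 - t) * np.log(1 - y))

# Error over the whole set is the sum over the samples
total = np.sum(per_sample)

print(per_sample)  # second sample (t=1 but y=0.1) has the largest error
print(total)
```

The first and third samples are predicted correctly with high confidence, and each contributes −log(0.9) ≈ 0.105. The second sample is predicted wrongly with high confidence and contributes −log(0.1) ≈ 2.303.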

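The function logistic(z) implements the logistic function, cost(y, t) implements the loss function, nn(x, w) computes the output of the neural network, and nn_predict(x, w) computes the predicted class.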

# Define the logistic function
def logistic(z): 
    return 1 / (1 + np.exp(-z))

# Define the neural network function y = 1 / (1 + numpy.exp(-x*w))
def nn(x, w): 
    return logistic(x.dot(w.T))

# Define the neural network prediction function that only returns
#  1 or 0 depending on the predicted class
def nn_predict(x,w): 
    return np.around(nn(x,w))
    
# Define the cost function
def cost(y, t):
    return - np.sum(np.multiply(t, np.log(y)) + np.multiply((1-t), np.log(1-y)))

# Plot the cost in function of the weights
# Define a vector of weights for which we want to plot the cost
nb_of_ws = 100 # compute the cost nb_of_ws times in each dimension
ws1 = np.linspace(-5, 5, num=nb_of_ws) # weight 1
ws2 = np.linspace(-5, 5, num=nb_of_ws) # weight 2
ws_x, ws_y = np.meshgrid(ws1, ws2) # generate grid
cost_ws = np.zeros((nb_of_ws, nb_of_ws)) # initialize cost matrix
# Fill the cost matrix for each combination of weights
for i in range(nb_of_ws):
    for j in range(nb_of_ws):
        cost_ws[i,j] = cost(nn(X, np.asmatrix([ws_x[i,j], ws_y[i,j]])) , t)
# Plot the cost function surface
plt.contourf(ws_x, ws_y, cost_ws, 20, cmap=cm.pink)
cbar = plt.colorbar()
cbar.ax.set_ylabel('$\\xi$', fontsize=15)
plt.xlabel('$w_1$', fontsize=15)
plt.ylabel('$w_2$', fontsize=15)
plt.title('Cost function surface')
plt.grid()
plt.show()
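The nested loop above evaluates the cost once per grid point. As an alternative sketch (not the tutorial's method), the whole grid can be computed in one vectorized step. This version also swaps in the identity −t·log σ(z) − (1 − t)·log(1 − σ(z)) = log(1 + e^z) − t·z, evaluated with `np.logaddexp`, which avoids taking log(0) when |z| is large. The seeded data generation is an assumption added so the example is self-contained.

```python
import numpy as np

# Assumption: seeded samples with the same means and std_dev as the tutorial
rng = np.random.default_rng(0)
n = 20
X = np.vstack((rng.standard_normal((n, 2)) * 1.2 + [-1, 0],
               rng.standard_normal((n, 2)) * 1.2 + [1, 0]))
t = np.vstack((np.zeros((n, 1)), np.ones((n, 1))))

# Same weight grid as in the tutorial
ws1 = np.linspace(-5, 5, num=100)
ws2 = np.linspace(-5, 5, num=100)
ws_x, ws_y = np.meshgrid(ws1, ws2)

# One row per weight combination, shape (10000, 2)
W = np.column_stack((ws_x.ravel(), ws_y.ravel()))
# Neuron input z for every sample (rows) and every weight pair (columns)
Z = X @ W.T
# Stable cross-entropy: log(1 + exp(z)) - t*z, summed over the samples
cost_grid = np.sum(np.logaddexp(0, Z) - t * Z, axis=0).reshape(ws_x.shape)

# Grid point with the lowest cost
i, j = np.unravel_index(np.argmin(cost_grid), cost_grid.shape)
print(ws_x[i, j], ws_y[i, j], cost_grid[i, j])
```

Here `cost_grid[i, j]` corresponds to the weight pair `(ws_x[i, j], ws_y[i, j])`, the same layout as `cost_ws` in the loop version, so it can be passed to `plt.contourf` in the same way.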

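Optimizing the loss function with gradient descent

Gradient descent works by computing the derivative of the loss function ξ with respect to each parameter and then updating the parameters in the direction of the negative gradient.

The parameter w is updated along the negative gradient, scaled by a learning rate, i.e. w(k+1)=w(k)−Δw(k+1), where Δw can be written as:

$$\Delta w = \mu \frac{\partial \xi}{\partial w}$$

with μ the learning rate. For each training sample i, ∂ξi/∂w is computed as:

$$\frac{\partial \xi_i}{\partial w} = \frac{\partial z_i}{\partial w} \frac{\partial y_i}{\partial z_i} \frac{\partial \xi_i}{\partial y_i}$$

where yi=σ(zi) is the logistic output of the neuron and zi=xi∗wT is the input to the neuron.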

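Before deriving the derivative of the loss function with respect to the weights in detail, we first take a few intermediate results from this tutorial:

$$\frac{\partial \xi_i}{\partial y_i} = \frac{y_i - t_i}{y_i (1 - y_i)}, \qquad \frac{\partial y_i}{\partial z_i} = y_i (1 - y_i), \qquad \frac{\partial z_i}{\partial w} = x_i$$

Combining these steps gives the full derivative:

$$\frac{\partial \xi_i}{\partial w} = x_i \cdot y_i (1 - y_i) \cdot \frac{y_i - t_i}{y_i (1 - y_i)} = x_i (y_i - t_i)$$

The update Δwj for each weight can therefore be written as follows, where μ is the learning rate and x_{ij} is the j-th component of the input x_i:

$$\Delta w_j = \mu \frac{\partial \xi_i}{\partial w_j} = \mu \, x_{ij} (y_i - t_i)$$

In batch processing we sum the gradients of all N samples:

$$\Delta w_j = \mu \sum_{i=1}^{N} x_{ij} (y_i - t_i)$$

Before starting gradient descent, the parameters need to be initialized with random values. They are then updated with gradient descent until convergence.

The function gradient(w, x, t) implements the gradient ∂ξ/∂w, and delta_w(w_k, x, t, learning_rate) implements Δw.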

# define the gradient function.
def gradient(w, x, t):
    return (nn(x, w) - t).T * x

# define the update function delta w which returns the 
#  delta w for each weight in a vector
def delta_w(w_k, x, t, learning_rate):
    return learning_rate * gradient(w_k, x, t)
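A common way to gain confidence in a hand-derived gradient is to compare it with a finite-difference estimate of the cost. The sketch below does this for the batch gradient, the sum over samples of x_i (y_i − t_i). It uses plain NumPy arrays instead of the `np.matrix` objects used in the tutorial, and the random data, the test weights, and the step size `eps` are all assumptions made for the check.

```python
import numpy as np

def logistic(z):
    return 1 / (1 + np.exp(-z))

def cost(w, X, t):
    """Cross-entropy error summed over all samples."""
    y = logistic(X @ w)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))

def gradient(w, X, t):
    """Analytic batch gradient: sum over samples of x_i * (y_i - t_i)."""
    return X.T @ (logistic(X @ w) - t)

# Assumption: small random problem, used only for the check
rng = np.random.default_rng(1)
X = rng.standard_normal((10, 2))
t = (rng.random(10) > 0.5).astype(float)
w = np.array([0.3, -0.2])

# Central finite difference of the cost, one weight at a time
eps = 1e-6
numeric = np.zeros_like(w)
for j in range(w.size):
    step = np.zeros_like(w)
    step[j] = eps
    numeric[j] = (cost(w + step, X, t) - cost(w - step, X, t)) / (2 * eps)

analytic = gradient(w, X, t)
print(analytic)
print(numeric)  # should agree closely with the analytic gradient
```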

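Gradient descent updates

We run 10 gradient descent iterations on the training set X. The figure below shows the first three updates. The blue dots mark the value of the weights w(k) at iteration k.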

# Set the initial weight parameter
w = np.asmatrix([-4, -2])
# Set the learning rate
learning_rate = 0.05

# Start the gradient descent updates and plot the iterations
nb_of_iterations = 10  # Number of gradient descent updates
w_iter = [w]  # List to store the weight values over the iterations
for i in range(nb_of_iterations):
    dw = delta_w(w, X, t, learning_rate)  # Get the delta w update
    w = w-dw  # Update the weights
    w_iter.append(w)  # Store the weights for plotting

# Plot the first weight updates on the error surface
# Plot the error surface
plt.contourf(ws_x, ws_y, cost_ws, 20, alpha=0.9, cmap=cm.pink)
cbar = plt.colorbar()
cbar.ax.set_ylabel('cost')

# Plot the updates
for i in range(1, 4): 
    w1 = w_iter[i-1]
    w2 = w_iter[i]
    # Plot the weight-cost value and the line that represents the update
    plt.plot(w1[0,0], w1[0,1], 'bo')  # Plot the weight cost value
    plt.plot([w1[0,0], w2[0,0]], [w1[0,1], w2[0,1]], 'b-')
    plt.text(w1[0,0]-0.2, w1[0,1]+0.4, '$w({})$'.format(i), color='b')
w1 = w_iter[3]  
# Plot the last weight
plt.plot(w1[0,0], w1[0,1], 'bo')
plt.text(w1[0,0]-0.2, w1[0,1]+0.4, '$w({})$'.format(4), color='b') 
# Show figure
plt.xlabel('$w_1$', fontsize=15)
plt.ylabel('$w_2$', fontsize=15)
plt.title('Gradient descent updates on cost surface')
plt.grid()
plt.show()
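The loop above runs a fixed 10 updates, while the text speaks of updating until convergence. One possible way to express that is sketched below: iterate until the gradient norm drops under a threshold, and record the cost at every step. The names `tolerance` and `max_iterations`, their values, and the seeded data are assumptions for this example. The cost is evaluated in the equivalent form log(1 + e^z) − t·z via `np.logaddexp` to stay numerically stable.

```python
import numpy as np

def logistic(z):
    return 1 / (1 + np.exp(-z))

def cost(w, X, t):
    """Cross-entropy error in the stable form log(1 + exp(z)) - t*z."""
    z = X @ w
    return np.sum(np.logaddexp(0, z) - t * z)

# Assumption: seeded samples with the same means and std_dev as the tutorial
rng = np.random.default_rng(0)
n = 20
X = np.vstack((rng.standard_normal((n, 2)) * 1.2 + [-1, 0],
               rng.standard_normal((n, 2)) * 1.2 + [1, 0]))
t = np.concatenate((np.zeros(n), np.ones(n)))

w = np.array([-4.0, -2.0])  # same starting point as the tutorial
learning_rate = 0.05        # same learning rate as the tutorial
tolerance = 1e-6            # hypothetical stopping threshold
max_iterations = 10000      # hypothetical safety cap

costs = [cost(w, X, t)]
for k in range(max_iterations):
    grad = X.T @ (logistic(X @ w) - t)  # batch gradient
    w = w - learning_rate * grad        # step along the negative gradient
    costs.append(cost(w, X, t))
    if np.linalg.norm(grad) < tolerance:
        break

print(len(costs) - 1, w, costs[0], costs[-1])
```

If the two classes happened to be perfectly separable, the weights would keep growing and the loop would stop at `max_iterations` rather than at the tolerance, which is why the cap is included.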

Visualizing the training result

The code below visualizes the result of the training.

# Plot the resulting decision boundary
# Generate a grid over the input space to plot the color of the
#  classification at that grid point
nb_of_xs = 200
xs1 = np.linspace(-4, 4, num=nb_of_xs)
xs2 = np.linspace(-4, 4, num=nb_of_xs)
xx, yy = np.meshgrid(xs1, xs2) # create the grid
# Initialize and fill the classification plane
classification_plane = np.zeros((nb_of_xs, nb_of_xs))
for i in range(nb_of_xs):
    for j in range(nb_of_xs):
        classification_plane[i,j] = nn_predict(np.asmatrix([xx[i,j], yy[i,j]]), w)[0, 0]  # take the scalar out of the 1x1 matrix
# Create a color map to show the classification colors of each grid point
cmap = ListedColormap([
        colorConverter.to_rgba('r', alpha=0.30),
        colorConverter.to_rgba('b', alpha=0.30)])

# Plot the classification plane with decision boundary and input samples
plt.contourf(xx, yy, classification_plane, cmap=cmap)
plt.plot(x_red[:,0], x_red[:,1], 'ro', label='target red')
plt.plot(x_blue[:,0], x_blue[:,1], 'bo', label='target blue')
plt.grid()
plt.legend(loc=2)
plt.xlabel('$x_1$', fontsize=15)
plt.ylabel('$x_2$', fontsize=15)
plt.title('red vs. blue classification boundary')
plt.show()
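Besides plotting the decision boundary, the predictions can be summarized as a training accuracy. The sketch below uses made-up weights and samples purely for illustration. They are not the values the tutorial's training run produces. Because the model has no bias term, the boundary is the line x∗wT = 0 through the origin.

```python
import numpy as np

def logistic(z):
    return 1 / (1 + np.exp(-z))

def nn_predict(X, w):
    """Round the network output to the predicted class, 0 or 1."""
    return np.around(logistic(X @ w))

# Hypothetical weights and samples, for illustration only
w = np.array([2.0, 0.0])
X = np.array([[-1.5, 0.3],
              [-0.2, -1.0],
              [0.8, 0.5],
              [2.0, -0.7],
              [-0.1, 2.0]])
t = np.array([0, 0, 1, 1, 1])

pred = nn_predict(X, w)
accuracy = np.mean(pred == t)

print(pred)      # [0. 0. 1. 1. 0.]
print(accuracy)  # 0.8
```

With these weights only the sign of x1 matters, so the last sample (x1 = −0.1, target 1) falls on the wrong side of the boundary and is the one misclassified point.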

For the complete code, click here.

