【CS231n】Spring 2020 Assignments - Assignment1 - Two Layer Net
Preface
I did this assignment on Google Colaboratory, which was quite convenient.
Note: this assignment is meant to be done after finishing the convolutional neural network lectures.
1. Implementing a Neural Network
1.1. TwoLayerNet.loss
1.1.1. Scores
First compute the scores, i.e., the forward pass. It is straightforward, but don't forget that there is a non-linearity (ReLU) between the two fully-connected layers. The code:
# First fully-connected layer followed by a ReLU non-linearity
XW1 = X.dot(W1)
XW1pb1 = XW1 + b1
H = np.maximum(0, XW1pb1)
# Second fully-connected layer produces the class scores
HW2 = H.dot(W2)
HW2pb2 = HW2 + b2
scores = HW2pb2
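As a quick sanity check of the shapes, here is a tiny self-contained sketch (the sizes are made up, not from the assignment):
import numpy as np

# Hypothetical tiny sizes: N samples, D input dims, Hn hidden units, C classes
N, D, Hn, C = 5, 4, 10, 3
X = np.random.randn(N, D)
W1, b1 = np.random.randn(D, Hn), np.zeros(Hn)
W2, b2 = np.random.randn(Hn, C), np.zeros(C)
# Same forward pass as above: FC -> ReLU -> FC
scores = np.maximum(0, X.dot(W1) + b1).dot(W2) + b2
assert scores.shape == (N, C)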
1.1.2. Loss
Next, compute the loss. In the earlier assignment we already derived the Softmax loss (below), and it is much the same here:
$L_i = -\log\left(\frac{e^{s_{y_i}}}{\sum_{j}e^{s_j}}\right)$
Note, however, that we now have two weight matrices, W1 and W2, so the regularization penalty has two terms.
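For reference, with reg denoting the regularization strength, the total loss computed by the code below is
$L = \frac{1}{N}\sum_{i} L_i + \text{reg}\left(\sum W_1^2 + \sum W_2^2\right)$
(there is no 1/2 factor on the regularization term, which is why the backward pass later multiplies by 2).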
# Row-wise softmax probabilities, shape (N, C)
positive_scores = np.exp(scores)
softmax = positive_scores / np.sum(positive_scores, axis=1).reshape((N, 1))
# Average cross-entropy loss over the N training examples
loss = np.sum((-np.log(softmax))[range(N), y])
loss /= N
# L2 regularization penalty on both weight matrices
loss += (reg * np.sum(W1*W1) + reg * np.sum(W2*W2))
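As an aside, exponentiating raw scores can overflow when they are large. A common fix (a sketch, not required by the grader) is to subtract the per-row maximum before exponentiating; the resulting probabilities are unchanged:
# Numerically stable softmax: shifting the scores per row does not change the result
shifted = scores - np.max(scores, axis=1, keepdims=True)
positive_scores = np.exp(shifted)
softmax = positive_scores / np.sum(positive_scores, axis=1, keepdims=True)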
1.1.3. dW
Computing the gradient matrices dW looks quite involved, but with a bit of care it is manageable.
The overall idea is to draw the computational graph and backpropagate step by step using the chain rule. Do not try to brute-force a closed-form derivative of the loss with respect to every parameter!
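The only local gradient that takes a little thought is the softmax loss with respect to the scores. For example i and class j it is
$\frac{\partial L}{\partial s_{ij}} = \frac{1}{N}\left(p_{ij} - \mathbb{1}[j = y_i]\right), \quad p_{ij} = \frac{e^{s_{ij}}}{\sum_{k} e^{s_{ik}}}$
which is exactly what the dsoftmax lines in the code below compute.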
The computational graph is as follows:
The full code is as follows:
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)
db1 = np.zeros_like(b1)
db2 = np.zeros_like(b2)
# Regularization term: reg * np.sum(W1*W1) + reg * np.sum(W2*W2)
dW1 += 2 * reg * W1
dW2 += 2 * reg * W2
# Gradient of the softmax loss w.r.t. the scores (H * W2 + b2)
softmax[range(N), y] -= 1.0
dsoftmax = softmax / N
# b2
db2 += np.sum(dsoftmax, axis=0)
# W2
dW2 += H.T.dot(dsoftmax)
# H = X * W1 + b1
dH = dsoftmax.dot(W2.T)
# Max gate
dH[np.where(H <= 0)] = 0.0
# b1
db1 += np.sum(dH, axis=0)
# W1
dW1 += X.T.dot(dH)
grads['W1'] = dW1
grads['b1'] = db1
grads['W2'] = dW2
grads['b2'] = db2
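Before moving on, it is worth running the notebook's gradient check to compare these analytic gradients with numeric ones. A sketch, assuming a TwoLayerNet instance net and toy data X, y as in the notebook, and the course-provided eval_numerical_gradient helper:
from cs231n.gradient_check import eval_numerical_gradient

loss, grads = net.loss(X, y, reg=0.05)
for param_name in grads:
    # eval_numerical_gradient perturbs net.params[param_name] in place,
    # so the lambda just re-evaluates the loss
    f = lambda W: net.loss(X, y, reg=0.05)[0]
    param_grad_num = eval_numerical_gradient(f, net.params[param_name], verbose=False)
    rel_err = np.max(np.abs(param_grad_num - grads[param_name]) /
                     np.maximum(1e-8, np.abs(param_grad_num) + np.abs(grads[param_name])))
    print('%s max relative error: %e' % (param_name, rel_err))
All four relative errors should come out very small if the backward pass above is correct.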
1.2. TwoLayerNet.train
This part is much the same as in the earlier assignments, so I won't go into detail.
1.2.1. Create a random minibatch of training data and labels
# Sample batch_size indices with replacement, then gather the minibatch
indices = np.random.choice(num_train, batch_size, replace=True)
X_batch = X[indices]
y_batch = y[indices]
1.2.2. Update the parameters
# Vanilla SGD step on every learnable parameter
for param in self.params:
    self.params[param] -= learning_rate * grads[param]
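One detail worth remembering: the vanilla SGD step above is paired with the learning-rate decay that the provided train() skeleton applies once per epoch, roughly:
# Done by the provided skeleton at the end of every epoch
if it % iterations_per_epoch == 0:
    learning_rate *= learning_rate_decay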
1.3. TwoLayerNet.predict
This part is much the same as in the earlier assignments, so I won't go into detail.
# With y=None, loss() simply returns the scores; pick the highest-scoring class
scores = self.loss(X)
y_pred = scores.argmax(axis=1)
2. Tune Your Hyperparameters
Because there are quite a few hyperparameters, jointly sweeping all of them at once to find reasonable ranges is not practical. My approach was to first search each hyperparameter individually for the range that gives the highest score (see the comments next to each parameter below), usually over about 3 rounds, until I had a fairly small range for each one. Then I combined these four small ranges in a joint search. In the end this reached 0.529 accuracy on the test set, which is not bad.
Here is the code:
best_acc = 0.0
best_lr = 0.0
best_bs = 0
best_r = 0.0
num_expriment = 5
hs = 78 # 76~80, 78 is best
lr = 7.4e-4 # 6.6e-4~8.2e-4, 7.4e-4 is best
r = 0.32 # 0.30~0.34, 0.32 is best
bs = 550 # 550~650, 550 is best
hs_rate = (80 - 76) / (num_expriment - 1)
lr_rate = (8.2e-4 - 6.6e-4) / (num_expriment - 1)
r_rate = (0.34 - 0.30) / (num_expriment - 1)
bs_rate = (650 - 550) / (num_expriment - 1)
for hsi in range(num_expriment):
    for bsi in range(num_expriment):
        for lri in range(num_expriment):
            for ri in range(num_expriment):
                input_size = 32 * 32 * 3
                hidden_size = int(hs + hsi * hs_rate)
                num_classes = 10
                net = TwoLayerNet(input_size, hidden_size, num_classes)
                # Train the network
                stats = net.train(X_train, y_train, X_val, y_val,
                                  num_iters=3000, batch_size=int(bs + bsi * bs_rate),
                                  learning_rate=(lr + lri * lr_rate), learning_rate_decay=0.95,
                                  reg=(r + ri * r_rate), verbose=False)
                # Predict on the validation set
                val_acc = (net.predict(X_val) == y_val).mean()
                print('hs: ', hidden_size, '; lr: ', (lr + lri * lr_rate), '; r: ', (r + ri * r_rate), '; bs: ', int(bs + bsi * bs_rate), '; VA: ', val_acc)
                if val_acc > best_acc:
                    best_acc = val_acc
                    best_net = net
                    best_bs = int(bs + bsi * bs_rate)
                    best_lr = lr + lri * lr_rate
                    best_r = r + ri * r_rate
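An alternative I did not try here is random search over the same narrowed ranges; it often covers the space more efficiently than a fixed grid. A rough sketch using the same TwoLayerNet API (num_trials is an arbitrary budget):
num_trials = 50  # hypothetical search budget
for _ in range(num_trials):
    hs_i = np.random.randint(76, 81)
    lr_i = np.random.uniform(6.6e-4, 8.2e-4)
    r_i = np.random.uniform(0.30, 0.34)
    bs_i = np.random.randint(550, 651)
    net = TwoLayerNet(32 * 32 * 3, hs_i, 10)
    net.train(X_train, y_train, X_val, y_val,
              num_iters=3000, batch_size=bs_i,
              learning_rate=lr_i, learning_rate_decay=0.95,
              reg=r_i, verbose=False)
    val_acc = (net.predict(X_val) == y_val).mean()
    if val_acc > best_acc:
        best_acc, best_net = val_acc, net
        best_lr, best_r, best_bs = lr_i, r_i, bs_i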
Summary
What this assignment mainly tests is the idea of backpropagation, and that really is important!