【CS231n】Spring 2020 Assignments - Assignment1 - Two Layer Net
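Preface
I did this assignment on Google Colaboratory, which turned out to be quite convenient.
Note: this assignment is meant to be done after the lecture on convolutional neural networks.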
1. Implementing a Neural Network
1.1. TwoLayerNet.loss
1.1.1. Scores
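First compute the scores, i.e. the forward pass. It is simple, but don't forget that there is a nonlinearity (ReLU) between the two fully connected layers. Here is the code: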
XW1 = X.dot(W1)
XW1pb1 = XW1 + b1
H = np.maximum(0, XW1pb1)
HW2 = H.dot(W2)
HW2pb2 = HW2 + b2
scores = HW2pb2
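To make the shapes concrete, here is a minimal self-contained sketch of the same forward pass on random toy data. The helper name `forward` and the sizes (N=5, D=4, hidden=10, C=3) are assumptions chosen for illustration, not part of the assignment code.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """Two-layer forward pass: affine -> ReLU -> affine."""
    # (N, D) @ (D, Hd) -> (N, Hd); b1 broadcasts across the rows
    H = np.maximum(0, X.dot(W1) + b1)
    # (N, Hd) @ (Hd, C) -> (N, C)
    scores = H.dot(W2) + b2
    return H, scores

rng = np.random.default_rng(0)
N, D, Hd, C = 5, 4, 10, 3  # toy sizes, chosen arbitrarily
X = rng.standard_normal((N, D))
W1 = 0.1 * rng.standard_normal((D, Hd))
b1 = np.zeros(Hd)
W2 = 0.1 * rng.standard_normal((Hd, C))
b2 = np.zeros(C)

H, scores = forward(X, W1, b1, W2, b2)
print(H.shape, scores.shape)
```

After the ReLU every entry of `H` is non-negative, and `scores` has one row per example and one column per class.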
1.1.2. Loss
Next comes the loss. In the previous assignment we already derived the Softmax loss (shown below), and this one is almost the same:
$$L_i = -\log\left(\frac{e^{s_{y_i}}}{\sum_{j} e^{s_j}}\right)$$
Note, however, that we now have two weight matrices, W1 and W2, so the regularization penalty has two terms.
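# Shift each row so its largest score is 0; this avoids overflow in np.exp
shifted_scores = scores - np.max(scores, axis=1, keepdims=True)
positive_scores = np.exp(shifted_scores)
softmax = positive_scores / np.sum(positive_scores, axis=1, keepdims=True)
# Average cross-entropy of the correct classes
loss = np.sum(-np.log(softmax[range(N), y])) / N
# L2 regularization on both weight matrices
loss += reg * (np.sum(W1 * W1) + np.sum(W2 * W2))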
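Written out as a formula, the total loss computed by the code above is the mean of the per-example Softmax losses plus one L2 penalty per weight matrix. The symbol $\lambda$ stands for `reg` and $N$ for the number of examples; the index notation for the matrix entries is an assumption made here for readability.

```latex
L = \frac{1}{N}\sum_{i=1}^{N} L_i
    + \lambda \left( \sum_{k,l} (W_1)_{kl}^2 + \sum_{k,l} (W_2)_{kl}^2 \right)
```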
1.1.3. dW
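Computing the gradients looks complicated, but with a little care it is manageable.
The overall approach is to draw the computational graph and backpropagate through it step by step with the chain rule. Whatever you do, don't try to derive a closed-form derivative of the loss with respect to every parameter by hand!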
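As a rough guide, the chain-rule steps can be written as follows. The notation is an assumption introduced here: $A = XW_1 + b_1$ is the pre-activation, $H = \max(0, A)$, $S = HW_2 + b_2$ are the scores, $p_{ij}$ is the Softmax probability of class $j$ for example $i$, and $\lambda$ is `reg`.

```latex
\begin{aligned}
\frac{\partial L}{\partial S_{ij}} &= \frac{1}{N}\left(p_{ij} - \mathbb{1}[j = y_i]\right) \\
\frac{\partial L}{\partial W_2} &= H^{\top}\frac{\partial L}{\partial S} + 2\lambda W_2,
\qquad
\frac{\partial L}{\partial b_2} = \sum_{i}\frac{\partial L}{\partial S_{i}} \\
\frac{\partial L}{\partial H} &= \frac{\partial L}{\partial S}\,W_2^{\top},
\qquad
\frac{\partial L}{\partial A} = \frac{\partial L}{\partial H}\odot \mathbb{1}[A > 0] \\
\frac{\partial L}{\partial W_1} &= X^{\top}\frac{\partial L}{\partial A} + 2\lambda W_1,
\qquad
\frac{\partial L}{\partial b_1} = \sum_{i}\frac{\partial L}{\partial A_{i}}
\end{aligned}
```

Each line corresponds to one node of the computational graph, visited from the loss back toward the input.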
The computational graph is as follows:
[Figure: computational graph]
The complete code is as follows:
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)
db1 = np.zeros_like(b1)
db2 = np.zeros_like(b2)
# loss = data_loss + reg * (np.sum(W1*W1) + np.sum(W2*W2))
dW1 += 2 * reg * W1
dW2 += 2 * reg * W2
# scores = H.dot(W2) + b2: gradient of the Softmax loss with respect to the scores
softmax[range(N), y] -= 1.0
dsoftmax = softmax / N
# b2
db2 += np.sum(dsoftmax, axis=0)
# W2
dW2 += H.T.dot(dsoftmax)
# H = max(0, X.dot(W1) + b1): backpropagate into the hidden layer
dH = dsoftmax.dot(W2.T)
# Max gate (ReLU): the gradient passes only where the unit was active
dH[H <= 0] = 0.0
# b1
db1 += np.sum(dH, axis=0)
# W1
dW1 += X.T.dot(dH)
grads['W1'] = dW1
grads['b1'] = db1
grads['W2'] = dW2
grads['b2'] = db2
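A common way to confirm that analytic gradients like these are right is to compare them against central-difference numerical gradients. The sketch below is self-contained and runs on random toy data. The function names `loss_and_grads` and `numeric_grad`, the toy sizes, and the `reg` value are all assumptions for illustration; it is not the checker shipped with the assignment.

```python
import numpy as np

def loss_and_grads(params, X, y, reg):
    """Softmax loss and analytic gradients of a two-layer ReLU network."""
    W1, b1, W2, b2 = params['W1'], params['b1'], params['W2'], params['b2']
    N = X.shape[0]
    H = np.maximum(0, X.dot(W1) + b1)
    scores = H.dot(W2) + b2
    scores = scores - scores.max(axis=1, keepdims=True)  # shift for stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    loss += reg * (np.sum(W1 * W1) + np.sum(W2 * W2))

    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1.0
    dscores /= N
    dH = dscores.dot(W2.T)
    dH[H <= 0] = 0.0  # ReLU gate
    grads = {
        'W2': H.T.dot(dscores) + 2 * reg * W2,
        'b2': dscores.sum(axis=0),
        'W1': X.T.dot(dH) + 2 * reg * W1,
        'b1': dH.sum(axis=0),
    }
    return loss, grads

def numeric_grad(f, x, h=1e-5):
    """Central-difference gradient of the scalar f() with respect to array x.

    x is perturbed in place, one entry at a time, and restored afterwards.
    """
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + h
        f_plus = f()
        x[idx] = old - h
        f_minus = f()
        x[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
N, D, Hd, C = 5, 4, 10, 3  # toy sizes, chosen arbitrarily
X = rng.standard_normal((N, D))
y = rng.integers(0, C, size=N)
params = {
    'W1': 0.1 * rng.standard_normal((D, Hd)), 'b1': np.zeros(Hd),
    'W2': 0.1 * rng.standard_normal((Hd, C)), 'b2': np.zeros(C),
}
reg = 0.05

_, grads = loss_and_grads(params, X, y, reg)
max_abs_err = {}
for name in params:
    num = numeric_grad(lambda: loss_and_grads(params, X, y, reg)[0], params[name])
    max_abs_err[name] = np.max(np.abs(num - grads[name]))
    print(name, 'max abs difference:', max_abs_err[name])
```

If the backward pass is correct, the differences should be tiny compared with the gradients themselves. A large difference for a single parameter usually points to the step of the graph where that parameter enters.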
1.2. TwoLayerNet.train
This part is much the same as in the previous assignments, so I won't go into detail.
1.2.1. Create a random minibatch of training data and labels
indices = np.random.choice(num_train, batch_size, replace=True)
X_batch = X[indices]
y_batch = y[indices]
1.2.2. Update the parameters
for param in self.params:
    self.params[param] -= learning_rate * grads[param]
1.3. TwoLayerNet.predict
This part is also much the same as in the previous assignments, so I won't go into detail.
scores = self.loss(X)
y_pred = scores.argmax(axis=1)
2. Tune Your Hyperparameters
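Because there are many hyperparameters, it is not realistic to search over all of them jointly from the start. My approach was to first find, for each hyperparameter on its own, the range that gives the highest score (see the comment next to each parameter below). This usually took about 3 rounds and left a fairly small range. I then combined these four small ranges in a joint search. The final result was an accuracy of 0.529 on the test set, which is not bad.
Here is the code: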
best_acc = 0.0
best_net = None
best_lr = 0.0
best_bs = 0
best_r = 0.0
num_experiment = 5
hs = 78  # 76~80, 78 is best
lr = 7.4e-4  # 6.6e-4~8.2e-4, 7.4e-4 is best
r = 0.32  # 0.30~0.34, 0.32 is best
bs = 550  # 550~650, 550 is best
# Step size of each sweep
hs_rate = (80 - 76) / (num_experiment - 1)
lr_rate = (8.2e-4 - 6.6e-4) / (num_experiment - 1)
r_rate = (0.34 - 0.30) / (num_experiment - 1)
bs_rate = (650 - 550) / (num_experiment - 1)
input_size = 32 * 32 * 3
num_classes = 10
for hsi in range(num_experiment):
    for bsi in range(num_experiment):
        for lri in range(num_experiment):
            for ri in range(num_experiment):
                # Layer sizes and batch sizes must be integers
                hidden_size = int(hs + hsi * hs_rate)
                batch_size = int(bs + bsi * bs_rate)
                learning_rate = lr + lri * lr_rate
                reg = r + ri * r_rate
                net = TwoLayerNet(input_size, hidden_size, num_classes)
                # Train the network
                stats = net.train(X_train, y_train, X_val, y_val,
                                  num_iters=3000, batch_size=batch_size,
                                  learning_rate=learning_rate, learning_rate_decay=0.95,
                                  reg=reg, verbose=False)
                # Predict on the validation set
                val_acc = (net.predict(X_val) == y_val).mean()
                print('hs: ', hidden_size, '; lr: ', learning_rate, '; r: ', reg, '; bs: ', batch_size, '; VA: ', val_acc)
                if val_acc > best_acc:
                    best_acc = val_acc
                    best_net = net
                    best_bs = batch_size
                    best_lr = learning_rate
                    best_r = reg
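The four nested loops can also be flattened into a single loop over all combinations. Below is a minimal sketch using `itertools.product` and `np.linspace`, assuming the ranges noted in the comments above (76~80, 6.6e-4~8.2e-4, 0.30~0.34, 550~650) and five values per hyperparameter. The `grid` and `configs` names are hypothetical, and the sketch only enumerates the settings without training anything.

```python
import itertools
import numpy as np

num_experiment = 5

# Candidate values per hyperparameter (ranges taken from the comments above)
grid = {
    'hidden_size': np.rint(np.linspace(76, 80, num_experiment)).astype(int),
    'learning_rate': np.linspace(6.6e-4, 8.2e-4, num_experiment),
    'reg': np.linspace(0.30, 0.34, num_experiment),
    'batch_size': np.rint(np.linspace(550, 650, num_experiment)).astype(int),
}

# Cartesian product of all candidate values, one dict per configuration
names = list(grid)
configs = [dict(zip(names, values))
           for values in itertools.product(*(grid[n] for n in names))]

print(len(configs), 'configurations')
print('first:', configs[0])
print('last:', configs[-1])
```

Each entry of `configs` could then be passed to the network constructor and `train` call in place of the per-loop arithmetic, with `hidden_size` and `batch_size` converted via `int(...)`.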
Summary
This assignment mainly tests your understanding of backpropagation, which is really important!