The Love-Hate Relationship Between PyTorch and TensorFlow: Defining Trainable Parameters

Posted by 西西嘛呦 on 2020-10-06

PyTorch version: 1.6.0

TensorFlow version: 1.15.0

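We have already covered variables in PyTorch and TensorFlow. In this post we take a closer look at trainable parameters, that is, variables whose values are updated during training.

We will work through the examples using the iris dataset that ships with sklearn.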

1. PyTorch

(1) First approach: define the parameters by hand, without using nn.Module or nn.Sequential() to build the model.

Load the dataset and convert it to tensors:

import torch
import torch.nn.functional as F
import numpy as np
from sklearn.datasets import load_iris
iris = load_iris()
data=iris.data
target = iris.target

data = torch.from_numpy(data).float()  # shape (150, 4)
target = torch.from_numpy(target).long()  # shape (150,), integer class labels
batch_size=data.shape[0]  # use the whole dataset as a single batch
dataset = torch.utils.data.TensorDataset(data, target)      # wrap the tensors in a dataset
train_iter = torch.utils.data.DataLoader(dataset, batch_size, shuffle=True) # controls how batches are drawn
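Because batch_size equals the number of samples, the DataLoader yields exactly one batch per epoch. A minimal sketch that checks this, using random stand-in tensors with the same shapes as iris (the names features and labels are illustrative, not from the original code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with the same shapes as iris: 150 samples, 4 features, 3 classes.
features = torch.randn(150, 4)
labels = torch.randint(0, 3, (150,))

dataset = TensorDataset(features, labels)
loader = DataLoader(dataset, batch_size=len(dataset), shuffle=True)

batches = list(loader)          # one pass over the loader = one epoch
num_batches = len(batches)
x_batch, y_batch = batches[0]
print(num_batches, tuple(x_batch.shape), tuple(y_batch.shape))  # 1 (150, 4) (150,)
```

With a single full batch, shuffle=True only changes the order of the rows inside that batch.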

Define the parameters to be trained ourselves:

classes = 3
input = 4
hidden = 10

w_0 = torch.tensor(np.random.normal(0, 0.01, (input, hidden)), dtype=torch.float)
b_0 = torch.zeros(hidden, dtype=torch.float)
w_1 = torch.tensor(np.random.normal(0, 0.01, (hidden, classes)), dtype=torch.float)
b_1 = torch.zeros(classes, dtype=torch.float)

We can pass requires_grad=True when creating a parameter to make it trainable, or switch it on afterwards as follows:

params = [w_0, b_0, w_1, b_1]
for param in params:
    param.requires_grad_(requires_grad=True)
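The two options can be compared side by side. This is a small sketch with illustrative tensor names (w_a, w_b), not part of the original training script:

```python
import torch

# Option A: mark the tensor as trainable when it is created.
w_a = torch.zeros(4, 10, requires_grad=True)

# Option B: create it first, then switch gradient tracking on in place.
w_b = torch.zeros(4, 10)
flag_before = w_b.requires_grad
w_b.requires_grad_(True)

print(w_a.requires_grad, flag_before, w_b.requires_grad)  # True False True
```

Either way the tensor ends up as a leaf tensor that autograd tracks, so loss.backward() will populate its .grad attribute.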

Define the learning rate, optimizer, loss function, and network:

lr = 5
optimizer = None
criterion = torch.nn.CrossEntropyLoss()
epoch = 1000

def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size # note: update through param.data so autograd does not track the update

def net(x):
    h = torch.matmul(x,w_0)+b_0
    h = F.relu(h)
    output = torch.matmul(h,w_1)+b_1
    #output = F.softmax(output,dim=1)
    return output
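The softmax line in net is commented out because torch.nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax internally. A minimal sketch that checks this equivalence on random stand-in logits (the tensors here are illustrative, not the iris outputs):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 3)               # raw network outputs for 5 samples, 3 classes
labels = torch.tensor([0, 2, 1, 1, 0])

# Built-in loss applied directly to the logits.
loss_builtin = torch.nn.CrossEntropyLoss()(logits, labels)

# The same quantity written out: log-softmax followed by negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

same = torch.allclose(loss_builtin, loss_manual)
print(same)  # True
```

Applying softmax inside net and then CrossEntropyLoss on top would therefore normalize the scores twice.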

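To make the parameter-training process clearer, we do not use PyTorch's built-in optimizer here. Instead we use the gradient descent update we wrote ourselves.

Define the main training function: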

def train(net,params,lr,train_iter):
    for i in range(1,epoch+1):
        for x,y in train_iter:
            output = net(x)
            loss = criterion(output,y) 
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            loss.backward()
            if optimizer is None:
                sgd(params, lr, batch_size)
            else:
                optimizer.step()  # used once a built-in optimizer is supplied
            acc = (output.argmax(dim=1) == y).sum().item() / data.shape[0]
            print("epoch:{:03d} loss:{:.4f} acc:{:.4f}".format(i,loss.item(),acc))
train(net=net,params=params,lr=lr,train_iter=train_iter)

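From this we can also see what optimizer.zero_grad() and optimizer.step() do: the former clears the accumulated gradients, and the latter applies the parameter update. That completes the hand-written training procedure. The last few lines of output are: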

epoch:994 loss:0.0928 acc:0.9800
epoch:995 loss:0.0927 acc:0.9800
epoch:996 loss:0.0926 acc:0.9800
epoch:997 loss:0.0926 acc:0.9800
epoch:998 loss:0.0925 acc:0.9800
epoch:999 loss:0.0925 acc:0.9800
epoch:1000 loss:0.0924 acc:0.9800
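The correspondence between the hand-written steps and optimizer.zero_grad() / optimizer.step() can be sketched on a toy problem. The quadratic loss and the names w_manual and w_optim are assumptions made for illustration. Unlike the sgd function above, this sketch does not divide by the batch size, so both versions use the same learning rate:

```python
import torch

torch.manual_seed(0)
w_manual = torch.randn(3, requires_grad=True)
w_optim = w_manual.detach().clone().requires_grad_(True)  # identical starting point
lr = 0.1

def loss_fn(w):
    # Toy loss chosen only so that the gradient is easy to reason about.
    return (w ** 2).sum()

optimizer = torch.optim.SGD([w_optim], lr=lr)

for _ in range(3):
    # Hand-written version: clear old gradients, backpropagate, update.
    if w_manual.grad is not None:
        w_manual.grad.zero_()
    loss_fn(w_manual).backward()
    with torch.no_grad():
        w_manual -= lr * w_manual.grad

    # Built-in version: the same three steps through the optimizer.
    optimizer.zero_grad()
    loss_fn(w_optim).backward()
    optimizer.step()

match = torch.allclose(w_manual, w_optim)
print(match)  # True
```

Plain torch.optim.SGD without momentum or weight decay performs exactly the update param -= lr * grad, which is why the two copies stay in step.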

(2) Build the model with nn.Sequential() and initialize its parameters:

Import the required packages and load the dataset:

import torch
import torch.nn as nn
import torch.nn.init as init
import torch.nn.functional as F
import numpy as np
from sklearn.datasets import load_iris
iris = load_iris()
data=iris.data
target = iris.target

Convert the data to PyTorch tensors:

data = torch.from_numpy(data).float()
target = torch.from_numpy(target).long()
batch_size=data.shape[0]
dataset = torch.utils.data.TensorDataset(data, target)      # wrap the tensors in a dataset
train_iter = torch.utils.data.DataLoader(dataset, batch_size, shuffle=True) # controls how batches are drawn

Define the hyperparameters:

classes = 3
input = 4
hidden = 10
lr = 4
optimizer = None

Define the network:

net = nn.Sequential(
    nn.Linear(input,hidden),
    nn.ReLU(),
    nn.Linear(hidden,classes),
)

Initialize the parameters:

for name,param in net.named_parameters(): # named_parameters() yields each parameter's name together with the parameter itself
    if "weight" in name:
        init.normal_(param, mean=0, std=0.01)
    if "bias" in name:
        init.zeros_(param)
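To see which names the "weight" and "bias" checks match against, the names can be listed directly. A minimal sketch that rebuilds the same nn.Sequential layout with literal sizes (4, 10, 3):

```python
import torch
import torch.nn as nn
import torch.nn.init as init

net = nn.Sequential(
    nn.Linear(4, 10),
    nn.ReLU(),
    nn.Linear(10, 3),
)

for name, param in net.named_parameters():
    if "weight" in name:
        init.normal_(param, mean=0, std=0.01)
    if "bias" in name:
        init.zeros_(param)

names = [name for name, _ in net.named_parameters()]
biases_are_zero = all(
    bool((param == 0).all())
    for name, param in net.named_parameters()
    if "bias" in name
)
print(names)            # ['0.weight', '0.bias', '2.weight', '2.bias']
print(biases_are_zero)  # True
```

Inside nn.Sequential the layers are named by their position, and the ReLU at index 1 contributes no parameters, so the indices jump from 0 to 2.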

A hand-written stochastic gradient descent optimizer:

def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size # note: update through param.data so autograd does not track the update

Main training loop:

epoch = 1000
criterion = torch.nn.CrossEntropyLoss()
def train(net,lr,train_iter):
    for i in range(1,epoch+1):
        for x,y in train_iter:
            output = net(x)
            loss = criterion(output,y) 
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif net.parameters() is not None:
                for param in net.parameters():
                    if param.grad is not None:
                        param.grad.data.zero_()
            loss.backward()
            if optimizer is None:
                sgd(net.parameters(), lr, batch_size)
            else:
                optimizer.step()  # used once a built-in optimizer is supplied
            acc = (output.argmax(dim=1) == y).sum().item() / data.shape[0]
            print("epoch:{:03d} loss:{:.4f} acc:{:.4f}".format(i,loss.item(),acc))
    return 
train(net=net,lr=lr,train_iter=train_iter)


(3) Use PyTorch's built-in optimizer

We only need to set optimizer as follows:

optimizer = torch.optim.SGD(net.parameters(), lr=0.05)

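Note that the learning rate here has to be smaller than the one used above: the hand-written sgd divides the gradient by batch_size, whereas torch.optim.SGD does not.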
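The difference in learning rates is smaller than it looks once the division by batch_size in the hand-written sgd is taken into account. A quick back-of-the-envelope check, using the values from part (2):

```python
batch_size = 150   # the whole iris dataset is used as one batch
lr_custom = 4      # learning rate passed to the hand-written sgd in part (2)

# The hand-written update is param -= lr * grad / batch_size,
# so the step size actually applied to the gradient is lr / batch_size.
effective_lr = lr_custom / batch_size
print(round(effective_lr, 4))  # 0.0267
```

That effective value is of the same order as the lr=0.05 passed to torch.optim.SGD.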

(4) Build the network with nn.Module, defining and initializing the parameters ourselves

Only the following needs to change:

class Net(nn.Module):
    def __init__(self,input,hidden,classes):
        super(Net, self).__init__()
        self.input = input
        self.hidden = hidden
        self.classes = classes
        
        self.w0 = nn.Parameter(torch.Tensor(self.input,self.hidden))
        self.b0 = nn.Parameter(torch.Tensor(self.hidden))
        self.w1 = nn.Parameter(torch.Tensor(self.hidden,self.classes))
        self.b1 = nn.Parameter(torch.Tensor(self.classes))
        self.reset_parameters()
        
    def reset_parameters(self):
        nn.init.normal_(self.w0)
        nn.init.constant_(self.b0,0)
        nn.init.normal_(self.w1)
        nn.init.constant_(self.b1,0)
        
        
    def forward(self,x):
        out = torch.matmul(x,self.w0)+self.b0
        out = F.relu(out)
        out = torch.matmul(out,self.w1)+self.b1
        return out
net = Net(input,hidden,classes)
optimizer = torch.optim.SGD(net.parameters(), lr=0.05)
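Wrapping the tensors in nn.Parameter is what makes net.parameters() find them. A minimal sketch with a hypothetical Demo module contrasts this with a plain tensor attribute:

```python
import torch
import torch.nn as nn

class Demo(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # Registered automatically because it is an nn.Parameter.
        self.w = nn.Parameter(torch.zeros(4, 10))
        # A plain tensor attribute is NOT registered, even with requires_grad=True.
        self.plain = torch.zeros(10, requires_grad=True)

demo = Demo()
registered = [name for name, _ in demo.named_parameters()]
print(registered)  # ['w']
```

An optimizer built from demo.parameters() would therefore update w but never touch plain.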


(5) Build the network with nn.Module, using the parameters of the built-in layers and initializing them

class Net(nn.Module):
    def __init__(self,input,hidden,classes):
        super(Net, self).__init__()
        self.input = input
        self.hidden = hidden
        self.classes = classes
        self.fc1 = nn.Linear(self.input,self.hidden)
        self.fc2 = nn.Linear(self.hidden,self.classes)
        
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.normal_(m.weight,0,0.01)
                nn.init.constant_(m.bias, 0)      
        
    def forward(self,x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        return out
net = Net(input,hidden,classes)
optimizer = torch.optim.SGD(net.parameters(), lr=0.05)


In PyTorch, the default initialization of parameters lives in each layer's reset_parameters() method.

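Let's look at the official implementation of the Linear layer. The listing below matches older PyTorch releases. Newer releases call init.kaiming_uniform_ inside reset_parameters(), which gives the same bounds for the weights: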
class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )
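The default range can be checked empirically. This is a minimal sketch, assuming a layer with in_features=4 so that the bound 1/sqrt(in_features) is exactly 0.5; the small tolerance only guards against floating-point rounding:

```python
import math
import torch.nn as nn

layer = nn.Linear(4, 10)
bound = 1.0 / math.sqrt(layer.in_features)  # 0.5 for in_features=4

# Every default-initialized weight and bias should lie within [-bound, bound].
weight_ok = bool(layer.weight.abs().max() <= bound + 1e-6)
bias_ok = bool(layer.bias.abs().max() <= bound + 1e-6)
print(tuple(layer.weight.shape), weight_ok, bias_ok)  # (10, 4) True True
```

Note that the weight is stored with shape (out_features, in_features), which is why weight.size(1) in reset_parameters() is the input dimension.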

(6) Finally, some examples of retrieving parameter names and values from a network

We use this network as the example:

class Net(nn.Module):
    def __init__(self,input,hidden,classes):
        super(Net, self).__init__()
        self.input = input
        self.hidden = hidden
        self.classes = classes
        self.fc1 = nn.Linear(self.input,self.hidden)
        self.fc2 = nn.Linear(self.hidden,self.classes)
        
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.normal_(m.weight,0,0.01)
                nn.init.constant_(m.bias, 0)      
        
    def forward(self,x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        return out
net = Net(input,hidden,classes)

First, model.state_dict(): a dictionary of parameters whose keys are the parameter names and whose values are the parameter tensors:

for name,value in net.state_dict().items():
    print(name,value)
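For a network with layers named fc1 and fc2, the keys follow the pattern attribute name plus parameter name. A minimal sketch with a hypothetical TinyNet stand-in (the forward method is omitted because only the parameters are inspected):

```python
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical stand-in with the same layer names as Net
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 10)
        self.fc2 = nn.Linear(10, 3)

tiny = TinyNet()
state = tiny.state_dict()
keys = list(state.keys())
tracked = [value.requires_grad for value in state.values()]
print(keys)     # ['fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias']
print(tracked)  # [False, False, False, False]
```

By default the values in state_dict() are detached from autograd, which suits saving and loading weights rather than training.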

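Next, model.parameters(): it returns a generator, which we have already used many times. param.data gives a parameter's value and param.grad gives its gradient (param.grad is None until backward() has been called):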

for param in net.parameters():
    print(param.data,param.grad)
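Run right after the network is constructed, the loop prints None for every gradient. A small sketch on a single stand-in layer shows the gradients appearing only after a backward pass:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 3)

# Before any backward pass, no gradient has been stored.
grads_before = [p.grad is None for p in layer.parameters()]

# One forward and backward pass on random stand-in input populates .grad.
layer(torch.randn(2, 4)).sum().backward()
grad_shapes = [tuple(p.grad.shape) for p in layer.parameters()]

print(grads_before, grad_shapes)  # [True, True] [(3, 4), (3,)]
```

Each gradient has the same shape as the parameter it belongs to.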

Next, model.named_parameters(): it returns a generator of (name, parameter) pairs, so the parameter names are included:

for name,param in net.named_parameters():
    print(name,param)

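Last is self.modules(): it is typically used inside the network's constructor and yields the layers in the network, so that we can initialize the parameters of different layer types, such as nn.Conv2d or nn.Linear, in different ways.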
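What modules() yields can be listed directly. A minimal sketch with a hypothetical TinyNet stand-in:

```python
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical stand-in, for illustration only
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 10)
        self.fc2 = nn.Linear(10, 3)

module_types = [type(m).__name__ for m in TinyNet().modules()]
print(module_types)  # ['TinyNet', 'Linear', 'Linear']
```

The first item is the network itself, which is why initialization loops filter with isinstance(m, nn.Linear) before touching m.weight.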

References:

https://www.cnblogs.com/KaifengGuan/p/12332072.html

https://www.geekschool.org/2020/08/02/13455.html

https://blog.csdn.net/weixin_44058333/article/details/92691656

2. TensorFlow

Import the required packages and load the data:

import tensorflow as tf
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
iris = load_iris()
data=iris.data
target = iris.target

Convert the labels to one-hot encoding:

oneHotEncoder = OneHotEncoder(sparse=False)
onehot_target = oneHotEncoder.fit_transform(target.reshape(-1,1))
print(onehot_target)
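For integer labels that already run from 0 to classes-1, as the iris labels do, the same encoding can be sketched with NumPy alone. The short labels array here is illustrative, not the real target vector:

```python
import numpy as np

labels = np.array([0, 2, 1, 0])   # illustrative integer class labels
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array picks one row per sample.
onehot = np.eye(num_classes, dtype=np.float32)[labels]
print(onehot)
```

Each row contains a single 1 at the position of its class, matching what OneHotEncoder returns for these labels.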

Define the hyperparameters and the trainable parameters:

input=4
hidden=10
classes=3
w0=tf.Variable(tf.random.normal([input,hidden],stddev=0.01,seed=1))
b0=tf.Variable(tf.zeros([hidden]))
w1=tf.Variable(tf.random.normal([hidden,classes],stddev=0.01,seed=1))
b1=tf.Variable(tf.zeros([classes]))

Define the placeholders of the computation graph:

x = tf.placeholder(tf.float32,shape=(None,input),name="x-input") # input data
y_ = tf.placeholder(tf.float32,shape=(None,classes),name="y-input") # ground-truth labels (one-hot)

Define the network, loss function, and optimizer:

def net(x):
    hid = tf.add(tf.matmul(x,w0),b0)
    hid = tf.nn.relu(hid)
    out = tf.add(tf.matmul(hid,w1),b1)
    out = tf.nn.softmax(out)
    return out
y = net(x)  
cross_entropy = -tf.reduce_mean(y_*tf.log(tf.clip_by_value(y,1e-10,1.0)) \
                    + (1-y_)*tf.log(tf.clip_by_value(1-y,1e-10,1.0)))

optimizer=tf.compat.v1.train.GradientDescentOptimizer(learning_rate=0.05).minimize(cross_entropy)
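The loss defined above treats every class entry as its own two-outcome problem and averages over all entries. A NumPy sketch of the same expression, with a hypothetical function name and a hand-picked one-sample example, makes the arithmetic explicit:

```python
import numpy as np

def elementwise_cross_entropy(y_true, y_prob, eps=1e-10):
    """NumPy version of the loss above: two-term cross-entropy averaged over every entry."""
    p = np.clip(y_prob, eps, 1.0)          # guards log(0) for the positive term
    q = np.clip(1.0 - y_prob, eps, 1.0)    # guards log(0) for the negative term
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(q)))

y_true = np.array([[1.0, 0.0, 0.0]])       # one sample, true class 0
y_prob = np.array([[0.5, 0.25, 0.25]])     # softmax output for that sample
loss = elementwise_cross_entropy(y_true, y_prob)

# By hand: the true class contributes log(0.5), each other class log(1 - 0.25).
expected = -(np.log(0.5) + 2 * np.log(0.75)) / 3
print(np.isclose(loss, expected))  # True
```

This differs from the usual categorical cross-entropy, which would keep only the log(0.5) term for the true class.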

Training loop:

epochs = 1000
with tf.compat.v1.Session() as sess: # create a session
    init_op = tf.global_variables_initializer() # initialize the variables
    sess.run(init_op)
    for epoch in range(1,epochs+1): # loop variable no longer shadows the epoch count
        sess.run(optimizer,feed_dict={x:data,y_:onehot_target}) # feed the data to the optimizer
        y_pred = sess.run(y,feed_dict={x:data}) # compute the outputs
        total_cross_entropy = sess.run(cross_entropy,feed_dict={y:y_pred,y_:onehot_target}) # compute the cross-entropy
        pred = tf.argmax(y_pred,axis = 1) # index of the largest value in each row, i.e. the most probable class

        correct = tf.cast(tf.equal(pred,target),dtype=tf.int32) # compare with the true labels and cast bool to int
        correct = tf.reduce_sum(correct) # sum over all elements, since no axis is given
        acc = correct.eval() / data.shape[0]
        print("epoch:{} loss:{:.4f} acc:{:.4f}".format(epoch, total_cross_entropy,acc))

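Training for 1000 epochs felt much slower than with PyTorch, though. A likely contributor is that the loop above adds new ops to the graph on every iteration.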
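Ops such as tf.argmax and tf.equal that are created inside the training loop are added to the graph on every iteration. One way to avoid that is to compute the accuracy with NumPy, sketched here under the assumption that the predictions are already a NumPy array, as returned by sess.run. The function name and sample values are illustrative:

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of rows whose most probable class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 2, 1, 2])

acc = accuracy(probs, labels)
print(acc)  # 0.75
```

In the loop this would replace the four TensorFlow calls with a single accuracy(y_pred, target) call, leaving the graph unchanged between iterations.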
