[Convolutional Neural Network Study] (4) Machine Learning

Posted by kiloGrand on 2020-10-27

My first machine learning program, written in a Jupyter notebook

Model: linear function
Algorithm: gradient descent
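
As a rough sketch of what these two lines mean, the model, the mean squared error loss, and the gradient descent update can be written as below. The symbols are notational assumptions rather than something the notebook defines: \eta stands for the learning rate (the `lr` argument passed to the optimizer) and n for the number of samples used in one update (the training loop in this post feeds one sample at a time, so n = 1 there).

```latex
\hat{y} = w x + b, \qquad
L(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2, \qquad
w \leftarrow w - \eta \, \frac{\partial L}{\partial w}, \quad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
```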

import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.read_csv('dataset/Income1.csv') # load the data
data
    Unnamed: 0  Education     Income
 0           1  10.000000  26.658839
 1           2  10.401338  27.306435
 2           3  10.842809  22.132410
 3           4  11.244147  21.169841
 4           5  11.645485  15.192634
 5           6  12.086957  26.398951
 6           7  12.488294  17.435307
 7           8  12.889632  25.507885
 8           9  13.290970  36.884595
 9          10  13.732441  39.666109
10          11  14.133779  34.396281
11          12  14.535117  41.497994
12          13  14.976589  44.981575
13          14  15.377926  47.039595
14          15  15.779264  48.252578
15          16  16.220736  57.034251
16          17  16.622074  51.490919
17          18  17.023411  61.336621
18          19  17.464883  57.581988
19          20  17.866221  68.553714
20          21  18.267559  64.310925
21          22  18.709030  68.959009
22          23  19.110368  74.614639
23          24  19.511706  71.867195
24          25  19.913043  76.098135
25          26  20.354515  75.775218
26          27  20.755853  72.486055
27          28  21.157191  77.355021
28          29  21.598662  72.118790
29          30  22.000000  80.260571
data.info()  # inspect the data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Unnamed: 0  30 non-null     int64  
 1   Education   30 non-null     float64
 2   Income      30 non-null     float64
dtypes: float64(2), int64(1)
memory usage: 848.0 bytes

Draw a scatter plot

plt.scatter(data.Education, data.Income)
plt.xlabel('education')
plt.ylabel('income')
plt.show()

[Figure 1: scatter plot of income against education]

from torch import nn
X = torch.from_numpy(data.Education.values.reshape(-1, 1).astype(np.float32))

This takes the values from data.Education, reshapes them into a single-column matrix, converts the data type to np.float32, and then converts the NumPy array into a torch tensor.
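
The NumPy half of that conversion can be tried on its own. The snippet below is a minimal sketch that uses a three-element array as a hypothetical stand-in for data.Education.values; the variable names are illustrative and not from the notebook.

```python
import numpy as np

# Hypothetical stand-in for data.Education.values: a 1-D float64 array.
education = np.array([10.0, 10.401338, 10.842809])

# -1 asks NumPy to infer the number of rows; 1 fixes a single column.
column = education.reshape(-1, 1)

# Cast to float32 so the values match the dtype of nn.Linear's parameters.
column32 = column.astype(np.float32)

print(education.shape, education.dtype)   # (3,) float64
print(column32.shape, column32.dtype)     # (3, 1) float32
```

The cast matters because nn.Linear stores its weights as float32 by default, so a float64 input would cause a dtype mismatch. torch.from_numpy then wraps the array without copying it, which means the resulting tensor shares memory with the NumPy array.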

X
tensor([[10.0000],
        [10.4013],
        [10.8428],
        [11.2441],
        [11.6455],
        [12.0870],
        [12.4883],
        [12.8896],
        [13.2910],
        [13.7324],
        [14.1338],
        [14.5351],
        [14.9766],
        [15.3779],
        [15.7793],
        [16.2207],
        [16.6221],
        [17.0234],
        [17.4649],
        [17.8662],
        [18.2676],
        [18.7090],
        [19.1104],
        [19.5117],
        [19.9130],
        [20.3545],
        [20.7559],
        [21.1572],
        [21.5987],
        [22.0000]])
X.shape
torch.Size([30, 1])
Y = torch.from_numpy(data.Income.values.reshape(-1, 1).astype(np.float32))
Y
tensor([[26.6588],
        [27.3064],
        [22.1324],
        [21.1698],
        [15.1926],
        [26.3990],
        [17.4353],
        [25.5079],
        [36.8846],
        [39.6661],
        [34.3963],
        [41.4980],
        [44.9816],
        [47.0396],
        [48.2526],
        [57.0343],
        [51.4909],
        [61.3366],
        [57.5820],
        [68.5537],
        [64.3109],
        [68.9590],
        [74.6146],
        [71.8672],
        [76.0981],
        [75.7752],
        [72.4861],
        [77.3550],
        [72.1188],
        [80.2606]])
Y.shape
torch.Size([30, 1])

Build the model

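model = nn.Linear(1, 1)  # output = w @ input + b; input and output both have size 1 (@ is matrix multiplication)
loss_fn = nn.MSELoss()   # mean squared error loss
opt = torch.optim.SGD(model.parameters(), lr=0.0001) # optimizer: stochastic gradient descent
for epoch in range(5000):       # each epoch trains on the full dataset
    for x, y in zip(X, Y):
        y_pred = model(x)       # predict with the model
        loss = loss_fn(y_pred, y)  # compute the loss (prediction first, then target)
        opt.zero_grad()            # reset the gradients to zero
        loss.backward()            # compute the gradients
        opt.step()                 # update the model parameters w and b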
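
To make the loop above more concrete, here is a minimal sketch in plain Python (no PyTorch) of what the backward and step calls amount to for a one-weight linear model with a squared-error loss on a single sample. The function names sgd_epoch and mse, the toy data, and the learning rate of 0.01 are all illustrative assumptions and are not part of the original notebook.

```python
def sgd_epoch(xs, ys, w, b, lr):
    """One pass of per-sample gradient descent on the squared error."""
    for x, y in zip(xs, ys):
        y_pred = w * x + b       # forward pass of the linear model
        err = y_pred - y         # loss = err ** 2 for a single sample
        w -= lr * 2 * err * x    # d(loss)/dw = 2 * err * x
        b -= lr * 2 * err        # d(loss)/db = 2 * err
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line w * x + b over all samples."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy, noise-free data generated from y = 2x + 1 (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
loss_before = mse(xs, ys, w, b)
for _ in range(200):
    w, b = sgd_epoch(xs, ys, w, b, lr=0.01)
loss_after = mse(xs, ys, w, b)

print(loss_before, loss_after)
print(w, b)
```

PyTorch derives the same gradients automatically through loss.backward(), so this hand-written version is only meant to show the arithmetic behind each update.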
model.weight  # view the trained w
Parameter containing:
tensor([[3.4113]], requires_grad=True)
model.bias    # view the trained b
Parameter containing:
tensor([0.4944], requires_grad=True)
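
With these two numbers the fitted line can be evaluated by hand. The sketch below copies the rounded values printed above; the helper name predict_income and the sample input of 16.0 are illustrative assumptions.

```python
# Rounded values copied from the printed model.weight and model.bias.
w = 3.4113
b = 0.4944


def predict_income(education):
    """Apply the fitted line: income = w * education + b."""
    return w * education + b


pred = predict_income(16.0)
print(round(pred, 4))  # 55.0752
```

Calling model(torch.tensor([16.0])) on the trained model should give essentially the same value, up to the rounding of w and b.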

Visualize the results

plt.scatter(data.Education, data.Income)
plt.plot(X.numpy(), model(X).detach().numpy(), c='r')  # .detach() drops gradient tracking; .numpy() converts a tensor to a NumPy array
plt.show()

[Figure 2: the scatter plot with the fitted line drawn in red]
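
Gradient descent with a small learning rate may stop short of the best possible line, so one optional sanity check is to compare the trained parameters with the closed-form least-squares fit. The sketch below is not part of the original notebook: it copies the Education and Income columns from the DataFrame shown earlier and uses np.polyfit, and the helper name mse is an illustrative assumption.

```python
import numpy as np

# Education and Income columns copied from the DataFrame printed earlier.
education = np.array([
    10.000000, 10.401338, 10.842809, 11.244147, 11.645485,
    12.086957, 12.488294, 12.889632, 13.290970, 13.732441,
    14.133779, 14.535117, 14.976589, 15.377926, 15.779264,
    16.220736, 16.622074, 17.023411, 17.464883, 17.866221,
    18.267559, 18.709030, 19.110368, 19.511706, 19.913043,
    20.354515, 20.755853, 21.157191, 21.598662, 22.000000,
])
income = np.array([
    26.658839, 27.306435, 22.132410, 21.169841, 15.192634,
    26.398951, 17.435307, 25.507885, 36.884595, 39.666109,
    34.396281, 41.497994, 44.981575, 47.039595, 48.252578,
    57.034251, 51.490919, 61.336621, 57.581988, 68.553714,
    64.310925, 68.959009, 74.614639, 71.867195, 76.098135,
    75.775218, 72.486055, 77.355021, 72.118790, 80.260571,
])

# A degree-1 polyfit returns [slope, intercept] of the least-squares line.
slope, intercept = np.polyfit(education, income, 1)


def mse(w, b):
    """Mean squared error of the line w * x + b on the dataset."""
    return float(np.mean((w * education + b - income) ** 2))


print(f"least squares: w = {slope:.4f}, b = {intercept:.4f}")
print("MSE of least-squares line:", mse(slope, intercept))
print("MSE of trained model:     ", mse(3.4113, 0.4944))
```

The least-squares line minimizes the mean squared error, so its MSE can never be larger than that of the trained model. If the two sets of parameters differ noticeably, common things to try are more epochs, a larger learning rate, or standardizing the inputs before training.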
