Implementing Linear Regression in Python: Simple Regression

Posted by 西西嘛呦 on 2020-04-29

Code source: https://github.com/eriklindernoren/ML-From-Scratch

First, define a basic regression class that serves as the base class for the various regression methods:

import math
import numpy as np

class Regression(object):
    """ Base regression model. Models the relationship between a scalar dependent variable y and the independent 
    variables X. 
    Parameters:
    -----------
    n_iterations: int
        The number of training iterations the algorithm will tune the weights for.
    learning_rate: float
        The step length that will be used when updating the weights.
    """
    def __init__(self, n_iterations, learning_rate):
        self.n_iterations = n_iterations
        self.learning_rate = learning_rate

    def initialize_weights(self, n_features):
        """ Initialize weights uniformly in [-1/sqrt(N), 1/sqrt(N)] """
        limit = 1 / math.sqrt(n_features)
        self.w = np.random.uniform(-limit, limit, (n_features, ))

    def fit(self, X, y):
        # Insert constant ones for bias weights
        X = np.insert(X, 0, 1, axis=1)
        self.training_errors = []
        self.initialize_weights(n_features=X.shape[1])

        # Do gradient descent for n_iterations
        for i in range(self.n_iterations):
            y_pred = X.dot(self.w)
            # Calculate l2 loss
            mse = np.mean(0.5 * (y - y_pred)**2 + self.regularization(self.w))
            self.training_errors.append(mse)
            # Gradient of l2 loss w.r.t w
            grad_w = -(y - y_pred).dot(X) + self.regularization.grad(self.w)
            # Update the weights
            self.w -= self.learning_rate * grad_w

    def predict(self, X):
        # Insert constant ones for bias weights
        X = np.insert(X, 0, 1, axis=1)
        y_pred = X.dot(self.w)
        return y_pred

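Notes: the constructor takes two arguments, the number of iterations and the learning rate. initialize_weights() initializes the weights. fit() runs the training loop; note that a constant column of ones is prepended to the original input X so that the first weight acts as the bias term. The loop also calls self.regularization, which the base class itself does not define, so every subclass has to set it. predict() returns the predicted values.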
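To make the bias handling concrete, here is a minimal sketch of what fit() and predict() do to the input before multiplying by the weights. The toy matrix X and the fixed random seed are assumptions made only for this illustration; they are not part of the original code.

```python
import math
import numpy as np

# Hypothetical toy input: 3 samples, 2 features
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Prepend a column of ones so that w[0] plays the role of the bias
X_b = np.insert(X, 0, 1, axis=1)
print(X_b.shape)  # (3, 3)

# Same rule as initialize_weights(): uniform in [-1/sqrt(n), 1/sqrt(n)],
# where n counts the bias column as well
n_features = X_b.shape[1]
limit = 1 / math.sqrt(n_features)
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
w = rng.uniform(-limit, limit, (n_features,))

# One prediction per sample; equivalent to w[0] + X.dot(w[1:])
y_pred = X_b.dot(w)
```

Because the ones column is added inside both fit() and predict(), callers pass the raw X in both cases and never add the bias column themselves.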

Next is simple linear regression, which inherits from the base class above:

class LinearRegression(Regression):
    """Linear model.
    Parameters:
    -----------
    n_iterations: int
        The number of training iterations the algorithm will tune the weights for.
    learning_rate: float
        The step length that will be used when updating the weights.
    gradient_descent: boolean
        True or false depending if gradient descent should be used when training. If 
        false then we use batch optimization by least squares.
    """
    def __init__(self, n_iterations=100, learning_rate=0.001, gradient_descent=True):
        self.gradient_descent = gradient_descent
        # No regularization
        self.regularization = lambda x: 0
        self.regularization.grad = lambda x: 0
        super(LinearRegression, self).__init__(n_iterations=n_iterations,
                                            learning_rate=learning_rate)

    def fit(self, X, y):
        # If not gradient descent => Least squares approximation of w
        if not self.gradient_descent:
            # Insert constant ones for bias weights
            X = np.insert(X, 0, 1, axis=1)
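            # Calculate weights by least squares (using Moore-Penrose pseudoinverse)
            # np.linalg.svd returns (U, S, Vh) with X.T.dot(X) = U.dot(diag(S)).dot(Vh),
            # so the pseudoinverse is Vh.T.dot(pinv(diag(S))).dot(U.T)
            U, S, Vh = np.linalg.svd(X.T.dot(X))
            S = np.diag(S)
            X_sq_reg_inv = Vh.T.dot(np.linalg.pinv(S)).dot(U.T)
            self.w = X_sq_reg_inv.dot(X.T).dot(y)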
        else:
            super(LinearRegression, self).fit(X, y)

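Two ways of computing the weights are supported here. With gradient_descent=True, the model is trained by the gradient descent loop inherited from the base class (every update uses the whole training set, so this is batch gradient descent rather than stochastic gradient descent); otherwise the weights are computed in closed form with the normal equation.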
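As a rough check that the two paths agree, the sketch below fits the same synthetic data both ways without depending on the mlfromscratch package. The function names, the zero initialization (the article's class uses a random uniform start), the data, and the hyperparameters are all assumptions chosen for the illustration.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Closed-form least squares: w = pinv(X^T X) X^T y."""
    X = np.insert(X, 0, 1, axis=1)  # bias column
    return np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)

def fit_gradient_descent(X, y, n_iterations=2000, learning_rate=0.01):
    """Full-batch gradient descent using the same summed gradient as Regression.fit()."""
    X = np.insert(X, 0, 1, axis=1)  # bias column
    w = np.zeros(X.shape[1])        # simplification: start from zeros
    for _ in range(n_iterations):
        grad_w = -(y - X.dot(w)).dot(X)
        w -= learning_rate * grad_w
    return w

# Hypothetical data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=50)

w_ne = fit_normal_equation(X, y)
w_gd = fit_gradient_descent(X, y)
print(w_ne, w_gd)  # both close to [2, 3] (bias, slope)
```

Note that the gradient is summed over samples rather than averaged, so a learning rate that works for 50 samples may diverge on a much larger training set; the closed-form path has no such tuning knob but has to decompose X^T X.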

Finally, usage:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
import sys
sys.path.append("/content/drive/My Drive/learn/ML-From-Scratch/")

from mlfromscratch.utils import train_test_split, polynomial_features
from mlfromscratch.utils import mean_squared_error, Plot
from mlfromscratch.supervised_learning import LinearRegression

def main():

    X, y = make_regression(n_samples=100, n_features=1, noise=20)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)

    n_samples, n_features = np.shape(X)

    model = LinearRegression(n_iterations=100)

    model.fit(X_train, y_train)
    
    # Training error plot
    n = len(model.training_errors)
    training, = plt.plot(range(n), model.training_errors, label="Training Error")
    plt.legend(handles=[training])
    plt.title("Error Plot")
    plt.ylabel('Mean Squared Error')
    plt.xlabel('Iterations')
    plt.savefig("test1.png")
    plt.show()

    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print ("Mean squared error: %s" % (mse))

    y_pred_line = model.predict(X)

    # Color map
    cmap = plt.get_cmap('viridis')

    # Plot the results
    m1 = plt.scatter(366 * X_train, y_train, color=cmap(0.9), s=10)
    m2 = plt.scatter(366 * X_test, y_test, color=cmap(0.5), s=10)
    plt.plot(366 * X, y_pred_line, color='black', linewidth=2, label="Prediction")
    plt.suptitle("Linear Regression")
    plt.title("MSE: %.2f" % mse, fontsize=10)
    plt.xlabel('Day')
    plt.ylabel('Temperature in Celsius')
    plt.legend((m1, m2), ("Training data", "Test data"), loc='lower right')
    plt.savefig("test2.png")
    plt.show()

if __name__ == "__main__":
    main()

The regression data is generated with sklearn's make_regression() and then split into a training set and a test set.
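For readers without the mlfromscratch utilities on their path, the generate-then-split step can be approximated with NumPy alone. This is only a sketch: simple_split is a hypothetical helper, not the library's train_test_split, and the slope of 40 is an arbitrary stand-in for the coefficient that make_regression would pick at random.

```python
import numpy as np

def simple_split(X, y, test_size=0.4, seed=0):
    """Hypothetical helper: shuffle row indices, then hold out a test_size fraction."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_size)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Stand-in for make_regression(n_samples=100, n_features=1, noise=20)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 40.0 * X[:, 0] + rng.normal(scale=20.0, size=100)

X_train, X_test, y_train, y_test = simple_split(X, y, test_size=0.4)
print(X_train.shape, X_test.shape)  # (60, 1) (40, 1)
```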

mean_squared_error() from utils:

def mean_squared_error(y_true, y_pred):
    """ Returns the mean squared error between y_true and y_pred """
    mse = np.mean(np.power(y_true - y_pred, 2))
    return mse

Result:

Mean squared error: 532.3321383700828
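The exact figure changes from run to run, because make_regression() is called without a fixed random_state. To see what mean_squared_error() computes, here is a small worked example on made-up numbers; the two arrays are assumptions chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """ Returns the mean squared error between y_true and y_pred """
    return np.mean(np.power(y_true - y_pred, 2))

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Errors: 0.5, -0.5, 0.0, -1.0 -> squares: 0.25, 0.25, 0.0, 1.0 -> mean: 0.375
mse = mean_squared_error(y_true, y_pred)
print(mse)  # 0.375
```

The values stored in training_errors during fit() include a factor of 0.5, so with no regularization they are half of what this function would report on the same predictions.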
