Algorithm Description
Ridge regression uses the same linear regression model as ordinary least squares but adds a penalty on the \(L2\)-norm of the coefficients to the loss function. This is sometimes known as Tikhonov regularization.
In particular, the ridge model is the same as the Ordinary Least Squares model:
\[\mathbf{y} = \mathbf{X}\mathbf{b} + \epsilon
\]
where \(\epsilon\sim\mathcal{N}(0, \sigma^{2})\), except that the loss for the model is now calculated as:
\[\mathcal{L} = \Vert \mathbf{y}-\mathbf{X}\mathbf{b}\Vert_{2}^{2} + \alpha \Vert\mathbf{b}\Vert_{2}^{2}
\]
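For concreteness, this loss can be evaluated directly in a few lines of NumPy. The following is an illustrative sketch; the name ridge_loss is not part of the implementation below:

import numpy as np

def ridge_loss(b, X, y, alpha=1.0):
    # squared residual norm plus alpha times the squared coefficient norm
    residual = y - X @ b
    return residual @ residual + alpha * (b @ b)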
The estimate of the model parameter \(\mathbf{b}\) that minimizes this loss (equivalently, the MAP estimate under a zero-mean Gaussian prior on \(\mathbf{b}\)) can be computed in closed form via the adjusted normal equation:
\[\hat{\mathbf{b}}_{\text{ridge}} = (\mathbf{X}^{\top}\mathbf{X} + \alpha\mathbf{I})^{-1} \mathbf{X}^{\top}\mathbf{y}
\]
where \((\mathbf{X}^{\top}\mathbf{X} + \alpha\mathbf{I})^{-1}\mathbf{X}^{\top}\) is the pseudo-inverse (Moore-Penrose inverse) of \(\mathbf{X}\) adjusted for the \(L2\) penalty on the model coefficients.
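As a sanity check, setting the gradient of the loss with respect to \(\mathbf{b}\) to zero recovers the adjusted normal equation:
\[\nabla_{\mathbf{b}}\mathcal{L} = -2\mathbf{X}^{\top}(\mathbf{y}-\mathbf{X}\mathbf{b}) + 2\alpha\mathbf{b} = \mathbf{0} \quad\Longrightarrow\quad (\mathbf{X}^{\top}\mathbf{X} + \alpha\mathbf{I})\,\mathbf{b} = \mathbf{X}^{\top}\mathbf{y}
\]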
Code Implementation
import numpy as np

def fit(X, y, fit_intercept=True, alpha=1):
    if fit_intercept:
        # prepend a column of ones so the first coefficient acts as the intercept
        X = np.c_[np.ones(X.shape[0]), X]
    # L2 penalty matrix alpha * I (note: this simple version also penalizes the intercept)
    A = alpha * np.eye(X.shape[1])
    # adjusted pseudo-inverse: (X^T X + alpha * I)^{-1} X^T
    pseudo_inverse = np.linalg.inv(X.T @ X + A) @ X.T
    beta = pseudo_inverse @ y
    return beta
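A minimal usage sketch on synthetic data (the true coefficients, noise level, and seed below are arbitrary choices for illustration):

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = 3.0 + X @ true_beta + rng.normal(scale=0.1, size=100)

beta = fit(X, y, fit_intercept=True, alpha=1)
print(beta)  # first entry approximates the intercept (3.0); the rest approximate true_beta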