Machine Learning in Python 10_06: Ensemble Learning, Boosting, GBDT Classification Implementation

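1. Classification with regression trees

Classification can also be done with regression trees. In short, we train as many groups of regression trees as there are classes, one group per class, and apply softmax to the outputs of all groups to turn them into a probability distribution. A loss function such as cross-entropy or KL divergence then gives the negative gradient for each tree, which guides the next round of training. Taking three-class classification as an example, the procedure is as follows: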

[Figure: flowchart of the training procedure for the three-class example]

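2. Softmax + cross-entropy loss and its gradient

Classification problems usually use cross-entropy as the loss function. The gradient of the softmax + cross-entropy loss is derived below.

The softmax function was already used in the maximum-entropy section; as a reminder: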

\[softmax([y_1^{hat},y_2^{hat},...,y_n^{hat}])=\frac{1}{\sum_{i=1}^n e^{y_i^{hat}}}[e^{y_1^{hat}},e^{y_2^{hat}},...,e^{y_n^{hat}}] \]
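As a rough illustration of the formula above, softmax can be written in a few lines of NumPy. The code in section 3 calls `utils.softmax`, whose implementation is not shown in this article, so the function below is only an assumed stand-in. It subtracts the row maximum before exponentiating, which leaves the result unchanged but avoids overflow for large scores.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax for an array of shape (n_samples, n_classes)."""
    scores = np.asarray(scores, dtype=float)
    # Shifting each row by its maximum does not change the softmax output.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

probs = softmax([[1.0, 2.0, 3.0],
                 [1000.0, 1000.0, 1000.0]])
print(probs)              # one probability distribution per row
print(probs.sum(axis=1))  # each row sums to 1
```

The second row shows why the shift matters: without it, `np.exp(1000.0)` would overflow.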

Cross-entropy was introduced in the logistic regression section:

\[cross\_entropy(y,p)=-\sum_{i=1}^n y_i\log p_i \]

Substituting \(p_i=\frac{e^{y_i^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}\) gives our loss function:

\[L(y^{hat},y)=-\sum_{i=1}^n y_i\log \frac{e^{y_i^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}\\ =-\sum_{i=1}^n y_i(y_i^{hat}-\log\sum_{j=1}^n e^{y_j^{hat}})\\ =\log\sum_{i=1}^n e^{y_i^{hat}}-\sum_{i=1}^ny_iy_i^{hat} \quad \text{(since } y \text{ is one-hot, } \sum_{i=1}^n y_i=1\text{)} \]
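The simplification in the last line can be checked numerically. The sketch below evaluates the loss once through the softmax probabilities and once through the simplified log-sum-exp form, for one sample with made-up scores. The function names are hypothetical and used only for this check.

```python
import numpy as np

def loss_via_probabilities(scores, y_onehot):
    # First line of the derivation: -sum_i y_i * log(softmax(scores)_i)
    probs = np.exp(scores) / np.sum(np.exp(scores))
    return -np.sum(y_onehot * np.log(probs))

def loss_simplified(scores, y_onehot):
    # Last line of the derivation: log(sum_i exp(score_i)) - sum_i y_i * score_i
    return np.log(np.sum(np.exp(scores))) - np.sum(y_onehot * scores)

scores = np.array([0.5, -1.2, 2.0])    # arbitrary raw scores for three classes
y_onehot = np.array([0.0, 1.0, 0.0])   # the true class is the second one

loss_a = loss_via_probabilities(scores, y_onehot)
loss_b = loss_simplified(scores, y_onehot)
print(loss_a, loss_b)  # the two values agree up to floating-point error
```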

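Computing the gradient: differentiating with respect to each component \(y_k^{hat}\) gives \(\frac{e^{y_k^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}-y_k\), so in vector form: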

\[\frac{\partial L(y^{hat},y)}{\partial y^{hat}}=softmax([y_1^{hat},y_2^{hat},...,y_n^{hat}])-[y_1,y_2,...,y_n] \]
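One way to gain confidence in this result is a finite-difference check. The sketch below compares the analytic gradient (softmax minus the one-hot label) with a central-difference approximation of the loss. The function names and the test values are assumptions made for this illustration.

```python
import numpy as np

def loss(scores, y_onehot):
    # Simplified softmax + cross-entropy loss for a single sample.
    return np.log(np.sum(np.exp(scores))) - np.sum(y_onehot * scores)

def analytic_gradient(scores, y_onehot):
    # softmax(scores) - y, as derived above.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum() - y_onehot

def numeric_gradient(scores, y_onehot, eps=1e-6):
    # Central differences, one coordinate at a time.
    grad = np.zeros_like(scores)
    for k in range(len(scores)):
        step = np.zeros_like(scores)
        step[k] = eps
        grad[k] = (loss(scores + step, y_onehot)
                   - loss(scores - step, y_onehot)) / (2 * eps)
    return grad

scores = np.array([0.3, -0.7, 1.5])
y_onehot = np.array([0.0, 0.0, 1.0])
grad_analytic = analytic_gradient(scores, y_onehot)
grad_numeric = numeric_gradient(scores, y_onehot)
print(grad_analytic)
print(grad_numeric)
```

The components of the gradient sum to zero, because both the softmax output and the one-hot label sum to one.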

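Each group of regression trees therefore fits the negative gradient: the fitting target of the first group is \(y_1-\frac{e^{y_1^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}\), that of the second group is \(y_2-\frac{e^{y_2^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}\), ..., and that of the \(n\)-th group is \(y_n-\frac{e^{y_n^{hat}}}{\sum_{j=1}^n e^{y_j^{hat}}}\).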
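To make these targets concrete, the sketch below computes them for two made-up samples in a three-class problem. Column \(k\) of `targets` is what the \(k\)-th group of trees would be asked to fit in the next round. All values here are illustrative.

```python
import numpy as np

# Current accumulated scores: one row per sample, one column per class.
scores = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])      # true classes of the two samples
y_onehot = np.eye(3)[labels]   # one-hot encoding of the labels

# Row-wise softmax of the scores.
shifted = scores - scores.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# Negative gradient: column k is the target for the k-th group of trees.
targets = y_onehot - probs
print(targets)
```

For the second sample all scores are equal, so every class has probability 1/3 and the target is 2/3 for the true class and -1/3 for the other two.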

3. Code implementation

import os
os.chdir('../')
from ml_models.tree import CARTRegressor
from ml_models import utils
import copy
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

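class GradientBoostingClassifier(object):
    def __init__(self, base_estimator=None, n_estimators=10, learning_rate=1.0):
        """
        :param base_estimator: base learner; may be heterogeneous. For heterogeneous learners pass a list such as
                                [estimator1,estimator2,...,estimator10], in which case n_estimators is ignored;
                                in the homogeneous case the single estimator is copied n_estimators times
        :param n_estimators: number of boosting rounds (base learners per class)
        :param learning_rate: learning rate; shrinks the weight of later base learners to reduce overfitting
        """
        self.base_estimator = base_estimator
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        if self.base_estimator is None:
            # Default: a shallow CART regression tree
            self.base_estimator = CARTRegressor(max_depth=2)
        # Homogeneous base learners: copy the single estimator n_estimators times
        if not isinstance(self.base_estimator, list):
            estimator = self.base_estimator
            self.base_estimator = [copy.deepcopy(estimator) for _ in range(0, self.n_estimators)]
        # Heterogeneous base learners: the list length determines n_estimators
        else:
            self.n_estimators = len(self.base_estimator)

        # One group of base learners per class, filled in by fit()
        self.expand_base_estimators = []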

    def fit(self, x, y):
        # One-hot encode y (labels are assumed to be integers 0..class_num-1)
        class_num = np.amax(y) + 1
        y_cate = np.zeros(shape=(len(y), class_num))
        y_cate[np.arange(len(y)), y] = 1

        # Copy the list of base learners once per class
        self.expand_base_estimators = [copy.deepcopy(self.base_estimator) for _ in range(class_num)]

        # Fit the first model of each class directly on the one-hot labels
        y_pred_score_ = []
        # TODO: parallelize
        for class_index in range(0, class_num):
            self.expand_base_estimators[class_index][0].fit(x, y_cate[:, class_index])
            y_pred_score_.append(self.expand_base_estimators[class_index][0].predict(x))
        y_pred_score_ = np.c_[y_pred_score_].T
        # Compute the negative gradient
        new_y = y_cate - utils.softmax(y_pred_score_)
        # Fit the remaining models on the negative gradient
        for index in range(1, self.n_estimators):
            y_pred_score = []
            for class_index in range(0, class_num):
                self.expand_base_estimators[class_index][index].fit(x, new_y[:, class_index])
                y_pred_score.append(self.expand_base_estimators[class_index][index].predict(x))
            y_pred_score_ += np.c_[y_pred_score].T * self.learning_rate
            new_y = y_cate - utils.softmax(y_pred_score_)

    def predict_proba(self, x):
        # TODO: parallelize
        y_pred_score = []
        for class_index in range(0, len(self.expand_base_estimators)):
            estimator_of_index = self.expand_base_estimators[class_index]
            # The first learner has weight 1; every later learner is scaled by
            # learning_rate, matching how the scores are accumulated in fit()
            y_pred_score.append(
                np.sum(
                    [estimator_of_index[0].predict(x)] +
                    [self.learning_rate * estimator_of_index[i].predict(x) for i in
                     range(1, self.n_estimators)]
                    , axis=0)
            )
        return utils.softmax(np.c_[y_pred_score].T)

    def predict(self, x):
        return np.argmax(self.predict_proba(x), axis=1)

# Generate synthetic data
from sklearn.datasets import make_classification
data, target = make_classification(n_samples=100, n_features=2, n_classes=2, n_informative=1, n_redundant=0,
                                   n_repeated=0, n_clusters_per_class=1, class_sep=.5,random_state=21)

# Homogeneous base learners
classifier = GradientBoostingClassifier(base_estimator=CARTRegressor(),n_estimators=10)
classifier.fit(data, target)
utils.plot_decision_function(data, target, classifier)

[Figure: decision boundary of the classifier with homogeneous base learners]

# Heterogeneous base learners
from ml_models.linear_model import LinearRegression
classifier = GradientBoostingClassifier(base_estimator=[LinearRegression(),LinearRegression(),LinearRegression(),CARTRegressor(max_depth=2)])
classifier.fit(data, target)
utils.plot_decision_function(data, target, classifier)

[Figure: decision boundary of the classifier with heterogeneous base learners]
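For readers who do not have the `ml_models` package at hand, the same training loop can be sketched with NumPy alone. This is only an illustrative sketch under several assumptions: `RegressionStump`, `fit_boosting`, `predict` and the toy data are hypothetical names introduced here, the stump is a much simpler learner than `CARTRegressor`, and `softmax` stands in for `utils.softmax`. As in `fit` above, the first round fits the one-hot labels with weight 1 and every later round fits the negative gradient, scaled by the learning rate.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; a stand-in for the article's utils.softmax."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

class RegressionStump:
    """Hypothetical one-split regression tree standing in for CARTRegressor."""

    def fit(self, x, y):
        # Fallback when no split exists: predict the mean everywhere.
        self.feature, self.threshold = 0, np.inf
        self.left_value = self.right_value = y.mean()
        best_error = np.inf
        for feature in range(x.shape[1]):
            # Dropping the largest value keeps the right side non-empty.
            for threshold in np.unique(x[:, feature])[:-1]:
                left = x[:, feature] <= threshold
                left_value, right_value = y[left].mean(), y[~left].mean()
                error = (np.sum((y[left] - left_value) ** 2)
                         + np.sum((y[~left] - right_value) ** 2))
                if error < best_error:
                    best_error = error
                    self.feature, self.threshold = feature, threshold
                    self.left_value, self.right_value = left_value, right_value
        return self

    def predict(self, x):
        go_left = x[:, self.feature] <= self.threshold
        return np.where(go_left, self.left_value, self.right_value)

def fit_boosting(x, y, n_rounds=5, learning_rate=1.0):
    """Return a list of (weight, [one stump per class]) tuples."""
    n_classes = int(np.max(y)) + 1
    y_onehot = np.eye(n_classes)[y]
    scores = np.zeros((len(y), n_classes))
    targets = y_onehot  # round 0 fits the one-hot labels directly
    rounds = []
    for round_index in range(n_rounds):
        weight = 1.0 if round_index == 0 else learning_rate
        stumps = [RegressionStump().fit(x, targets[:, k]) for k in range(n_classes)]
        scores = scores + weight * np.column_stack([s.predict(x) for s in stumps])
        rounds.append((weight, stumps))
        # Negative gradient of softmax + cross-entropy: the next round's targets.
        targets = y_onehot - softmax(scores)
    return rounds

def predict(rounds, x):
    # Sum the weighted outputs of every round, then pick the most likely class.
    scores = sum(weight * np.column_stack([s.predict(x) for s in stumps])
                 for weight, stumps in rounds)
    return np.argmax(softmax(scores), axis=1)

# Toy three-class data in which every class can be isolated by a single split.
x = np.array([[-3.0, -2.0], [-2.5, -2.5], [-2.0, -3.0],   # class 0
              [2.0, -3.0], [2.5, -2.0], [3.0, -2.5],      # class 1
              [-0.5, 2.0], [0.0, 2.5], [0.5, 3.0]])       # class 2
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

rounds = fit_boosting(x, y, n_rounds=5, learning_rate=0.5)
accuracy = np.mean(predict(rounds, x) == y)
print(accuracy)  # training accuracy on the toy data
```

Storing the weight next to each round's stumps keeps prediction consistent with training, since every learner is applied with the same factor that was used when the scores were accumulated.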
