《機器學習_05_線性模型_最大熵模型》

努力的番茄發表於2020-05-18

import numpy as np
import os
os.chdir('../')
import matplotlib.pyplot as plt
%matplotlib inline

一.最大熵原理

最大熵的思想很樸素，即將已知事實以外的未知部分看做“等可能”的，而熵是描述“等可能”大小很合適的量化指標，熵的公式如下：

\[H(p)=-\sum_{i}p_i log p_i \]

這裡分佈\(p\)的取值有\(i\)種情況，每種情況的概率為\(p_i\)，下圖繪製了二值隨機變數的熵：

p=np.linspace(0.1,0.9,90)

def entropy(p):
    return -np.log(p)*p-np.log(1-p)*(1-p)

plt.plot(p,entropy(p))

[<matplotlib.lines.Line2D at 0x245a3d6d278>]

png

當兩者概率均為0.5時，熵取得最大值，通過最大化熵，可以使得分佈更“等可能”；另外，熵還有優秀的性質，它是一個凹函式，所以最大化熵其實是一個凸問題。

對於“已知事實”，可以用約束條件來描述，比如4個值的隨機變數分佈，其中已知\(p_1+p_2=0.4\)，它的求解可以表述如下：

\[\max_{p} -\sum_{i=1}^4 p_ilogp_i \\ s.t. p_1+p_2=0.4\\ p_i\geq 0,i=1,2,3,4\\ \sum_i p_i=1 \]

顯然，最優解為：\(p_1=0.2,p_2=0.2,p_3=0.3,p_4=0.3\)

二.最大熵模型

最大熵模型是最大熵原理在分類問題上的應用，它假設分類模型是一個條件概率分佈\(P(Y|X)\)，即對於給定的輸入\(X\)，以概率\(P(Y|X)\)輸出\(Y\)，這時最大熵模型的目標函式定義為條件熵：

\[H(P)=-\sum_{x,y}\tilde{P}(x)P(y|x)logP(y|x) \]

這裡，\(\tilde{P}(x)\)表示邊緣分佈\(P(X)\)的經驗分佈，\(\tilde{P}(x)=\frac{v(X=x)}{N}\)，\(v(X=x)\)表示訓練樣本中輸入\(x\)出現的次數，\(N\)表示訓練樣本的總數。

而最大熵模型的“已知事實”可以通過如下等式來約束：

\[\sum_{x,y}\tilde{P}(x)P(y|x)f(x,y)=\sum_{x,y}\tilde{P}(x,y)f(x,y) \]

為了方便，左邊式子記著\(E_P(f)\)，右邊式子記著\(E_{\tilde{P}}(f)\)，等式描述的是某函式\(f(x,y)\)關於模型\(P(Y|X)\)與經驗分佈\(\tilde{P}(X)\)的期望與函式\(f(x,y)\)關於經驗分佈\(\tilde{P}(X,Y)\)的期望相同。(這裡\(\tilde{P}(x,y)=\frac{v(X=x,Y=y)}{N}\))
所以重要的約束資訊將由\(f(x,y)\)來表示，它的定義如下：

\[f(x,y)=\left\{\begin{matrix} 1 & x與y滿足某一事實\\ 0 & 否則 \end{matrix}\right. \]

故最大熵模型可以理解為，模型在某些事實發生的期望和訓練集相同的條件下，使得條件熵最大化。所以，對於有\(n\)個約束條件的最大熵模型可以表示為：

\[\max_P -\sum_{x,y}\tilde{P}(x)P(y|x)logP(y|x) \\ s.t. E_P(f_i)=E_{\tilde{P}}(f_i),i=1,2,...,n\\ \sum_y P(y|x)=1 \]

按照優化問題的習慣，可以改寫為如下：

\[\min_P \sum_{x,y}\tilde{P}(x)P(y|x)logP(y|x) \\ s.t. E_P(f_i)-E_{\tilde{P}}(f_i)=0,i=1,2,...,n\\ \sum_y P(y|x)-1=0 \]

由於目標函式為凸函式，約束條件為仿射，所以我們可以通過求解對偶問題，得到原始問題的最優解，首先引入拉格朗日乘子\(w_0,w_1,...,w_n\)，定義拉格朗日函式\(L(P,w)\)：

\[L(P,w)=-H(P)+w_0(1-\sum_yP(y|x)+\sum_{i=1}^nw_i(E_{\tilde{P}}(f_i))-E_P(f_i)) \]

所以原問題等價於：

\[\min_P\max_w L(P,w) \]

它的對偶問題：

\[\max_w\min_P L(P,w) \]

首先，解裡面的 \(\min_P L(P,w)\)，其實對於\(\forall w\)，\(L(P,w)\)都是關於\(P\)的凸函式，因為\(-H(P)\)是關於\(P\)的凸函式，而後面的\(w_0(1-\sum_yP(y|x)+\sum_{i=1}^nw_i(E_{\tilde{P}}(f_i))-E_P(f_i))\)是關於\(P(y|x)\)的仿射函式，所以求\(L(P,w)\)對\(P\)的偏導數，並令其等於0，即可解得最優的\(P(y|x)\),記為\(P_w(y|x)\)，即：

\[\frac{\partial L(P,w)}{\partial P(y|x)}=\sum_{x,y}\tilde{P}(x)(logP(y|x)+1)-\sum_yw_0+\sum_{i=1}^n\sum_{x,y}\tilde{P}(x)f_i(x,y)w_i\\ =\sum_{x,y}\tilde{P}(x)(logP(y|x)+1-w_0-\sum_{i=1}^nw_if_i(x,y))\\ =0 \]

在訓練集中對任意樣本\(\forall x,y\)，都有\(\tilde{P}(x)(logP(y|x)+1-w_0-\sum_{i=1}^nw_if_i(x,y))=0\)，顯然\(\tilde{P}(x)>0\)(\(x\)本來就是訓練集中的一個樣本，自然概率大於0)，所以\(logP(y|x)+1-w_0-\sum_{i=1}^nw_if_i(x,y)=0\)，所以：

\[P_w(y|x)=exp(\sum_{i=1}^nw_if_i(x,y)+w_0-1)\\ =\frac{exp(\sum_{i=1}^nw_if_i(x,y))}{exp(1-w_0)}\\ =\frac{exp(\sum_{i=1}^nw_if_i(x,y))}{\sum_y exp(\sum_{i=1}^nw_if_i(x,y))} \]

這就是最大熵模型的表示式（最後一步變換用到了\(\sum_y P(y|x)=1\)），這裡\(w\)即是模型的引數，聰明的童鞋其實已經發現，最大熵模型其實就是一個線性函式外面套了一個softmax函式，它大概就是如下圖所示的這麼回事：

接下來，將\(L(P_w,w)\)帶入外層的\(max\)函式，即可求解最優的引數\(w^*\)：

\[w^*=arg\max_w L(P_w,w) \]

推導一下模型的梯度更新公式：

\[L(P_w,w)=\sum_{x,y}\tilde{P}(x)P_w(y|x)logP_w(y|x)+\sum_{i=1}^nw_i\sum_{x,y}(\tilde{P}(x,y)f_i(x,y)-\tilde{P}(x)P_w(y|x)f_i(x,y))\\ =\sum_{x,y}\tilde{P}(x,y)\sum_{i=1}^nw_if_i(x,y)+\sum_{x,y}\tilde{P}(x)P_w(y|x)(logP_w(y|x)-\sum_{i=1}^nw_if_i(x,y))\\ =\sum_{x,y}\tilde{P}(x,y)\sum_{i=1}^nw_if_i(x,y)-\sum_{x,y}\tilde{P}(x)P_w(y|x)log(\sum_{y^{'}}exp(\sum_{i=1}^nw_if_i(x,y^{'})))\\ =\sum_{x,y}\tilde{P}(x,y)\sum_{i=1}^nw_if_i(x,y)-\sum_{x}\tilde{P}(x)log(\sum_{y^{'}}exp(\sum_{i=1}^nw_if_i(x,y^{'})))\\ =\sum_{x,y}\tilde{P}(x,y)w^Tf(x,y)-\sum_{x}\tilde{P}(x)log(\sum_{y^{'}}exp(w^Tf(x,y^{'}))) \]

這裡，倒數第三步到倒數第二步用到了\(\sum_yP(y|x)=1\)，最後一步中\(w=[w_1,w_2,...,w_n]^T,f(x,y)=[f_1(x,y),f_2(x,y),...,f_n(x,y)]^T\)，所以：

\[\frac{\partial L(P_w,w)}{\partial w}=\sum_{x,y}\tilde{P}(x,y)f(x,y)-\sum_x\tilde{P}(x)\frac{exp(w^Tf(x,y))f(x,y)}{\sum_{y^{'}}exp(w^Tf(x,y^{'}))} \]

所以，自然\(w\)的更新公式：

\[w=w+\eta\frac{\partial L(P_w,w)}{\partial w} \]

這裡，\(\eta\)是學習率

三.對特徵函式的進一步理解

上面推匯出了最大熵模型的梯度更新公式，想必大家對\(f(x,y)\)還是有點疑惑，“滿足某一事實”這句話該如何理解？這其實與我們的學習目的相關，學習目的決定了我們的“事實”，比如有這樣一個任務，判斷“打”這個詞是量詞還是動詞，我們收集了如下的語料：

句子/\(x\)	目標/\(y\)
\(x_1:\)一打火柴	\(y_1:\)量詞
\(x_2:\)三打啤酒	\(y_2:\)量詞
\(x_3:\)打電話	\(y_3:\) 動詞
\(x_4:\)打籃球	\(y_4:\) 動詞

通過觀察，我們可以設計如下的兩個特徵函式來分別識別"量詞"和"動詞"任務：

\[f_1(x,y)=\left\{\begin{matrix} 1 & "打"前是數字\\ 0 & 否則 \end{matrix}\right. \]

\[f_2(x,y)=\left\{\begin{matrix} 1 & "打"後是名詞，且前面無數字\\ 0 & 否則 \end{matrix}\right. \]

當然，你也可以設計這樣的特徵函式來做識別“量詞”的任務：

\[f(x,y)=\left\{\begin{matrix} 1 & "打"前是"一","打"後是"火柴"\\ 0 & 否則 \end{matrix}\right. \]

\[f(x,y)=\left\{\begin{matrix} 1 & "打"前是"三","打"後是"啤酒"\\ 0 & 否則 \end{matrix}\right. \]

只是，這樣的特徵函式設計會使得模型學習能力變弱，比如遇到“三打火柴”，採用後面的特徵函式設計就識別不出“打”是量詞，而採用第一種特徵函式設計就能很好的識別出來，所以要使模型具有更好的泛化能力，就需要設計更好的特徵函式，而這往往依賴於人工經驗，對於自然語言處理這類任務（比如上面的例子），我們可以較容易的歸納總結出一些有用的經驗知識，但是對於其他情況，人工往往難以總結出一般性的規律，所以對於這些問題，我們需要設計更“一般”的特徵函式。

一種簡單的特徵函式設計

我們可以簡單考慮\(x\)的某個特徵取某個值和\(y\)取某個類的組合做特徵函式（對於連續型特徵，可以採用分箱操作），所以我們可以設計這樣兩類特徵函式：

（1）離散型：

\[f(x,y)=\left\{\begin{matrix} 1 & x_i=某值,y=某類\\ 0 & 否則 \end{matrix}\right. \]

（2）連續型：

\[f(x,y)=\left\{\begin{matrix} 1 & 某值1\leq x_i< 某值2,y=某類\\ 0 & 否則 \end{matrix}\right. \]

四.程式碼實現

為了方便演示，首先構建訓練資料和測試資料

# 測試
from sklearn import datasets
from sklearn import model_selection
from sklearn.metrics import f1_score

iris = datasets.load_iris()
data = iris['data']
target = iris['target']
X_train, X_test, y_train, y_test = model_selection.train_test_split(data, target, test_size=0.2,random_state=0)

為了方便對資料進行分箱操作，封裝一個DataBinWrapper類，並對X_train和X_test進行轉換（該類放到ml_models.wrapper_models中）

class DataBinWrapper(object):
    def __init__(self, max_bins=10):
        # 分段數
        self.max_bins = max_bins
        # 記錄x各個特徵的分段區間
        self.XrangeMap = None

    def fit(self, x):
        n_sample, n_feature = x.shape
        # 構建分段資料
        self.XrangeMap = [[] for _ in range(0, n_feature)]
        for index in range(0, n_feature):
            tmp = x[:, index]
            for percent in range(1, self.max_bins):
                percent_value = np.percentile(tmp, (1.0 * percent / self.max_bins) * 100.0 // 1)
                self.XrangeMap[index].append(percent_value)

    def transform(self, x):
        """
        抽取x_bin_index
        :param x:
        :return:
        """
        if x.ndim == 1:
            return np.asarray([np.digitize(x[i], self.XrangeMap[i]) for i in range(0, x.size)])
        else:
            return np.asarray([np.digitize(x[:, i], self.XrangeMap[i]) for i in range(0, x.shape[1])]).T

data_bin_wrapper=DataBinWrapper(max_bins=10)
data_bin_wrapper.fit(X_train)
X_train=data_bin_wrapper.transform(X_train)
X_test=data_bin_wrapper.transform(X_test)

X_train[:5,:]

array([[7, 6, 8, 7],
       [3, 5, 5, 6],
       [2, 8, 2, 2],
       [6, 5, 6, 7],
       [7, 2, 8, 8]], dtype=int64)

X_test[:5,:]

array([[5, 2, 7, 9],
       [5, 0, 4, 3],
       [3, 9, 1, 2],
       [9, 3, 9, 7],
       [1, 8, 2, 2]], dtype=int64)

由於特徵函式可以有不同的形式，這裡我們將特徵函式解耦出來，構造一個SimpleFeatureFunction類（後續構造其他複雜的特徵函式，需要定義和該類相同的函式名，該類放置到ml_models.linear_model中）

class SimpleFeatureFunction(object):
    def __init__(self):
        """
        記錄特徵函式
        {
            (x_index,x_value,y_index)
        }
        """
        self.feature_funcs = set()

    # 構建特徵函式
    def build_feature_funcs(self, X, y):
        n_sample, _ = X.shape
        for index in range(0, n_sample):
            x = X[index, :].tolist()
            for feature_index in range(0, len(x)):
                self.feature_funcs.add(tuple([feature_index, x[feature_index], y[index]]))

    # 獲取特徵函式總數
    def get_feature_funcs_num(self):
        return len(self.feature_funcs)

    # 分別命中了那幾個特徵函式
    def match_feature_funcs_indices(self, x, y):
        match_indices = []
        index = 0
        for feature_index, feature_value, feature_y in self.feature_funcs:
            if feature_y == y and x[feature_index] == feature_value:
                match_indices.append(index)
            index += 1
        return match_indices

接下來對MaxEnt類進行實現，首先實現一個softmax函式的功能(ml_models.utils)

def softmax(x):
    if x.ndim == 1:
        return np.exp(x) / np.exp(x).sum()
    else:
        return np.exp(x) / np.exp(x).sum(axis=1, keepdims=True)

進行MaxEnt類的具體實現（ml_models.linear_model）

from ml_models import utils
class MaxEnt(object):
    def __init__(self, feature_func, epochs=5, eta=0.01):
        self.feature_func = feature_func
        self.epochs = epochs
        self.eta = eta

        self.class_num = None
        """
        記錄聯合概率分佈:
        {
            (x_0,x_1,...,x_p,y_index):p
        }
        """
        self.Pxy = {}
        """
        記錄邊緣概率分佈:
        {
            (x_0,x_1,...,x_p):p
        }
        """
        self.Px = {}

        """
        w[i]-->feature_func[i]
        """
        self.w = None

    def init_params(self, X, y):
        """
        初始化相應的資料
        :return:
        """
        n_sample, n_feature = X.shape
        self.class_num = np.max(y) + 1

        # 初始化聯合概率分佈、邊緣概率分佈、特徵函式
        for index in range(0, n_sample):
            range_indices = X[index, :].tolist()

            if self.Px.get(tuple(range_indices)) is None:
                self.Px[tuple(range_indices)] = 1
            else:
                self.Px[tuple(range_indices)] += 1

            if self.Pxy.get(tuple(range_indices + [y[index]])) is None:
                self.Pxy[tuple(range_indices + [y[index]])] = 1
            else:
                self.Pxy[tuple(range_indices + [y[index]])] = 1

        for key, value in self.Pxy.items():
            self.Pxy[key] = 1.0 * self.Pxy[key] / n_sample
        for key, value in self.Px.items():
            self.Px[key] = 1.0 * self.Px[key] / n_sample

        # 初始化引數權重
        self.w = np.zeros(self.feature_func.get_feature_funcs_num())

    def _sum_exp_w_on_all_y(self, x):
        """
        sum_y exp(self._sum_w_on_feature_funcs(x))
        :param x:
        :return:
        """
        sum_w = 0
        for y in range(0, self.class_num):
            tmp_w = self._sum_exp_w_on_y(x, y)
            sum_w += np.exp(tmp_w)
        return sum_w

    def _sum_exp_w_on_y(self, x, y):
        tmp_w = 0
        match_feature_func_indices = self.feature_func.match_feature_funcs_indices(x, y)
        for match_feature_func_index in match_feature_func_indices:
            tmp_w += self.w[match_feature_func_index]
        return tmp_w

    def fit(self, X, y):
        self.eta = max(1.0 / np.sqrt(X.shape[0]), self.eta)
        self.init_params(X, y)
        x_y = np.c_[X, y]
        for epoch in range(self.epochs):
            count = 0
            np.random.shuffle(x_y)
            for index in range(x_y.shape[0]):
                count += 1
                x_point = x_y[index, :-1]
                y_point = x_y[index, -1:][0]
                # 獲取聯合概率分佈
                p_xy = self.Pxy.get(tuple(x_point.tolist() + [y_point]))
                # 獲取邊緣概率分佈
                p_x = self.Px.get(tuple(x_point))
                # 更新w
                dw = np.zeros(shape=self.w.shape)
                match_feature_func_indices = self.feature_func.match_feature_funcs_indices(x_point, y_point)
                if len(match_feature_func_indices) == 0:
                    continue
                if p_xy is not None:
                    for match_feature_func_index in match_feature_func_indices:
                        dw[match_feature_func_index] = p_xy
                if p_x is not None:
                    sum_w = self._sum_exp_w_on_all_y(x_point)
                    for match_feature_func_index in match_feature_func_indices:
                        dw[match_feature_func_index] -= p_x * np.exp(self._sum_exp_w_on_y(x_point, y_point)) / (
                            1e-7 + sum_w)
                # 更新
                self.w += self.eta * dw
                # 列印訓練進度
                if count % (X.shape[0] // 4) == 0:
                    print("processing:\tepoch:" + str(epoch + 1) + "/" + str(self.epochs) + ",percent:" + str(
                        count) + "/" + str(X.shape[0]))

    def predict_proba(self, x):
        """
        預測為y的概率分佈
        :param x:
        :return:
        """
        y = []
        for x_point in x:
            y_tmp = []
            for y_index in range(0, self.class_num):
                match_feature_func_indices = self.feature_func.match_feature_funcs_indices(x_point, y_index)
                tmp = 0
                for match_feature_func_index in match_feature_func_indices:
                    tmp += self.w[match_feature_func_index]
                y_tmp.append(tmp)
            y.append(y_tmp)
        return utils.softmax(np.asarray(y))

    def predict(self, x):
        return np.argmax(self.predict_proba(x), axis=1)

# 構建特徵函式類
feature_func=SimpleFeatureFunction()
feature_func.build_feature_funcs(X_train,y_train)

maxEnt = MaxEnt(feature_func=feature_func)
maxEnt.fit(X_train, y_train)
y = maxEnt.predict(X_test)

print('f1:', f1_score(y_test, y, average='macro'))

processing:	epoch:1/5,percent:30/120
processing:	epoch:1/5,percent:60/120
processing:	epoch:1/5,percent:90/120
processing:	epoch:1/5,percent:120/120
processing:	epoch:2/5,percent:30/120
processing:	epoch:2/5,percent:60/120
processing:	epoch:2/5,percent:90/120
processing:	epoch:2/5,percent:120/120
processing:	epoch:3/5,percent:30/120
processing:	epoch:3/5,percent:60/120
processing:	epoch:3/5,percent:90/120
processing:	epoch:3/5,percent:120/120
processing:	epoch:4/5,percent:30/120
processing:	epoch:4/5,percent:60/120
processing:	epoch:4/5,percent:90/120
processing:	epoch:4/5,percent:120/120
processing:	epoch:5/5,percent:30/120
processing:	epoch:5/5,percent:60/120
processing:	epoch:5/5,percent:90/120
processing:	epoch:5/5,percent:120/120
f1: 0.9295631904327557

通過前面的分析，我們知道特徵函式的複雜程度決定了模型的複雜度，下面我們新增更復雜的特徵函式來增強MaxEnt的效果，上面的特徵函式僅考慮了單個特徵與目標的關係，我們進一步考慮二個特徵與目標的關係，即：

\[f(x,y)=\left\{\begin{matrix} 1 & x_i=某值,x_j=某值,y=某類\\ 0 & 否則 \end{matrix}\right. \]

如此，我們可以定義一個新的UserDefineFeatureFunction類（注意:類中的方法名稱要和SimpleFeatureFunction一樣）

class UserDefineFeatureFunction(object):
    def __init__(self):
        """
        記錄特徵函式
        {
            (x_index1,x_value1,x_index2,x_value2,y_index)
        }
        """
        self.feature_funcs = set()

    # 構建特徵函式
    def build_feature_funcs(self, X, y):
        n_sample, _ = X.shape
        for index in range(0, n_sample):
            x = X[index, :].tolist()
            for feature_index in range(0, len(x)):
                self.feature_funcs.add(tuple([feature_index, x[feature_index], y[index]]))
                for new_feature_index in range(0,len(x)):
                    if feature_index!=new_feature_index:
                        self.feature_funcs.add(tuple([feature_index, x[feature_index],new_feature_index,x[new_feature_index],y[index]]))

    # 獲取特徵函式總數
    def get_feature_funcs_num(self):
        return len(self.feature_funcs)

    # 分別命中了那幾個特徵函式
    def match_feature_funcs_indices(self, x, y):
        match_indices = []
        index = 0
        for item in self.feature_funcs:
            if len(item)==5:
                feature_index1, feature_value1,feature_index2,feature_value2, feature_y=item
                if feature_y == y and x[feature_index1] == feature_value1 and x[feature_index2]==feature_value2:
                    match_indices.append(index)
            else:
                feature_index1, feature_value1, feature_y=item
                if feature_y == y and x[feature_index1] == feature_value1:
                    match_indices.append(index)
            index += 1
        return match_indices

# 檢驗
feature_func=UserDefineFeatureFunction()
feature_func.build_feature_funcs(X_train,y_train)

maxEnt = MaxEnt(feature_func=feature_func)
maxEnt.fit(X_train, y_train)
y = maxEnt.predict(X_test)

print('f1:', f1_score(y_test, y, average='macro'))

processing:	epoch:1/5,percent:30/120
processing:	epoch:1/5,percent:60/120
processing:	epoch:1/5,percent:90/120
processing:	epoch:1/5,percent:120/120
processing:	epoch:2/5,percent:30/120
processing:	epoch:2/5,percent:60/120
processing:	epoch:2/5,percent:90/120
processing:	epoch:2/5,percent:120/120
processing:	epoch:3/5,percent:30/120
processing:	epoch:3/5,percent:60/120
processing:	epoch:3/5,percent:90/120
processing:	epoch:3/5,percent:120/120
processing:	epoch:4/5,percent:30/120
processing:	epoch:4/5,percent:60/120
processing:	epoch:4/5,percent:90/120
processing:	epoch:4/5,percent:120/120
processing:	epoch:5/5,percent:30/120
processing:	epoch:5/5,percent:60/120
processing:	epoch:5/5,percent:90/120
processing:	epoch:5/5,percent:120/120
f1: 0.957351290684624

我們可以根據自己對資料的認識，不斷為模型新增一些新特徵函式去增強模型的效果，只需要修改build_feature_funcs和match_feature_funcs_indices這兩個函式即可（但注意控制函式的數量規模）
簡單總結一下MaxEnt的優缺點，優點很明顯：我們可以diy任意複雜的特徵函式進去，缺點也很明顯：訓練很耗時，而且特徵函式的設計好壞需要先驗知識，對於某些任務很難直觀獲取

最大熵模型
2018-03-06
熵模型
機器學習第6篇：線性模型概述
2020-11-04
機器學習模型
模式識別與機器學習——迴歸的線性模型
2019-01-30
模式機器學習模型
機器學習演算法筆記之3：線性模型
2020-04-06
機器學習演算法筆記模型
機器學習模型
2024-03-30
機器學習模型
哪些特徵對我的機器學習模型影響最大？
2018-12-14
特徵機器學習模型
從線性模型（linear model）衍生出的機器學習分類器（classifier）
2019-03-24
模型機器學習
機器學習之支援向量機（線性模型）的最佳化問題
2020-07-04
機器學習模型
最大熵模型詳解與Python實現
2020-11-06
熵模型Python
[DataAnalysis]機器學習演算法——線性模型（邏輯迴歸+LDA）
2018-08-26
機器學習演算法模型邏輯迴歸LDA
談談機器學習模型的可解釋性
2021-07-29
機器學習模型
如何管理機器學習模型
2019-01-12
機器學習模型
欺騙機器學習模型
2018-04-06
機器學習模型
周志華西瓜書《機器學習》第三章線性模型
2018-12-09
機器學習模型
機器學習導圖系列（5）：機器學習模型及神經網路模型
2019-04-11
機器學習模型神經網路
tensorflow機器學習模型的跨平臺上線
2018-07-01
機器學習模型
pytorch使用交叉熵訓練模型學習筆記
2024-06-17
PyTorch熵模型筆記
【機器學習】--隱語義模型
2018-06-12
機器學習模型
機器學習——決策樹模型
2023-12-26
機器學習模型
機器學習之模型選擇
2020-02-14
機器學習模型
機器學習之模型診斷
2020-02-14
機器學習模型
機器學習之模型評估
2019-06-21
機器學習模型
機器學習引數模型與非引數模型/生成模型與判別模型
2018-03-18
機器學習模型
【機器學習基礎】熵、KL散度、交叉熵
2018-09-27
機器學習熵
機器學習模型可解釋性的詳盡介紹
2019-10-30
機器學習模型
機器學習【模型，策略，演算法】
2024-11-27
機器學習模型演算法
【機器學習】乾貨丨機器學習知識點；機器學習模型的“可解釋性”到底有多重要？
2018-03-14
機器學習模型
【火爐煉AI】機器學習014-用SVM構建非線性分類模型
2018-08-07
AI機器學習模型
06_邏輯迴歸演算法和最大熵模型
2020-06-01
邏輯迴歸演算法熵模型
機器學習：線性迴歸
2024-11-19
機器學習
用PMML實現機器學習模型的跨平臺上線
2018-06-24
機器學習模型
機器學習在電商應用中的三個境界：爆款模型、轉化率模型及個性化模型
2019-03-01
機器學習模型
圖解機器學習 | LightGBM模型詳解
2022-03-10
圖解機器學習模型
分散式機器學習中的模型聚合
2021-12-02
分散式機器學習模型
如何解決機器學習樹整合模型的解釋性問題
2019-10-20
機器學習模型
線性規劃模型複習總結
2024-07-15
模型
機器學習7-模型儲存&無監督學習
2021-01-22
機器學習模型
機器學習模型python線上服務部署的兩種例項
2020-09-22
機器學習模型Python