Calling Python's sklearn to Implement the Logistic Regression Algorithm

Posted by bigface1234fdfg on 2015-01-21

Let's start with how to do it. I used to be unclear about how importing a library relates to its classes and methods; now I understand.

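from sklearn.datasets import load_iris     # import datasets

# load the dataset: iris
iris = load_iris()
samples = iris.data
# print(samples)
target = iris.target

# import the LogisticRegression
from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression()  # instantiate the class; all parameters are left at their defaults
classifier.fit(samples, target)  # learn from the training data; the return value is not needed

# predict expects a 2D array of shape (n_samples, n_features), so the single sample is wrapped in a list
x = classifier.predict([[5, 3, 5, 2.5]])  # test data; returns the predicted class label

print(x)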

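# What we actually import is a class from sklearn.linear_model: LogisticRegression, which has many methods.
# The commonly used ones are fit (train the classification model) and predict (predict the labels of test samples).

# No method returns the weight vector w learned by the LR model, which felt like a shortcoming at first.
# After fitting, however, the weights are available through the coef_ attribute and the bias through intercept_.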

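The line used above,

classifier = LogisticRegression()  # instantiate the class; all parameters are left at their defaults

leaves every parameter at its default value. Many of them can be set explicitly; doing so requires the official parameter documentation, reproduced below: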

`sklearn.linear_model`.LogisticRegression

class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None)

Logistic Regression (aka logit, MaxEnt) classifier.

In the multiclass case, the training algorithm uses a one-vs.-all (OvA) scheme, rather than the “true” multinomial LR.

This class implements L1 and L2 regularized logistic regression using the liblinear library. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied).

Parameters:


penalty : string, ‘l1’ or ‘l2’ (the type of penalty term)

Used to specify the norm used in the penalization.

dual : boolean

Dual or primal formulation. Dual formulation is only implemented for l2 penalty. Prefer dual=False when n_samples > n_features.

C : float, optional (default=1.0)

Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.
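To see what "smaller values specify stronger regularization" means in practice, here is a minimal sketch that fits the same iris data with three values of C and compares the size of the learned coefficients. It assumes a recent scikit-learn (for `load_iris(return_X_y=True)` and the `max_iter` argument) and uses the default L2 penalty; the variable name `norms` is only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Fit with strong (small C) to weak (large C) regularization and
# record the overall magnitude of the coefficient matrix.
norms = {}
for C in (0.01, 1.0, 100.0):
    # max_iter is raised only to give the solver room to converge (assumption)
    clf = LogisticRegression(C=C, max_iter=5000)
    clf.fit(X, y)
    norms[C] = np.linalg.norm(clf.coef_)
    print("C=%g  norm of coef_=%.3f" % (C, norms[C]))
```

The coefficient norm should grow as C grows, since a larger C penalizes large weights less.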

fit_intercept : bool, default: True

Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

intercept_scaling : float, default: 1

When self.fit_intercept is True, instance vector x becomes [x, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic feature weight. Note! The synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.

class_weight : {dict, ‘auto’}, optional (accounts for class imbalance, similar to cost-sensitive learning)

Over-/undersamples the samples of each class according to the given weights. If not given, all classes are supposed to have weight one. The ‘auto’ mode selects weights inversely proportional to class frequencies in the training set.
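The idea of weights "inversely proportional to class frequencies" can be sketched by computing them by hand and passing them as a dict. This is only an illustration under assumptions: the imbalanced subset of iris below is made up for the example, and the formula n_samples / (n_classes * count) is one common way to normalize such weights. As far as I know, newer scikit-learn releases spell the automatic mode 'balanced' rather than 'auto', so the explicit dict is used here to stay version-neutral.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hypothetical imbalanced training set: all 50 samples of class 0,
# but only 10 samples each of classes 1 and 2.
keep = np.r_[0:50, 50:60, 100:110]
X_imb, y_imb = X[keep], y[keep]

# Weights inversely proportional to how often each class occurs.
counts = np.bincount(y_imb)
weights = len(y_imb) / (len(counts) * counts.astype(float))
class_weight = {label: float(w) for label, w in enumerate(weights)}
print(class_weight)

# Rare classes now contribute more to the loss per sample.
clf = LogisticRegression(class_weight=class_weight, max_iter=5000)
clf.fit(X_imb, y_imb)
```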

random_state: int seed, RandomState instance, or None (default) :

The seed of the pseudo random number generator to use when shuffling the data.

tol: float, optional :

Tolerance for stopping criteria.

Attributes:

`coef_` : array, shape = [n_classes, n_features]

Coefficient of the features in the decision function.

coef_ is a read-only property derived from raw_coef_ that follows the internal memory layout of liblinear.

`intercept_` : array, shape = [n_classes]

Intercept (a.k.a. bias) added to the decision function. If fit_intercept is set to False, the intercept is set to zero.
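The two attributes above are where the fitted model keeps its weights. A small sketch, assuming a recent scikit-learn, that prints their shapes on iris and checks that the decision function is just the linear combination of features, coefficients and intercepts (`manual_scores` is a name introduced here for illustration):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=5000).fit(X, y)

print(clf.coef_.shape)       # one row of weights per class, one column per feature
print(clf.intercept_.shape)  # one bias term per class

# Recompute the confidence scores by hand from the learned parameters.
manual_scores = X @ clf.coef_.T + clf.intercept_
print(np.allclose(manual_scores, clf.decision_function(X)))
```

So each row of coef_ is the weight vector w for one class, and its length equals the number of features.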

The LogisticRegression class has the following methods; the ones we use most often are fit and predict.

Methods

`decision_function`(X)	Predict confidence scores for samples.
`densify`()	Convert coefficient matrix to dense array format.
`fit`(X, y)	Fit the model according to the given training data. Used to train the LR classifier; X holds the training samples and y is the corresponding label vector.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X)	Predict class labels for samples in X. Used to predict the labels of test samples, i.e. to classify them; X is the set of test samples.
`predict_log_proba`(X)	Log of probability estimates.
`predict_proba`(X)	Probability estimates.
`score`(X, y[, sample_weight])	Returns the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.
`sparsify`()	Convert coefficient matrix to sparse format.
`transform`(X[, threshold])	Reduce X to its most important features.
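A few of the methods in the table can be tried together in a short sketch. It assumes a recent scikit-learn in which `train_test_split` lives in `sklearn.model_selection`; the 70/30 split and `random_state=0` are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)              # train on the training samples

labels = clf.predict(X_test)           # predicted class label per test sample
proba = clf.predict_proba(X_test)      # one probability per class, per sample
accuracy = clf.score(X_test, y_test)   # mean accuracy on the test set
params = clf.get_params()              # dict of constructor parameters

print(labels[:5])
print(proba[:5].round(3))
print("accuracy:", accuracy)
print("C:", params["C"])
```

predict simply returns the class with the highest estimated probability, and score is the fraction of test labels that predict gets right.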

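predict returns the label vector for the test samples. Personally, I feel the key quantity learned by the LR classifier should also be available: the weight vector, whose size should equal the number of features. I could not find a method that returns it, although the coef_ attribute listed above does hold the learned coefficients, with shape [n_classes, n_features]. Either way, this gave me the idea of implementing the LR algorithm myself, so that the weight vector can be output directly.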
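As a rough idea of what such a hand-written version might look like, here is a minimal sketch of binary logistic regression trained with batch gradient descent in NumPy. It is not the author's implementation: the function name `fit_logistic`, the learning rate and the iteration count are assumptions chosen for illustration, there is no regularization, and only two iris classes (0 and 1) are used to keep the problem binary.

```python
import numpy as np
from sklearn.datasets import load_iris

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.05, n_iter=2000):
    """Batch gradient descent on the mean log-loss; returns (w, b)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)   # one weight per feature
    b = 0.0                    # intercept
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)            # predicted P(y = 1 | x)
        error = p - y                     # gradient of the log-loss w.r.t. the scores
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b

X, y = load_iris(return_X_y=True)
mask = y < 2                   # keep only classes 0 and 1
X_bin, y_bin = X[mask], y[mask]

w, b = fit_logistic(X_bin, y_bin)
pred = (sigmoid(X_bin @ w + b) >= 0.5).astype(int)
train_accuracy = (pred == y_bin).mean()

print("w =", w)                # weight vector, same length as the feature count
print("b =", b)
print("training accuracy =", train_accuracy)
```

The returned w has one entry per feature, which is exactly the quantity the paragraph above asks for.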

Reference links:

http://www.cnblogs.com/xupeizhi/archive/2013/07/05/3174703.html

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
