Building and Evaluating Models with sklearn (Classification)

Posted by kuqlan on 2019-09-04

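      Classification means building a model that takes a sample's feature values as input and outputs the corresponding class, mapping each sample to one of a set of predefined classes. sklearn provides many classification algorithms for different scenarios; commonly used modules include linear_model, svm, neighbors, naive_bayes, tree, ensemble (RandomForestClassifier), and ensemble (GradientBoostingClassifier).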
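Classifiers from these modules share the same estimator interface (`fit`, `predict`, `score`), so they can be swapped with little code change. The sketch below illustrates this on synthetic data; the dataset from `make_classification`, the three estimators chosen, and their parameters are assumptions for illustration only, not the author's setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (hypothetical, for illustration only)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Three estimators from different sklearn modules, all used the same way
models = {
    'linear_model.LogisticRegression': LogisticRegression(max_iter=1000),
    'neighbors.KNeighborsClassifier': KNeighborsClassifier(),
    'tree.DecisionTreeClassifier': DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                  # train on the training split
    scores[name] = model.score(X_test, y_test)   # mean accuracy on the test split
    print(name, scores[name])
```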

Taking the breast_cancer dataset as an example, the following code uses sklearn estimators to build a support vector machine (SVM) model:

## Load the required functions

import numpy as np

from sklearn.datasets import load_breast_cancer

from sklearn.svm import SVC

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()

cancer_data = cancer['data']

cancer_target = cancer['target']

cancer_names = cancer['feature_names']

## Split the data into training and test sets

cancer_data_train, cancer_data_test, cancer_target_train, cancer_target_test = \
    train_test_split(cancer_data, cancer_target,
                     test_size=0.2, random_state=22)

## Standardize the data (the scaler is fit on the training set only)

stdScaler = StandardScaler().fit(cancer_data_train)

cancer_trainStd = stdScaler.transform(cancer_data_train)

cancer_testStd = stdScaler.transform(cancer_data_test)
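Fitting the scaler on the training set and reusing it on the test set keeps test-set statistics out of training. One way to make that ordering automatic is sklearn's `make_pipeline`, which the author does not use; the sketch below is an alternative under that assumption and checks that it produces the same predictions as the manual steps. The short variable names are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=22)

# Pipeline: the scaler is fit on X_train only, then SVC is fit on the scaled data
pipe = make_pipeline(StandardScaler(), SVC())
pipe.fit(X_train, y_train)
pipe_pred = pipe.predict(X_test)   # X_test is scaled with the training statistics

# Manual equivalent of the same two steps
scaler = StandardScaler().fit(X_train)
manual_model = SVC().fit(scaler.transform(X_train), y_train)
manual_pred = manual_model.predict(scaler.transform(X_test))

print('Same predictions:', (pipe_pred == manual_pred).all())
```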

## Build the SVM model

svm = SVC().fit(cancer_trainStd,cancer_target_train)

print('The fitted SVM model is:\n',svm)

 

 

## Predict on the test set

cancer_target_pred = svm.predict(cancer_testStd)

print('The first 20 predictions are:\n',cancer_target_pred[:20])

print('All predictions are:\n',cancer_target_pred[:])

 

 

## Count the predictions that match the true labels

true = np.sum(cancer_target_pred == cancer_target_test )

print('Number of correct predictions:', true)

print('Total number of test samples:', cancer_target_test.shape[0])

print('Number of wrong predictions:', cancer_target_test.shape[0]-true)

print('Prediction accuracy:', true/cancer_target_test.shape[0])
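The manual accuracy above can be cross-checked against the estimator's own `score` method, and the misclassified test rows can be listed with `np.flatnonzero`. A minimal self-contained sketch, assuming the same split, scaling, and default `SVC` as in the listing; the short variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Rebuild the same split, scaling, and model so this block runs on its own
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=22)
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
model = SVC().fit(X_train_std, y_train)
y_pred = model.predict(X_test_std)

# Accuracy computed by hand and by the estimator's score() method
manual_acc = np.sum(y_pred == y_test) / y_test.shape[0]
score_acc = model.score(X_test_std, y_test)

# Positions (within the test set) of the misclassified samples
wrong_idx = np.flatnonzero(y_pred != y_test)

print('Manual accuracy:', manual_acc)
print('score() accuracy:', score_acc)
print('Misclassified test rows:', wrong_idx)
```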

 

 

 

# Common evaluation metrics for classification models

from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, cohen_kappa_score)

print('Accuracy of the SVM on the breast_cancer data:',
      accuracy_score(cancer_target_test,cancer_target_pred))

print('Precision of the SVM on the breast_cancer data:',
      precision_score(cancer_target_test,cancer_target_pred))

print('Recall of the SVM on the breast_cancer data:',
      recall_score(cancer_target_test,cancer_target_pred))

print('F1 score of the SVM on the breast_cancer data:',
      f1_score(cancer_target_test,cancer_target_pred))

print("Cohen's kappa of the SVM on the breast_cancer data:",
      cohen_kappa_score(cancer_target_test,cancer_target_pred))
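To show what these metrics measure, the sketch below recomputes precision, recall, F1, and Cohen's kappa from the four cells of the binary confusion matrix and compares them with the sklearn functions. It assumes the same split, scaling, and default `SVC` as the listing, with label 1 treated as the positive class (the sklearn default); the variable names are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import (cohen_kappa_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=22)
scaler = StandardScaler().fit(X_train)
model = SVC().fit(scaler.transform(X_train), y_train)
y_pred = model.predict(scaler.transform(X_test))

# For binary labels 0/1, ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
n = tn + fp + fn + tp

precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

# Cohen's kappa: observed agreement corrected for chance agreement
p_o = (tp + tn) / n
p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
kappa = (p_o - p_e) / (1 - p_e)

print(precision, precision_score(y_test, y_pred))
print(recall, recall_score(y_test, y_pred))
print(f1, f1_score(y_test, y_pred))
print(kappa, cohen_kappa_score(y_test, y_pred))
```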

 

 

 

# Classification report

from sklearn.metrics import classification_report

print('Classification report of the SVM on the breast_cancer data:','\n',
      classification_report(cancer_target_test,
            cancer_target_pred))
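The printed report is a formatted string. When the per-class numbers are needed programmatically, `classification_report` also accepts `output_dict=True` and `target_names`. A minimal sketch, assuming the same split, scaling, and default `SVC` as the listing; passing the dataset's `target_names` is an addition not in the original code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer['data'], cancer['target'], test_size=0.2, random_state=22)
scaler = StandardScaler().fit(X_train)
model = SVC().fit(scaler.transform(X_train), y_train)
y_pred = model.predict(scaler.transform(X_test))

# Label 0 is 'malignant' and label 1 is 'benign' in this dataset
class_names = cancer['target_names'].tolist()

# output_dict=True returns nested dicts instead of a formatted string
report = classification_report(y_test, y_pred,
                               target_names=class_names,
                               output_dict=True)

for name in class_names:
    row = report[name]
    print(name, row['precision'], row['recall'], row['f1-score'], row['support'])
```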


Source: ITPUB blog, link: http://blog.itpub.net/18841027/viewspace-2655938/. Please credit the source when reposting; otherwise legal action may be pursued.
