Common Performance Metrics in Machine Learning

Posted by 洛陽山 on 2020-12-23

Original URL: https://blog.csdn.net/u012949658/article/details/110895485

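1. Introduction

This post summarizes the performance metrics used in the paper "The Impact of Automated Parameter Optimization on Defect Prediction Models" $[1]$.

When I first read this paper, it left me stunned for a long time. From 101 datasets, the authors select 18 that span multiple languages and domains, and use 12 performance metrics to examine how much parameter optimization improves the performance of popular machine learning classifiers (models). The whole experiment is rigorous. That sense of awe faded only when I read another paper from the same lab, published in 2017, "An Empirical Comparison of Model Validation Techniques for Defect Prediction Models" $[2]$: the method is the same and the comparative experiments are set up in much the same way, like products off an assembly line. I was left shedding tears of envy (only established experts in a field get to publish this kind of survey-style paper).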

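2. Summary of Performance Metrics

These two blog posts are worth reading alongside this one:
sklearn: A Complete Guide to Evaluation Metrics
Evaluation Metrics for Machine Learning Performance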

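Many of these metrics are derived from the confusion matrix of a binary classifier. An example confusion matrix:
[Figure: example confusion matrix]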
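
As a minimal sketch of how the four confusion-matrix cells are counted, the snippet below tallies TP, FP, TN, and FN from two label lists. The function name `confusion_counts` and the sample labels are made up for illustration, and it assumes labels are encoded as 1 for the positive (defective) class and 0 for the negative class.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels; 1 is the positive (defective) class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # defective module correctly flagged
        elif predicted == 1 and actual == 0:
            fp += 1  # clean module wrongly flagged
        elif predicted == 0 and actual == 0:
            tn += 1  # clean module correctly passed
        else:
            fn += 1  # defective module missed


    return tp, fp, tn, fn


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
```

Every threshold-based metric in this post is a function of these four counts.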

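Metric	Formula	Meaning	References
Precision	$P=\frac{TP}{TP+FP}$	Proportion of modules predicted as positive (defective) that are actually positive	$[3]$
Recall (TPR)	$R=\frac{TP}{TP+FN}$	Proportion of the positive class (defective modules) that is correctly classified	$[3]$
$F_{measure}$ ($F_{1}$)	$F_{1}=2 \times \frac{P \times R}{P+R}$	Harmonic mean of precision and recall	$[3]$
Specificity (TNR)	$S=\frac{TN}{TN+FP}$	Proportion of the negative class (non-defective modules) that is correctly classified
False positive rate (FPR)	$FPR=\frac{FP}{TN+FP}$	Proportion of the negative class (non-defective modules) that is incorrectly classified	$[4]$
$G_{mean}$	$G_{mean}=\sqrt{R \times S}$	Geometric mean of R and S
$G_{measure}$	$G_{measure}=\frac{2 \times pd \times(1-pf)}{pd+(1-pf)}$	Harmonic mean of TPR and TNR (that is, of pd and 1-pf), where pd=TPR and pf=FPR
$Balance$	$Balance=1-\sqrt{\frac{(0-pf)^{2}+(1-pd)^{2}}{2}}$	One minus the normalized Euclidean distance from (pf, pd) to the ideal point (pf=0, pd=1), where pd=TPR and pf=FPR	$[5]$, $[6]$
Matthews Correlation Coefficient (MCC)	$MCC=\frac{TP \times TN-FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$	Correlation coefficient between the actual and predicted classifications	$[7]$
AUC		Area under the ROC curve	$[8]$, $[9]$, $[10]$, $[11]$, $[12]$, $[13]$
Brier	$Brier=\frac{1}{N} \sum_{i=1}^{N}\left(p_{i}-y_{i}\right)^{2}$	Mean squared gap between predicted probabilities and actual outcomes	$[14]$, $[15]$
LogLoss	$LogLoss=-\frac{1}{N} \sum_{i=1}^{N}\left(y_{i} \log \left(p_{i}\right)+\left(1-y_{i}\right) \log \left(1-p_{i}\right)\right)$	Classification loss function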
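
The threshold-based formulas in the table can be sketched in a few lines of Python. This is an illustrative helper, not code from the paper: the name `threshold_metrics` and the example counts are hypothetical, and the sketch assumes every denominator is nonzero (a real evaluation would need to guard against empty classes or a classifier that never predicts one class).

```python
import math


def threshold_metrics(tp, fp, tn, fn):
    """Compute the confusion-matrix metrics from the table (assumes nonzero denominators)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # pd, TPR
    specificity = tn / (tn + fp)   # TNR
    fpr = fp / (tn + fp)           # pf, equal to 1 - specificity
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    g_measure = 2 * recall * (1 - fpr) / (recall + (1 - fpr))
    # Distance from (pf, pd) to the ideal point (0, 1), scaled by sqrt(2).
    balance = 1 - math.sqrt(((0 - fpr) ** 2 + (1 - recall) ** 2) / 2)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "fpr": fpr,
        "f1": f1,
        "g_mean": g_mean,
        "g_measure": g_measure,
        "balance": balance,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
for name, value in threshold_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.4f}")
```

A perfect classifier (no false positives and no false negatives) scores 1 on every metric here except FPR, which is 0.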

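Additional notes:

MCC: in essence, MCC is the correlation coefficient between the actual and predicted classifications. It ranges over [-1, 1]: 1 means perfect prediction, 0 means the predictions are no better than random guessing, and -1 means the predicted and actual classifications disagree completely;
Brier score: $p_{i}$ is the predicted probability and $y_{i}$ is the true label (0 or 1). The score ranges over [0, 1]: 0 is the best performance, 1 is the worst, and 0.25 is the score of an uninformative model that always predicts 0.5;
LogLoss: $p_{i}$ is the predicted probability and $y_{i}$ is the true label (0 or 1). It is a commonly used evaluation metric in Kaggle competitions.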
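
The probability-based metrics can be sketched the same way. The function names and sample data below are hypothetical. Two assumptions go beyond the text: `log_loss` clips probabilities by a small `eps` so that `log(0)` never occurs, and `auc` uses the pairwise-ranking formulation (the fraction of positive/negative pairs in which the positive instance gets the higher score, with ties counted as one half), which is equivalent to the area under the ROC curve but is not how the table defines it.

```python
import math


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def log_loss(y_true, probs, eps=1e-15):
    """Binary log loss; eps clipping is an added safeguard against log(0)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def auc(y_true, probs):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
y_true = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(brier_score(y_true, probs), log_loss(y_true, probs), auc(y_true, probs))

# An uninformative model that always predicts 0.5 has a Brier score of 0.25.
print(brier_score(y_true, [0.5] * 4))  # 0.25
```

The pairwise loop in `auc` is quadratic in the number of instances, which is fine for a small illustration but slow on large datasets.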

3. References

$[1]$ C. Tantithamthavorn, S. McIntosh, A. E. Hassan, and K. Matsumoto, “The Impact of Automated Parameter Optimization on Defect Prediction Models,” IEEE Trans. Softw. Eng., vol. 45, no. 7, pp. 683–711, Jul. 2019, doi: 10.1109/TSE.2018.2794977.
$[2]$ C. Tantithamthavorn, S. McIntosh, A. E. Hassan, and K. Matsumoto, “An Empirical Comparison of Model Validation Techniques for Defect Prediction Models,” IEEE Trans. Softw. Eng., vol. 43, no. 1, pp. 1–18, Jan. 2017, doi: 10.1109/TSE.2016.2584050.
$[3]$ W. Fu, T. Menzies, and X. Shen, “Tuning for software analytics: is it really necessary?” Information and Software Technology, vol. 76, pp. 135–146.
$[4]$ T. Menzies, J. Greenwald, and A. Frank, “Data Mining Static Code Attributes to Learn Defect Predictors,” IEEE Transactions on Software Engineering (TSE), vol. 33, no. 1, pp. 2–13, 2007.
$[5]$ H. Zhang and X. Zhang, “Comments on ‘Data Mining Static Code Attributes to Learn Defect Predictors’,” IEEE Transactions on Software Engineering (TSE), vol. 33, no. 9, pp. 635–636, 2007.
$[6]$ A. Tosun, “Ensemble of Software Defect Predictors: A Case Study,” in Proceedings of the International Symposium on Empirical Software Engineering and Measurement (ESEM), 2008, pp. 318–320.
$[7]$ M. Shepperd, D. Bowes, and T. Hall, “Researcher Bias: The Use of Machine Learning in Software Defect Prediction,” IEEE Transactions on Software Engineering (TSE), vol. 40, no. 6, pp. 603–616, 2014.
$[8]$ S. den Boer, N. F. de Keizer, and E. de Jonge, “Performance of prognostic models in critically ill cancer patients - a review.” Critical care, vol. 9, no. 4, pp. R458–R463, 2005.
$[9]$ F. E. Harrell Jr., Regression Modeling Strategies, 1st ed. Springer, 2002.
$[10]$ J. Huang and C. X. Ling, “Using AUC and accuracy in evaluating learning algorithms,” Transactions on Knowledge and Data Engineering, vol. 17, no. 3, pp. 299–310, 2005.
$[11]$ S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, “Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings,” IEEE Transactions on Software Engineering (TSE), vol. 34, no. 4, pp. 485–496, 2008.
$[12]$ E. W. Steyerberg, Clinical prediction models: a practical approach to development, validation, and updating. Springer Science & Business Media, 2008.
$[13]$ E. W. Steyerberg, A. J. Vickers, N. R. Cook, T. Gerds, N. Obuchowski, M. J. Pencina, and M. W. Kattan, “Assessing the performance of prediction models: a framework for some traditional and novel measures,” Epidemiology, vol. 21, no. 1, pp. 128–138, 2010.
$[14]$ G. W. Brier, “Verification of Forecasts Expressed in Terms of Probability,” Monthly Weather Review, vol. 78, no. 1, pp. 25–27, 1950.
$[15]$ K. Rufibach, “Use of Brier score to assess binary predictions,” Journal of Clinical Epidemiology, vol. 63, no. 8, pp. 938–939, 2010.
