Error Metrics for Skewed Classes

Precision / Recall

  • Precision
    (Of all patients where we predicted y = 1, what fraction actually has cancer?)

    Precision = True positives / # predicted positives
              = True positives / (True positives + False positives)

  • Recall
    (Of all patients that actually have cancer, what fraction did we correctly detect as having cancer?)

    Recall = True positives / # actual positives
           = True positives / (True positives + False negatives)
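To make the two definitions concrete, here is a minimal sketch in plain Python. The function name `precision_recall`, the toy labels, and the convention of returning 0.0 when a denominator is zero are my own assumptions, not something from the course slides. It also shows why accuracy alone is misleading on skewed classes: a classifier that always predicts y = 0 looks accurate but has zero recall.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the rare positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Assumption: report 0.0 when a denominator is zero (nothing predicted / nothing positive).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical skewed data set: only 2 of 100 patients actually have cancer.
y_true = [1, 1] + [0] * 98

# A "classifier" that always predicts y = 0.
always_zero = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, always_zero)) / len(y_true)
print(accuracy)                                # 0.98, looks great
print(precision_recall(y_true, always_zero))   # (0.0, 0.0), clearly useless

# A classifier that finds one of the two real cases and raises one false alarm.
some_model = [1, 0, 1] + [0] * 97
print(precision_recall(y_true, some_model))    # (0.5, 0.5)
```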
    

A classifier with both high precision and high recall is a good classifier: when both are high, we can be confident that the algorithm is doing well, even if the classes are very skewed. (Words from the master!)

Trading Off Precision and Recall

  • Trading off precision and recall
    • Logistic regression: 0 ≤ hθ(x) ≤ 1
    • Predict 1 if hθ(x) ≥ 0.5
    • Predict 0 if hθ(x) < 0.5

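      Higher precision, lower recall
      • Tell patients they have cancer only when we are very confident (raise the threshold above 0.5), so that patients are not alarmed unnecessarily.

      Higher recall, lower precision
      • Tell patients whenever there is a chance they have cancer (lower the threshold below 0.5), so that nobody misses closer observation or treatment.

Both choices are reasonable; which one is right depends on your point of view!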
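The trade-off above can be sketched in a few lines of Python. Everything here is illustrative: the `scores` list stands in for hθ(x) values from some already-trained logistic regression, and the labels, the helper names, and the thresholds 0.7 / 0.5 / 0.3 are made up for the example.

```python
def predict(scores, threshold=0.5):
    """Predict 1 if the score is at least the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


def precision_recall(y_true, y_pred):
    """Return (precision, recall); 0.0 when a denominator is zero (an assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical true labels and model outputs h_theta(x) for 8 patients.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]

results = {}
for threshold in (0.7, 0.5, 0.3):
    results[threshold] = precision_recall(y_true, predict(scores, threshold))
    print(threshold, results[threshold])
```

On this toy data the high threshold (0.7) gives precision 1.0 and recall 0.5, the default (0.5) gives 0.75 and 0.75, and the low threshold (0.3) gives precision of about 0.67 and recall 1.0, which matches the two viewpoints described above.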


Precision / Recall curve

  • More generally: Predict 1 if hθ(x) ≥ threshold

    • The precision/recall curve can take different shapes, depending on the classifier.
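A rough sketch of how such a curve can be traced: use every distinct score as a threshold and record precision and recall at each one. The function name `pr_curve` and the toy data are assumptions for illustration, and the sketch assumes at least one positive example so that recall is defined.

```python
def pr_curve(y_true, scores):
    """Return [(threshold, precision, recall), ...] from the highest threshold to the lowest.

    Assumes y_true contains at least one positive (1) label.
    """
    total_pos = sum(y_true)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        # tp + fp >= 1 here, because the threshold is itself one of the scores.
        points.append((threshold, tp / (tp + fp), tp / total_pos))
    return points


# Hypothetical labels and h_theta(x) values.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]

curve = pr_curve(y_true, scores)
for threshold, precision, recall in curve:
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

Recall can only go up as the threshold is lowered, but precision does not have to fall smoothly: on this toy data it dips to 0.75 at threshold 0.55 and climbs back to 0.80 at 0.40, which is one reason real curves come in different shapes.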

F1 Score (F score)

  • How to compare precision/recall numbers?
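The usual answer is the F1 score, F1 = 2PR / (P + R), where P is precision and R is recall. Below is a small sketch comparing it with a plain average; the three (P, R) pairs and the algorithm names are illustrative numbers chosen for the example, not results from any real model.

```python
def f1_score(precision, recall):
    """F1 = 2PR / (P + R); defined as 0.0 when both are zero (an assumed convention)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative (precision, recall) pairs for three hypothetical algorithms.
candidates = {
    "Algorithm 1": (0.5, 0.4),
    "Algorithm 2": (0.7, 0.1),
    "Algorithm 3": (0.02, 1.0),  # e.g. predicts y = 1 almost all the time
}

for name, (p, r) in candidates.items():
    print(f"{name}: average={(p + r) / 2:.3f}  F1={f1_score(p, r):.3f}")

best_by_average = max(candidates, key=lambda n: sum(candidates[n]) / 2)
best_by_f1 = max(candidates, key=lambda n: f1_score(*candidates[n]))
print(best_by_average)  # Algorithm 3
print(best_by_f1)       # Algorithm 1
```

The plain average rewards Algorithm 3 even though its precision is terrible, while F1 is high only when precision and recall are both reasonably high. One common way to pick a threshold is to evaluate F1 on a cross-validation set for several thresholds and keep the best one.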