Measuring Classifiers (Precision, Recall, F1, ROC, AUC)¶
Recall¶
\[ \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \]
- Aka sensitivity, true positive rate, completeness
- Percent of positives correctly predicted
- Good choice of metric when you care a lot about false negatives
- E.g., fraud detection (see the worked example and code sketch below)
| | Actual fraud | Actual not fraud |
|---|---|---|
| Predicted fraud | 5 | 20 |
| Predicted not fraud | 10 | 100 |
\[ \text{Recall} = \frac{TP}{TP+FN} = \frac{5}{5+10} = \frac{1}{3} \approx 33\% \]
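As a quick check, the recall from the table above can be reproduced in code. This is a minimal sketch using scikit-learn (not mentioned above, just one common choice); the label arrays are hypothetical and only exist to recreate the table's counts.

```python
from sklearn.metrics import recall_score

# Hypothetical labels recreating the confusion matrix above (1 = fraud, 0 = not fraud):
# 5 TP, 20 FP, 10 FN, 100 TN
y_true = [1] * 5 + [0] * 20 + [1] * 10 + [0] * 100
y_pred = [1] * 25 + [0] * 110

print(recall_score(y_true, y_pred))  # 5 / (5 + 10) ≈ 0.333
```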
Precision¶
\[ \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \]
- Aka positive predictive value
- Percent of relevant results
- Good choice of metric when you care a lot about false positives
- E.g., medical screening, drug testing
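Using the same hypothetical fraud-detection labels as in the recall sketch, precision comes out lower because of the 20 false positives:

```python
from sklearn.metrics import precision_score

# Same hypothetical labels as the recall example (5 TP, 20 FP, 10 FN, 100 TN)
y_true = [1] * 5 + [0] * 20 + [1] * 10 + [0] * 100
y_pred = [1] * 25 + [0] * 110

print(precision_score(y_true, y_pred))  # 5 / (5 + 20) = 0.2
```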
Other metrics¶
- Specificity
  - \(\frac{TN}{TN+FP}\), aka true negative rate
- F1 score
  - \(\frac{2TP}{2TP+FP+FN}\)
  - \(2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision}+\text{recall}}\)
  - Harmonic mean of precision and recall (sensitivity)
  - Good choice when you care about both precision and recall
- RMSE
  - Root mean squared error
  - An overall accuracy measurement
  - Only cares about how right or wrong the answers are, not about the type of error (false positives vs. false negatives)
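The two F1 formulas above can be checked against each other. A minimal sketch, again assuming scikit-learn and reusing the hypothetical fraud-detection labels:

```python
from sklearn.metrics import confusion_matrix, f1_score

# Hypothetical labels matching the fraud-detection table (5 TP, 20 FP, 10 FN, 100 TN)
y_true = [1] * 5 + [0] * 20 + [1] * 10 + [0] * 100
y_pred = [1] * 25 + [0] * 110

# sklearn's confusion matrix is [[TN, FP], [FN, TP]] for labels 0/1
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

specificity = tn / (tn + fp)             # true negative rate
f1_manual = 2 * tp / (2 * tp + fp + fn)  # same value as the harmonic-mean form

print(specificity)                           # 100 / 120 ≈ 0.833
print(f1_manual, f1_score(y_true, y_pred))  # both 0.25
```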
ROC Curve¶
- Receiver Operating Characteristic Curve
- Plot of true positive rate (recall) vs false positive rate at various threshold settings.
- Points above the diagonal represent good classification (better than random guessing)
- Ideal curve would just be a point in the upper left corner
- The more it's bent toward the upper left, the better
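A minimal plotting sketch, assuming scikit-learn and matplotlib; the labels and scores below are made-up classifier outputs, not taken from anything above:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Hypothetical true labels and predicted scores (e.g. predicted probabilities)
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one point per threshold

plt.plot(fpr, tpr, label="classifier")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guessing")  # the diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate (recall)")
plt.legend()
plt.show()
```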
AUC¶
- The area under the ROC curve
- Equal to probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one
- An ROC AUC of 0.5 means the classifier is useless (no better than random guessing); 1.0 is perfect.
- Commonly used metric for comparing classifiers.
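Computing the AUC directly, with the same hypothetical labels and scores as the ROC sketch above:

```python
from sklearn.metrics import roc_auc_score

# Same hypothetical labels/scores as the ROC example
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

print(roc_auc_score(y_true, y_score))  # 0.5 = no better than random, 1.0 = perfect
```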