Confusion Matrix

A confusion matrix is a performance measurement tool used in machine learning for classification problems. It summarizes a model's prediction results by comparing the actual labels with the labels the model predicted.

What is a Confusion Matrix?

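A confusion matrix is a table that describes the performance of a classification model. Each cell counts the instances that share one combination of actual class and predicted class, so it shows both how many instances were predicted correctly and how the incorrect predictions were distributed.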

Structure of a Confusion Matrix

For a binary classification problem, the confusion matrix looks like this:

Actual \ Predicted    Positive               Negative
Positive              True Positive (TP)     False Negative (FN)
Negative              False Positive (FP)    True Negative (TN)

Terminology

  • True Positive (TP): An actual positive correctly predicted as positive (e.g., the model predicts that a patient has a disease and the patient does have it)
  • False Positive (FP): An actual negative incorrectly predicted as positive (Type I error)
  • True Negative (TN): An actual negative correctly predicted as negative
  • False Negative (FN): An actual positive incorrectly predicted as negative (Type II error)

Simple Example

Suppose a model predicts whether emails are spam, with spam as the positive class:

  • TP: Spam email correctly predicted as spam.
  • FP: Normal email wrongly predicted as spam.
  • TN: Normal email correctly predicted as not spam.
  • FN: Spam email wrongly predicted as not spam.
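
The four outcomes above can be counted directly from paired lists of actual and predicted labels. The following is a minimal sketch; the function name `confusion_counts`, the label strings "spam" and "normal", and the eight sample emails are all made up for illustration.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, FP, TN and FN for a binary problem.

    `positive` names the label treated as the positive class;
    every other label is treated as negative.
    """
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            if truth == positive:
                tp += 1  # spam correctly predicted as spam
            else:
                fp += 1  # normal email wrongly predicted as spam
        else:
            if truth == positive:
                fn += 1  # spam wrongly predicted as not spam
            else:
                tn += 1  # normal email correctly predicted as not spam
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical labels for eight emails (illustrative data only).
actual = ["spam", "spam", "normal", "normal", "spam", "normal", "normal", "spam"]
predicted = ["spam", "normal", "normal", "spam", "spam", "normal", "normal", "spam"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```

The four counts always sum to the number of instances, which is a quick sanity check on any confusion matrix.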

Mathematical Metrics Using the Confusion Matrix

Using TP, TN, FP, and FN, we can calculate several key metrics:

Accuracy

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall (Sensitivity)

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1 Score

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
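
As a worked illustration, the four metrics can be computed from the raw counts. This is a sketch under stated assumptions: the function name `classification_metrics` and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one common convention rather than part of the definitions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for 100 predictions (illustrative data only).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.850
# precision: 0.800
# recall: 0.889
# f1: 0.842
```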

Why Use a Confusion Matrix?

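  • It shows every combination of actual and predicted class, not just a single summary score.
  • It reveals which types of errors the classifier makes (false positives versus false negatives).
  • It is more informative than accuracy alone, especially on imbalanced datasets, where a model can reach high accuracy simply by always predicting the majority class.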
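
The point about imbalanced datasets can be illustrated with a small sketch. The data here is invented: 95 legitimate and 5 fraudulent transactions, scored by a hypothetical model that always predicts "legit".

```python
# Hypothetical imbalanced dataset: 95 legitimate, 5 fraudulent transactions.
actual = ["legit"] * 95 + ["fraud"] * 5
# A naive model that always predicts the majority class.
predicted = ["legit"] * 100

pairs = list(zip(actual, predicted))
tp = sum(a == "fraud" and p == "fraud" for a, p in pairs)  # fraud caught
fn = sum(a == "fraud" and p == "legit" for a, p in pairs)  # fraud missed
fp = sum(a == "legit" and p == "fraud" for a, p in pairs)  # false alarms
tn = sum(a == "legit" and p == "legit" for a, p in pairs)  # legit passed

accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn)

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
# TP=0 FN=5 FP=0 TN=95
# accuracy=0.95, recall=0.00
```

Accuracy looks strong at 0.95, yet the confusion matrix shows that every fraudulent transaction was missed.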

Visualization

You can visualize a confusion matrix as a heatmap to easily see where the model is performing well or poorly.
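
The heatmap idea can be sketched without any plotting dependency by shading each cell according to its count. This is only a text approximation: the function name `render_heatmap`, the shade characters, and the example counts are assumptions made for illustration, and in practice a plotting library would normally draw a colored grid instead.

```python
SHADES = " .:*#"  # lightest to darkest, an arbitrary choice for this sketch


def render_heatmap(matrix, labels):
    """Return a text heatmap of a square confusion matrix.

    Rows are actual classes and columns are predicted classes. Each cell
    shows its count followed by a shade character scaled to the largest count.
    """
    largest = max(max(row) for row in matrix) or 1  # avoid dividing by zero
    width = max(len(label) for label in labels) + 2
    header = " " * width + "".join(label.rjust(width) for label in labels)
    lines = [header]
    for label, row in zip(labels, matrix):
        cells = []
        for count in row:
            level = round(count / largest * (len(SHADES) - 1))
            cells.append(f"{count}{SHADES[level]}".rjust(width))
        lines.append(label.ljust(width) + "".join(cells))
    return "\n".join(lines)


# Hypothetical counts laid out as [[TP, FN], [FP, TN]] (illustrative data only).
heatmap = render_heatmap([[40, 10], [5, 45]], ["pos", "neg"])
print(heatmap)
```

Dark cells on the diagonal indicate correct predictions; dark cells off the diagonal point to the error type that deserves attention.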

Applications

  • Medical diagnosis (e.g., predicting cancer)
  • Fraud detection
  • Spam detection
  • Sentiment analysis

