Confusion Matrix
A confusion matrix is a performance measurement tool used in machine learning, particularly for classification problems. It summarizes prediction results on a classification problem by comparing the actual labels with those predicted by the model.
What is a Confusion Matrix?
A confusion matrix is a table that describes the performance of a classification model. It shows how many instances were correctly or incorrectly predicted.
Structure of a Confusion Matrix
For a binary classification problem, the confusion matrix looks like this:
| Actual \ Predicted | Positive            | Negative            |
|--------------------|---------------------|---------------------|
| Positive           | True Positive (TP)  | False Negative (FN) |
| Negative           | False Positive (FP) | True Negative (TN)  |
Terminology
- True Positive (TP): Correctly predicted positives (e.g., predicting a patient has a disease and they actually do)
- False Positive (FP): Incorrectly predicted positives (Type I Error)
- True Negative (TN): Correctly predicted negatives
- False Negative (FN): Incorrectly predicted negatives (Type II Error)
Simple Example
Let’s say a model is predicting whether emails are spam:
- TP: Spam email correctly predicted as spam.
- FP: Normal email wrongly predicted as spam.
- TN: Normal email correctly predicted as not spam.
- FN: Spam email wrongly predicted as not spam.
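The four outcomes above can be counted directly from paired label lists. A minimal sketch in plain Python; the label convention (1 = spam, the positive class; 0 = not spam) and the sample labels are assumptions for illustration:

```python
# Count TP, FP, TN, FN for a binary spam classifier.
# Convention (assumed): 1 = spam (positive class), 0 = not spam (negative class).

def confusion_counts(actual, predicted):
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1   # spam correctly flagged as spam
        elif a == 0 and p == 1:
            fp += 1   # normal email wrongly flagged as spam
        elif a == 0 and p == 0:
            tn += 1   # normal email correctly passed through
        else:
            fn += 1   # spam that slipped through as not spam
    return tp, fp, tn, fn

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(actual, predicted))  # → (3, 1, 3, 1)
```

The same counts, laid out by row and column, are exactly the cells of the table shown earlier.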
Mathematical Metrics Using the Confusion Matrix
Using TP, TN, FP, and FN, we can calculate several key metrics:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall (Sensitivity) = TP / (TP + FN)
F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
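These metrics follow mechanically from the four counts. A minimal sketch in plain Python; the counts fed in at the end are illustrative, not from any real model:

```python
def classification_metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # of predicted positives, how many are real
    recall = tp / (tp + fn)      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=3, fp=1, tn=3, fn=1)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# → accuracy=0.75 precision=0.75 recall=0.75 f1=0.75
```

Note that precision and recall are undefined when their denominators are zero (no predicted positives, or no actual positives); production code should handle that case explicitly.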
Why Use a Confusion Matrix?
- It gives a complete picture of how a model is performing.
- It helps in understanding the types of errors made by the classifier.
- It is more informative than accuracy alone, especially when dealing with imbalanced datasets.
Visualization
You can visualize a confusion matrix as a heatmap to easily see where the model is performing well or poorly.
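In practice this is usually done with a plotting library, but the idea can be shown as a dependency-free sketch: shade each cell with a character scaled against the largest count. The counts passed in are illustrative:

```python
def text_heatmap(tp, fn, fp, tn):
    # Rows = actual class, columns = predicted class, matching the table above.
    shades = " .:-=+*#"  # later characters stand for larger counts
    peak = max(tp, fn, fp, tn)
    lines = ["Actual\\Pred  Positive  Negative"]
    for label, row in (("Positive", (tp, fn)), ("Negative", (fp, tn))):
        cells = [f"{c:>4} {shades[int(c / peak * (len(shades) - 1))]}" for c in row]
        lines.append(f"{label:<11} {cells[0]}    {cells[1]}")
    return "\n".join(lines)

print(text_heatmap(tp=90, fn=10, fp=5, tn=95))
```

A strong model shows dark cells on the diagonal (TP and TN) and light cells off it; the same visual intuition carries over to library-generated heatmaps.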
Applications
- Medical diagnosis (e.g., predicting cancer)
- Fraud detection
- Spam detection
- Sentiment analysis