Revision as of 06:42, 10 June 2025
Accuracy
Accuracy is one of the most commonly used metrics to evaluate the performance of a classification model in machine learning. It tells us the proportion of total predictions that were correct.
Definition

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Where:
- TP = True Positives
- TN = True Negatives
- FP = False Positives
- FN = False Negatives
Accuracy answers the question: "Out of all predictions made by the model, how many were actually correct?"
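The definition above translates directly into a few lines of Python; this is a minimal sketch where the four arguments are the confusion-matrix counts:

```python
def accuracy(tp, tn, fp, fn):
    """Proportion of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Counts from the spam example below: TP=60, TN=30, FP=5, FN=5
print(accuracy(60, 30, 5, 5))  # 0.9
```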
Simple Example
Let’s say a model is used to detect whether emails are spam. Out of 100 emails:
- 60 are correctly identified as spam (TP)
- 30 are correctly identified as not spam (TN)
- 5 are incorrectly marked as spam (FP)
- 5 are incorrectly marked as not spam (FN)
Then:

Accuracy = (TP + TN) / Total = (60 + 30) / 100 = 0.90, i.e., 90%.
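In practice accuracy is usually computed from label vectors rather than pre-tallied counts. The sketch below reconstructs the 100 emails of this example as lists of labels (1 = spam, 0 = not spam; the exact ordering is an illustrative assumption):

```python
# Actual labels vs. model predictions for the 100-email example
y_true = [1] * 60 + [0] * 30 + [0] * 5 + [1] * 5  # 65 spam, 35 not spam
y_pred = [1] * 60 + [0] * 30 + [1] * 5 + [0] * 5  # 5 FP, 5 FN

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(correct / len(y_true))  # 0.9
```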
When is Accuracy Useful?
Accuracy is useful when the dataset is balanced, i.e., both classes occur in roughly equal numbers.
Limitations of Accuracy
Accuracy can be misleading in cases of Imbalanced data.
Example: Fraud Detection
Imagine 1000 transactions:
- Only 10 are fraudulent.
- A model labels all as “not fraud” and gets 990 correct.
Even with 99% accuracy, the model is useless because it failed to detect any fraud.
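The fraud scenario can be checked numerically: a model that always predicts "not fraud" scores 99% accuracy while its recall (fraction of actual fraud it catches) is zero. A minimal sketch:

```python
# 10 fraudulent (1) transactions out of 1000
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # model predicts "not fraud" for everything

acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
rec = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / sum(y_true)
print(acc, rec)  # 0.99 0.0
```

High accuracy here reflects only the class imbalance, which is why recall is the more informative metric for rare-event detection.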
Related Metrics
- Precision – Focuses on correct positive predictions
- Recall – Focuses on correctly identifying actual positives
- F1 Score – Harmonic mean of Precision and Recall
- Confusion Matrix – Underlying table for all classification metrics
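The related metrics listed above can all be derived from the same confusion-matrix counts. A sketch of their standard formulas, reusing the spam counts from the earlier example:

```python
def precision(tp, fp):
    # Of everything predicted positive, how much was actually positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything actually positive, how much did we catch?
    return tp / (tp + fn)

def f1(tp, fp, fn):
    # Harmonic mean of precision and recall
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Spam example counts: TP=60, FP=5, FN=5
print(precision(60, 5))  # ~0.923
print(recall(60, 5))     # ~0.923
print(f1(60, 5, 5))      # ~0.923
```

Because FP and FN happen to be equal in this example, precision, recall, and F1 all coincide; with a skewed error profile they diverge, which is exactly what makes them more informative than accuracy alone.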
Real-World Applications
- Image classification (e.g., cat vs dog detection)
- Email spam filters
- Sentiment analysis (positive vs negative review)