Revision as of 06:42, 10 June 2025
Accuracy
Accuracy is one of the most commonly used metrics to evaluate the performance of a classification model in machine learning. It tells us the proportion of total predictions that were correct.
Definition

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Where:
- TP = True Positives
- TN = True Negatives
- FP = False Positives
- FN = False Negatives
Accuracy answers the question: "Out of all predictions made by the model, how many were actually correct?"
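The definition above translates directly into a few lines of Python; this is a minimal sketch where the four arguments are the confusion-matrix counts:

```python
def accuracy(tp, tn, fp, fn):
    """Proportion of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Counts from the spam example below: TP=60, TN=30, FP=5, FN=5
print(accuracy(60, 30, 5, 5))  # 0.9
```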
Simple Example
Let’s say a model is used to detect whether emails are spam. Out of 100 emails:
- 60 are correctly identified as spam (TP)
- 30 are correctly identified as not spam (TN)
- 5 are incorrectly marked as spam (FP)
- 5 are incorrectly marked as not spam (FN)
Then:

Accuracy = (TP + TN) / Total = (60 + 30) / 100 = 0.90, i.e., 90%.
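In practice accuracy is usually computed from label vectors rather than pre-tallied counts. The sketch below reconstructs the 100 emails of this example as lists of labels (1 = spam, 0 = not spam; the exact ordering is an illustrative assumption):

```python
# Actual labels vs. model predictions for the 100-email example
y_true = [1] * 60 + [0] * 30 + [0] * 5 + [1] * 5  # 65 spam, 35 not spam
y_pred = [1] * 60 + [0] * 30 + [1] * 5 + [0] * 5  # 5 FP, 5 FN

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(correct / len(y_true))  # 0.9
```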
When is Accuracy Useful?
Accuracy is useful when the dataset is balanced, i.e., both classes occur in roughly equal numbers.
Limitations of Accuracy
Accuracy can be misleading in cases of Imbalanced data.
Example: Fraud Detection
Imagine 1000 transactions:
- Only 10 are fraudulent.
- A model labels all as “not fraud” and gets 990 correct.
Even with 99% accuracy, the model is useless because it failed to detect any fraud.
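The fraud scenario can be checked numerically: a model that always predicts "not fraud" scores 99% accuracy while its recall (fraction of actual fraud it catches) is zero. A minimal sketch:

```python
# 10 fraudulent (1) transactions out of 1000
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # model predicts "not fraud" for everything

acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
rec = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / sum(y_true)
print(acc, rec)  # 0.99 0.0
```

High accuracy here reflects only the class imbalance, which is why recall is the more informative metric for rare-event detection.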
Related Metrics
- Precision – Focuses on correct positive predictions
- Recall – Focuses on correctly identifying actual positives
- F1 Score – Harmonic mean of Precision and Recall
- Confusion Matrix – Underlying table for all classification metrics
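The related metrics listed above can all be derived from the same confusion-matrix counts. A sketch of their standard formulas, reusing the spam counts from the earlier example:

```python
def precision(tp, fp):
    # Of everything predicted positive, how much was actually positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything actually positive, how much did we catch?
    return tp / (tp + fn)

def f1(tp, fp, fn):
    # Harmonic mean of precision and recall
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Spam example counts: TP=60, FP=5, FN=5
print(precision(60, 5))  # ~0.923
print(recall(60, 5))     # ~0.923
print(f1(60, 5, 5))      # ~0.923
```

Because FP and FN happen to be equal in this example, precision, recall, and F1 all coincide; with a skewed error profile they diverge, which is exactly what makes them more informative than accuracy alone.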
Real-World Applications
- Image classification (e.g., cat vs dog detection)
- Email spam filters
- Sentiment analysis (positive vs negative review)