Weighted F1 Score
The Weighted F1 Score is a metric used in multi-class classification to evaluate model performance by computing the F1 Score for each class and taking the average, weighted by the number of true instances for each class (i.e., the class "support").
It is especially useful when working with imbalanced datasets, where some classes are more frequent than others.
Definition

$$
\text{Weighted F1} = \frac{1}{\sum_{i=1}^{N} s_i} \sum_{i=1}^{N} s_i \cdot F1_i
$$

Where:
- $N$ = number of classes
- $F1_i$ = F1 Score for class $i$
- $s_i$ = support (number of true instances) of class $i$
Key Features
- Classes with more data have more influence on the final score.
- Helps prevent small classes from skewing the result disproportionately.
- Available in most ML libraries; in Scikit-learn (Python), it is selected by passing `average='weighted'` to metrics such as `f1_score` (the default there is `'binary'`, not weighted).
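A minimal sketch of computing the metric with scikit-learn's `f1_score`; the label arrays below are illustrative, not from the article:

```python
# Weighted F1 with scikit-learn: per-class F1, averaged by class support.
from sklearn.metrics import f1_score

# Toy multi-class labels (3 classes, imbalanced supports).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]

# average='weighted' weights each class's F1 by its true-instance count.
weighted = f1_score(y_true, y_pred, average="weighted")
print(round(weighted, 3))
```

Here class 0 (support 3) and class 1 (support 2) each score F1 = 0.8 and class 2 (support 1) scores 1.0, so the weighted average is (3·0.8 + 2·0.8 + 1·1.0) / 6 ≈ 0.833.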
Simple Example
Suppose a dataset has three classes with these F1 Scores and supports:
- F1(Class A) = 0.90, Support = 50
- F1(Class B) = 0.70, Support = 30
- F1(Class C) = 0.50, Support = 20
First calculate total support:

Total support = 50 + 30 + 20 = 100

Now calculate the weighted F1:

Weighted F1 = (0.90 × 50 + 0.70 × 30 + 0.50 × 20) / 100 = (45 + 21 + 10) / 100 = 0.76
So the Weighted F1 Score is **0.76**, favoring the majority class's performance.
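The same calculation can be checked in a few lines of plain Python (no libraries needed), using the F1 scores and supports from the example above:

```python
# Weighted F1 = sum(F1_i * support_i) / total support.
f1_scores = {"A": 0.90, "B": 0.70, "C": 0.50}
supports = {"A": 50, "B": 30, "C": 20}

total_support = sum(supports.values())  # 50 + 30 + 20 = 100
weighted_f1 = sum(f1_scores[c] * supports[c] for c in f1_scores) / total_support
print(round(weighted_f1, 2))  # 0.76
```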
Weighted vs Macro vs Micro F1
| Metric | Weighting | Best For |
|---|---|---|
| Macro F1 | Equal weight for all classes | Equal treatment for each class |
| Micro F1 | Global average over all TP, FP, FN | Imbalanced data, overall performance |
| Weighted F1 | Weighted by class support | Imbalanced datasets, with performance emphasis on larger classes |
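The three averaging modes can diverge sharply on imbalanced data. A hypothetical sketch: a classifier that always predicts the majority class on an 8-vs-2 split scores well on micro and weighted F1 but poorly on macro F1:

```python
# Macro vs micro vs weighted F1 on an imbalanced toy problem.
from sklearn.metrics import f1_score

y_true = [0] * 8 + [1] * 2   # class 0 dominates (support 8 vs 2)
y_pred = [0] * 10            # model never predicts the minority class

# zero_division=0 silences the warning for the minority class's undefined F1.
for avg in ("macro", "micro", "weighted"):
    print(avg, round(f1_score(y_true, y_pred, average=avg, zero_division=0), 3))
```

Class 0's F1 is 8/9 ≈ 0.889 and class 1's is 0, so macro averages them to ≈ 0.444, while weighted (≈ 0.711) and micro (0.8) are pulled up by the dominant class.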
Use Cases
- **Text classification** (e.g., news topics, sentiment analysis)
- **Image classification** where some labels are rare
- **Healthcare diagnosis** with rare but critical outcomes
- **Customer segmentation** with uneven population groups
Limitations
- Might mask poor performance on minority classes if the model performs well on dominant ones.
- If class fairness is a concern, Macro F1 Score might be more appropriate.