The Dataset Challenge
Training an accurate object detection model requires high-quality, consistently labeled data. My journey began with four datasets from different sources, each with its own labeling conventions.
| Dataset | Train | Validation | Test | Label Format |
|---|---|---|---|---|
| V1 | 6,075 | 1,737 | 868 | Bee=0, Varroa=1 ✓ |
| V2 | 8,217 | 1,867 | 3,408 | Bee=0, Varroa=1 ✓ |
| V3 | 8,093 | 1,175 | 468 | Bee=0, Varroa=1 ✓ |
| V4 | 5,144 | 0 | 0 | Varroa=0, Bee=1 ✗ |
The Label Conflict
Dataset V4 used the opposite labeling convention. It labeled varroa as class 0 and bees as class 1, while all other datasets and my standard convention used class 0 for bees and class 1 for varroa. This required a label conversion step before merging.
[Diagram: Dataset Merging Flow]
The Label Conversion Bug
My first attempt to fix the V4 labels had a critical bug:
```python
mapping = {'0': '1', '3': '0'}  # Wrong approach!

for line in lines:
    parts = line.strip().split()
    old_id = parts[0]
    if old_id in mapping:
        parts[0] = mapping[old_id]  # Bug: the '3' key never matches, so class 1 is never swapped back
```

What Went Wrong
Before: 5,851 bees (class 0) + 5,235 varroa (class 1)
After bug: 0 bees + 11,086 varroa — everything merged into class 1!
The mapping converted every class 0 label to class 1, but its second key ('3') never matched a real class ID, so nothing was ever converted back the other direction. Both classes collapsed into class 1, and the bee labels were completely lost.
The Fix: Atomic Swap
```python
for line in lines:
    parts = line.strip().split()
    if not parts:
        continue
    old_id = parts[0]
    if old_id == '0':
        parts[0] = '1'  # varroa (was 0) → 1
        modified = True
    elif old_id == '1':
        parts[0] = '0'  # bee (was 1) → 0
        modified = True
```

Result
All 5,144 V4 files correctly converted with bee=0, varroa=1. Both classes preserved.
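A quick way to verify a conversion like this is to count class IDs across all label files before and after the swap. A minimal stdlib-only sketch (`labels_dir` is a placeholder path, not a path from the project):

```python
from collections import Counter
from pathlib import Path

def count_classes(labels_dir):
    """Count how often each class ID appears across YOLO .txt label files."""
    counts = Counter()
    for label_file in Path(labels_dir).glob('*.txt'):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if parts:
                counts[parts[0]] += 1
    return counts

# After a correct V4 swap, both classes should still be present and the
# per-class totals should simply be exchanged, never merged into one class.
```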
Class Distribution Analysis
| Split | Bees (Class 0) | Varroa (Class 1) | Ratio |
|---|---|---|---|
| Training | 55,287 | 13,397 | 4.13:1 |
| Validation | 14,343 | 4,010 | 3.58:1 |
| Total | 69,630 | 17,407 | 4.00:1 |
Analysis
The 4:1 bee-to-varroa ratio is a significant but realistic class imbalance — in real hive frames, bees genuinely outnumber varroa mites. The strategy was to start with default training and apply class weighting only if varroa detection performance proved insufficient.
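If class weighting does become necessary, inverse-frequency weights can be derived from the training counts above. A sketch; normalizing so the rarest class gets weight 1.0 is my own convention:

```python
def inverse_frequency_weights(counts):
    """Weight each class by rarest_count / class_count, so the rarest
    class gets 1.0 and majority classes are down-weighted."""
    rarest = min(counts.values())
    return {cls: rarest / n for cls, n in counts.items()}

# Training-split counts from the table above.
weights = inverse_frequency_weights({'bee': 55_287, 'varroa': 13_397})
# varroa -> 1.0, bee -> ~0.24
```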
[Chart: Class Distribution]
Image Size & IMGSZ Analysis
Understanding native image resolution is critical for choosing the right imgsz parameter. Upscaling low-resolution images creates artificial detail and slows training without improving accuracy.
[Chart: Short Side Distribution (px)]
[Chart: Orientation Distribution]
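Once (width, height) pairs have been collected (with Pillow or any other image reader), the short-side distribution is a simple tally. A stdlib-only sketch; the `sizes` list is illustrative, not the real dataset:

```python
from collections import Counter

def short_side_histogram(sizes):
    """Fraction of images at each short side (min of width, height)."""
    hist = Counter(min(w, h) for w, h in sizes)
    total = sum(hist.values())
    return {side: count / total for side, count in sorted(hist.items())}

# Illustrative (width, height) pairs only -- not the real dataset.
sizes = [(213, 160), (160, 160), (160, 213), (640, 480)]
print(short_side_histogram(sizes))  # {160: 0.75, 480: 0.25}
```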
Key Finding: 75% of Images are 160px
Using common YOLO defaults like imgsz=640 would upscale most images by 4×, creating interpolation artifacts and teaching the model from synthetic detail rather than real image content.
Upscaling Impact by imgsz Setting
| imgsz | Upscaled Images | Upscale Factor | Assessment |
|---|---|---|---|
| 160 | 0% | 1.0× | No upscaling |
| 320 | 79.8% | 2.0× | Manageable |
| 512 | 79.8% | 3.2× | Significant upscaling |
| 640 | 97.9% | 4.0× | Excessive upscaling |
Decision: imgsz=320
- Varroa median box size of 47px, well above the 16px detection threshold
- Faster training and inference vs. larger sizes
- Note: 79.8% of images will still be upscaled 2×, so source image quality matters
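The upscaling table can be computed directly from the short-side distribution. A sketch, where `sides` is an illustrative sample and the modal native size anchors the upscale factor:

```python
from statistics import mode

def upscaling_report(short_sides, candidate_sizes):
    """For each candidate imgsz, report the fraction of images that would
    be upscaled and the upscale factor for the modal native short side."""
    common = mode(short_sides)
    return {
        imgsz: (sum(s < imgsz for s in short_sides) / len(short_sides), imgsz / common)
        for imgsz in candidate_sizes
    }

# Illustrative short sides only (modal size 160, as in the dataset).
sides = [160, 160, 160, 320, 640]
print(upscaling_report(sides, [160, 320, 640]))
# {160: (0.0, 1.0), 320: (0.6, 2.0), 640: (0.8, 4.0)}
```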
Varroa Bounding Box Analysis
Small objects (<16px) are notoriously difficult for YOLO to detect. This analysis determines how large varroa bounding boxes appear at different imgsz settings.
| imgsz | Median Box Size | 10th Percentile | Boxes <16px | Assessment |
|---|---|---|---|---|
| 160 | 23.5px | 15.0px | 13.3% | Too many tiny boxes |
| 320 | 46.9px | 30.0px | 0.7% | Optimal balance |
| 512 | 75.1px | 48.0px | 0.3% | Good but more upscaling |
| 640 | 93.9px | 60.0px | 0.1% | Excessive upscaling |
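Each row of the table above is a linear rescale of the native boxes: box_px × (imgsz / native_short_side). A sketch with illustrative numbers, not the real annotations:

```python
from statistics import median

def box_stats_at_imgsz(native_boxes, imgsz):
    """Scale (box_px, short_side_px) pairs to a target imgsz and report
    the median scaled box size plus the fraction still under 16 px."""
    scaled = [box * imgsz / side for box, side in native_boxes]
    tiny = sum(b < 16 for b in scaled) / len(scaled)
    return median(scaled), tiny

# Illustrative native box sizes (px) paired with their image short sides (px).
boxes = [(12, 160), (24, 160), (30, 160), (20, 160)]
print(box_stats_at_imgsz(boxes, 160))  # (22.0, 0.25)
print(box_stats_at_imgsz(boxes, 320))  # (44.0, 0.0)
```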
[Diagram: imgsz Decision Tree]
Model & Training Setup
Model Selection: YOLOv8s
| Factor | YOLOv8s | YOLOv11s |
|---|---|---|
| Maturity | Battle-tested | Relatively new |
| Documentation | Extensive | Growing |
| Small Object Detection | Proven excellent | Similar/slightly better |
| Inference Speed | Fast | 10–15% faster |
| mAP Performance | Excellent | 1–2% better |
| Best For | Production reliability | Research |
Decision: YOLOv8s
For a production varroa detection system, stability and proven performance outweigh marginal speed improvements. YOLOv8s handles 30–90px objects excellently and has a mature ecosystem.
Baseline-First Training Strategy
The flow: train a baseline (YOLOv8s, imgsz=320, batch=16), then evaluate overall mAP50, varroa recall, and bee precision:

- mAP50 below 0.7 → adjust learning rate / increase epochs
- Varroa recall below 0.7 → add class weights (Varroa=1.0, Bee=0.24)
- Bee false positives → try imgsz=512 for more detail
- All targets met → success: deploy and monitor
Baseline Configuration
```python
from ultralytics import YOLO

model = YOLO('yolov8s.pt')
results = model.train(
    data='varroa.yaml',
    epochs=100,
    imgsz=320,
    batch=16,
    patience=20,
    project='varroa_detection',
    name='baseline_v8s_imgsz320'
)
```

Why Start Simple?
- Establishes a performance baseline without confounding variables
- Reveals whether the data quality itself is sufficient
- Makes it clear what each subsequent optimization actually contributes
- Prevents premature tuning of the wrong metrics
Key Takeaways
Critical Success Factors
- Label standardization is essential: Even simple mapping errors can destroy entire datasets
- Analyze before training: Understanding image sizes prevents wasted compute
- Match imgsz to data reality: Don't blindly use defaults — 640px isn't always optimal
- Small object detection has thresholds: Keep bounding boxes above 16px when possible
- Baseline first, optimize later: Systematic experimentation beats premature tuning
Common Pitfalls to Avoid
- Sequential label mapping without collision checking
- Assuming all datasets follow the same convention
- Using imgsz=640 without analyzing native resolutions
- Ignoring class imbalance until after training fails
- Not validating label conversions with sample visualizations
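For that last point, spot-checking converted labels only requires turning normalized YOLO boxes back into pixel coordinates before drawing them with any image library. A sketch of the coordinate math:

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one 'cls cx cy w h' YOLO label line (normalized 0-1)
    to (cls, x1, y1, x2, y2) pixel corners for drawing."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    return (cls,
            (cx - w / 2) * img_w, (cy - h / 2) * img_h,
            (cx + w / 2) * img_w, (cy + h / 2) * img_h)

# A centered box spanning half of a 160x160 image:
print(yolo_to_pixels('1 0.5 0.5 0.5 0.5', 160, 160))
# ('1', 40.0, 40.0, 120.0, 120.0)
```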