title: Plant Disease Assistant
emoji: 🌱
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Diagnose 22 crop diseases from a leaf. DINOv2-L on MI300X.
tags:
- plant-disease
- agriculture
- dinov2
- amd
- mi300x
- rocm
- image-classification
🌱 Plant Disease Assistant
Snap a leaf, name the disease, get a fix.
Upload a photo of a plant leaf and this Space will classify it into one of 22 classes (18 diseases and pests plus a healthy class for each of four crops), report the model's confidence, and return a structured treatment and prevention guide.
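As a rough illustration of the confidence step, the sketch below turns raw class scores into probabilities and maps the top probability to a coarse rating. The function names, the example scores, and the 0.85 / 0.60 thresholds are assumptions made for this example; they are not taken from the Space's `app.py`.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rate_confidence(prob, high=0.85, medium=0.60):
    """Map a top-1 probability to a coarse label (thresholds are illustrative)."""
    if prob >= high:
        return "high"
    if prob >= medium:
        return "medium"
    return "low"

def top_prediction(logits, labels):
    """Return the best label, its probability, and a confidence rating."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best], rate_confidence(probs[best])

# Hypothetical scores for three of the 22 classes.
labels = ["cassava mosaic", "cassava brown spot", "cassava healthy"]
label, prob, rating = top_prediction([4.0, 1.0, 0.5], labels)
print(label, round(prob, 3), rating)
```

A low rating is a reasonable cue to retake the photo rather than act on the suggested treatment.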
What's under the hood
| Component | Details |
|---|---|
| Classifier | DINOv2-Large (304M params, ViT-L/14), fine-tuned with a linear head |
| Accuracy | 97.06% top-1, 0.9713 macro F1 on the held-out test split |
| Dataset | CCMT Crop Pest and Disease Detection (cashew, cassava, maize, tomato) |
| Hardware | Fine-tuned on a single AMD Instinct MI300X (192 GB HBM3) via the AMD Developer Cloud |
| Framework | PyTorch 2.x + ROCm, timm |
| Knowledge base | Hand-curated treatment, prevention, and severity notes per class |
The classifier runs on CPU here for accessibility, so inference takes a few seconds per image. The original training run used ROCm on MI300X.
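The accuracy row in the table reports top-1 accuracy and macro F1. For readers unfamiliar with the second metric, here is a minimal sketch of how both can be computed from true and predicted labels. The tiny label lists are made up for illustration and have nothing to do with the actual test split.

```python
from collections import Counter

def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Illustrative labels only.
y_true = ["mosaic", "mosaic", "healthy", "healthy", "brown spot", "brown spot"]
y_pred = ["mosaic", "mosaic", "healthy", "mosaic", "brown spot", "healthy"]
print(f"top-1: {top1_accuracy(y_true, y_pred):.4f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 averages over classes rather than samples, it can sit below or above accuracy depending on how errors are spread across the classes.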
Crops & diseases covered
- Cashew: anthracnose, gummosis, leaf miner, red rust, healthy
- Cassava: bacterial blight, brown spot, green mite, mosaic, healthy
- Maize: fall armyworm, grasshopper, leaf beetle, leaf blight, leaf spot, streak virus, healthy
- Tomato: leaf blight, leaf curl, Septoria leaf spot, Verticillium wilt, healthy
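The list above can be written down as a small data structure, which also makes the class count easy to check. This is a sketch only: the `crop/condition` label scheme and the shape of the knowledge-base entries are assumptions, and the spellings here may differ from the dataset's own folder names.

```python
# Crops and conditions as listed above.
CLASSES_BY_CROP = {
    "cashew": ["anthracnose", "gummosis", "leaf miner", "red rust", "healthy"],
    "cassava": ["bacterial blight", "brown spot", "green mite", "mosaic", "healthy"],
    "maize": ["fall armyworm", "grasshopper", "leaf beetle", "leaf blight",
              "leaf spot", "streak virus", "healthy"],
    "tomato": ["leaf blight", "leaf curl", "septoria leaf spot",
               "verticillium wilt", "healthy"],
}

# Flatten to "crop/condition" labels (hypothetical naming scheme).
LABELS = [f"{crop}/{cond}"
          for crop, conds in CLASSES_BY_CROP.items()
          for cond in conds]

def empty_guide(label):
    """Placeholder entry with the treatment, prevention, and severity fields."""
    return {"label": label, "treatment": [], "prevention": [], "severity": None}

# One entry per class, to be filled in by hand.
KNOWLEDGE_BASE = {label: empty_guide(label) for label in LABELS}

non_healthy = [label for label in LABELS if not label.endswith("/healthy")]
print(len(LABELS))       # 22 classes in total
print(len(non_healthy))  # 18 diseases and pests
```

Keying the knowledge base by the same labels the classifier emits keeps the lookup to a single dictionary access.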
How it was built
This is Track 2 of a multi-track entry in the lablab.ai AMD Developer Hackathon:
- Track 2: Fine-tune DINOv2-L on CCMT for plant disease classification (this Space)
- Track 3: Fine-tune Llama 3.2 11B Vision (LoRA) on the same data for conversational diagnosis (adapter on HF)
- Build in Public: Documented the journey end-to-end on social
Both models were trained on the same MI300X droplet, which shows that a single AMD GPU can handle a 304M-parameter classifier and an 11B-parameter vision-language model in the same workflow.
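A back-of-the-envelope check makes the headroom concrete. The sketch below estimates memory for the weights alone, assuming 16-bit parameters and the nominal parameter counts quoted above. Training also needs room for gradients, optimizer state, and activations, so real usage is higher than these figures.

```python
GB = 1e9  # decimal gigabytes, matching the 192 GB HBM3 figure

def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory for the weights alone, assuming 16-bit (2-byte) parameters."""
    return num_params * bytes_per_param / GB

# Nominal parameter counts from the text above.
models = {"DINOv2-L classifier": 304e6, "Llama 3.2 11B Vision": 11e9}
hbm_gb = 192

for name, num_params in models.items():
    gb = weight_memory_gb(num_params)
    share = 100 * gb / hbm_gb
    print(f"{name}: {gb:.2f} GB of weights ({share:.1f}% of HBM)")
```

Under these assumptions the weights of both models together take well under a fifth of the card's memory.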
Limitations
- Trained on a single dataset (CCMT); performance on field photos with very different lighting, angles, or unseen crops will degrade.
- The treatment guidance is informational only and not a substitute for advice from a qualified agronomist or extension officer.
- CPU inference is slow (roughly 5 to 10 s per image), a deliberate trade-off for running on free CPU hardware. The original GPU pipeline runs in milliseconds.
License
Apache 2.0. Model weights and code are open. CCMT dataset licensing applies to the training data only.
Acknowledgements
- AMD for the Developer Cloud credits and MI300X access
- Meta for DINOv2
- The CCMT dataset authors
- lablab.ai for organizing the hackathon