---
title: Plant Disease Assistant
emoji: 🌱
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Diagnose 22 crop diseases from a leaf. DINOv2-L on MI300X.
tags:
- plant-disease
- agriculture
- dinov2
- amd
- mi300x
- rocm
- image-classification
---
# 🌱 Plant Disease Assistant
Snap a leaf, name the disease, get a fix.
Upload a photo of a plant leaf and this Space will identify which of **22 classes** (diseases, pest damage, and healthy leaves across four crops) it matches, report the model's confidence, and return a structured treatment and prevention guide.
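The flow described above (predict a class, report a confidence, look up a guide) could be sketched roughly as follows. This is a minimal illustration, not the Space's actual `app.py`: it assumes the confidence is the softmax probability of the top class, and the `KNOWLEDGE_BASE` schema, the `diagnose` helper, and the label strings are all hypothetical placeholders.

```python
import math

# Hypothetical knowledge-base entry; the real schema and wording are not shown in this README.
KNOWLEDGE_BASE = {
    "tomato_septoria_leaf_spot": {
        "severity": "placeholder severity note",
        "treatment": "placeholder treatment note",
        "prevention": "placeholder prevention note",
    },
}


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagnose(logits, class_names):
    """Return the top class, its probability, and any matching guide entry."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = class_names[best]
    return {
        "label": label,
        "confidence": probs[best],
        "guide": KNOWLEDGE_BASE.get(label),  # None when no entry exists
    }


# Made-up logits for three of the classes, purely for illustration.
result = diagnose(
    [0.2, 3.1, -1.0],
    ["tomato_healthy", "tomato_septoria_leaf_spot", "tomato_leaf_curl"],
)
print(result["label"], round(result["confidence"], 3))
```

Keeping the guide lookup separate from the classifier means the treatment text can be edited without retraining anything.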
## What's under the hood
| Component | Details |
|---|---|
| **Classifier** | DINOv2-Large (304M params, ViT-L/14), fine-tuned with a linear head |
| **Accuracy** | 97.06% top-1, 0.9713 macro F1 on the held-out test split |
| **Dataset** | [CCMT Crop Pest and Disease Detection](https://www.kaggle.com/datasets): cashew, cassava, maize, tomato |
| **Hardware** | Fine-tuned on a single AMD Instinct **MI300X** (192 GB HBM3) via the AMD Developer Cloud |
| **Framework** | PyTorch 2.x + ROCm, [timm](https://github.com/huggingface/pytorch-image-models) |
| **Knowledge base** | Hand-curated treatment, prevention, and severity notes per class |
The classifier runs on CPU here for accessibility, so inference takes a few seconds per image. The original training run used ROCm on MI300X.
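For readers unfamiliar with the two metrics in the Accuracy row, here is a small pure-Python sketch of how top-1 accuracy and macro F1 are typically computed. The labels below are toy data chosen for illustration, not the CCMT test split, and the helper names are hypothetical; the evaluation code actually used for the reported numbers is not shown in this README.

```python
def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels: one "healthy" leaf is misclassified as "mosaic".
y_true = ["healthy", "healthy", "mosaic", "mosaic", "brown spot", "brown spot"]
y_pred = ["healthy", "mosaic", "mosaic", "mosaic", "brown spot", "brown spot"]

acc = top1_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"top-1 accuracy: {acc:.4f}, macro F1: {f1:.4f}")
```

Macro F1 averages over classes rather than samples, which is why it can differ from accuracy when some classes are harder than others.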
## Crops & diseases covered
- **Cashew**: anthracnose, gummosis, leaf miner, red rust, healthy
- **Cassava**: bacterial blight, brown spot, green mite, mosaic, healthy
- **Maize**: fall armyworm, grasshopper, leaf beetle, leaf blight, leaf spot, streak virus, healthy
- **Tomato**: leaf blight, leaf curl, septoria leaf spot, verticillium wilt, healthy
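The list above can be written as a label map, which also makes it easy to check that the per-crop classes add up to 22. This is only a sketch: the `crop/name` label format and the index order are assumptions for illustration, and the ordering used by the trained classifier head may differ.

```python
# Classes per crop, transcribed from the list above.
CLASSES_BY_CROP = {
    "cashew": ["anthracnose", "gummosis", "leaf miner", "red rust", "healthy"],
    "cassava": ["bacterial blight", "brown spot", "green mite", "mosaic", "healthy"],
    "maize": [
        "fall armyworm", "grasshopper", "leaf beetle", "leaf blight",
        "leaf spot", "streak virus", "healthy",
    ],
    "tomato": ["leaf blight", "leaf curl", "septoria leaf spot", "verticillium wilt", "healthy"],
}

# Flatten into unique "crop/name" labels (hypothetical naming scheme).
LABELS = [
    f"{crop}/{name}"
    for crop, names in CLASSES_BY_CROP.items()
    for name in names
]
LABEL_TO_INDEX = {label: index for index, label in enumerate(LABELS)}

print(len(LABELS))  # total number of classes across all four crops
```

Prefixing each class with its crop keeps names such as `healthy` and `leaf blight` distinct, since they appear under more than one crop.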
## How it was built
This is **Track 2** of a multi-track entry in the lablab.ai AMD Developer Hackathon:
- **Track 2**: Fine-tune DINOv2-L on CCMT for plant disease classification (this Space)
- **Track 3**: Fine-tune Llama 3.2 11B Vision (LoRA) on the same data for conversational diagnosis ([adapter on HF](https://huggingface.co/iamcode6/llama32-vision-ccmt-mi300x))
- **Build in Public**: Documented the journey end-to-end on social media
Both models were trained on the same MI300X droplet, showing that a single AMD GPU can handle a 304M-parameter classifier and an 11B-parameter vision-language model in one workflow.
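A back-of-the-envelope calculation gives a feel for why both models fit on one card. The sketch below counts weight memory only and assumes 16-bit (2-byte) parameters, which this README does not state; gradients, optimizer state, LoRA adapters, and activations add to the real footprint, so treat the result as a lower bound rather than a measurement.

```python
HBM_GB = 192               # MI300X memory capacity quoted in this README
BYTES_PER_PARAM_16BIT = 2  # assumption: weights stored in bf16/fp16


def weight_gb(num_params, bytes_per_param=BYTES_PER_PARAM_16BIT):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


classifier_gb = weight_gb(304e6)  # DINOv2-L, 304M parameters
vlm_gb = weight_gb(11e9)          # Llama 3.2 11B Vision, nominal 11B parameters
total_gb = classifier_gb + vlm_gb

print(f"classifier weights: {classifier_gb:.2f} GB")
print(f"VLM weights:        {vlm_gb:.2f} GB")
print(f"share of HBM:       {total_gb / HBM_GB:.1%}")
```

Under these assumptions the weights of both models together take a small fraction of the 192 GB, leaving the rest for training state and activations.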
## Limitations
- Trained on a single dataset (CCMT): performance on field photos with very different lighting, angles, or unseen crops will degrade.
- The treatment guidance is informational only and **not a substitute for advice from a qualified agronomist or extension officer**.
- CPU inference is slow (~5–10 s/image), a deliberate trade-off for accessibility. The original GPU pipeline runs in milliseconds.
## License
Apache 2.0. Model weights and code are open. CCMT dataset licensing applies to the training data only.
## Acknowledgements
- AMD for the Developer Cloud credits and MI300X access
- Meta for [DINOv2](https://github.com/facebookresearch/dinov2)
- The CCMT dataset authors
- lablab.ai for organizing the hackathon