title: Plant Disease Assistant
emoji: 🌱
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Diagnose 22 crop diseases from a leaf. DINOv2-L on MI300X.
tags:
  - plant-disease
  - agriculture
  - dinov2
  - amd
  - mi300x
  - rocm
  - image-classification

🌱 Plant Disease Assistant

Snap a leaf, name the disease, get a fix.

Upload a photo of a plant leaf and this Space will classify it into one of 22 classes (crop diseases, pest damage, or a healthy leaf), report the model's confidence, and return a structured treatment and prevention guide.
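
The flow described above (classify, report a confidence, look up guidance) can be sketched roughly as follows. This is a minimal illustration, not the Space's actual `app.py`: the function names, the `crop/condition` label format, and the placeholder knowledge-base entry are all assumptions, and the real hand-curated guidance is not reproduced here.

```python
import math

# Hypothetical stand-in for the knowledge base; the strings are
# placeholders, not real agronomic guidance.
KNOWLEDGE_BASE = {
    "tomato/leaf curl": {
        "severity": "<severity note>",
        "treatment": "<treatment steps>",
        "prevention": "<prevention steps>",
    },
}


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagnose(logits, class_names):
    """Return the top class, its probability, and any matching guide."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = class_names[best]
    return {
        "label": label,
        "confidence": probs[best],
        "guide": KNOWLEDGE_BASE.get(label),  # None if no entry exists
    }


# Made-up logits for three classes, purely for illustration.
result = diagnose(
    [0.2, 3.1, -1.0],
    ["tomato/healthy", "tomato/leaf curl", "tomato/leaf blight"],
)
print(result["label"], round(result["confidence"], 3))
```

In this sketch the reported confidence is simply the softmax probability of the top class, which is one common choice; the Space may calibrate or band its confidence differently.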

What's under the hood

| Component | Details |
| --- | --- |
| Classifier | DINOv2-Large (304M params, ViT-L/14), fine-tuned with a linear head |
| Accuracy | 97.06% top-1, 0.9713 macro F1 on the held-out test split |
| Dataset | CCMT Crop Pest and Disease Detection (cashew, cassava, maize, tomato) |
| Hardware | Fine-tuned on a single AMD Instinct MI300X (192 GB HBM3) via the AMD Developer Cloud |
| Framework | PyTorch 2.x + ROCm, timm |
| Knowledge base | Hand-curated treatment, prevention, and severity notes per class |
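
For readers unfamiliar with the two metrics reported above, here is a small sketch of how top-1 accuracy and macro F1 are typically computed. It is a plain-Python illustration on made-up labels, not the evaluation script used for this model, and the function names are hypothetical.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, purely illustrative.
y_true = ["healthy", "healthy", "mosaic", "mosaic", "brown spot", "brown spot"]
y_pred = ["healthy", "healthy", "mosaic", "brown spot", "brown spot", "brown spot"]
print(top1_accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro F1 averages over classes rather than samples, it can diverge from accuracy when classes are imbalanced; the two figures being close here suggests errors are spread fairly evenly across classes.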

The classifier runs on CPU here for accessibility; inference takes a few seconds per image. The original training run used ROCm on MI300X.

Crops & diseases covered

  • Cashew: anthracnose, gummosis, leaf miner, red rust, healthy
  • Cassava: bacterial blight, brown spot, green mite, mosaic, healthy
  • Maize: fall armyworm, grasshopper, leaf beetle, leaf blight, leaf spot, streak virus, healthy
  • Tomato: leaf blight, leaf curl, septoria leaf spot, verticillium wilt, healthy
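
The list above can be written down as a label map, which also makes the class count easy to check. This is a sketch only: the `crop/condition` naming and the ordering are assumptions, and the label strings in the actual checkpoint or dataset folders may be spelled differently.

```python
# Conditions per crop, taken from the list above.
CLASSES_BY_CROP = {
    "cashew": ["anthracnose", "gummosis", "leaf miner", "red rust", "healthy"],
    "cassava": ["bacterial blight", "brown spot", "green mite", "mosaic", "healthy"],
    "maize": [
        "fall armyworm", "grasshopper", "leaf beetle", "leaf blight",
        "leaf spot", "streak virus", "healthy",
    ],
    "tomato": [
        "leaf blight", "leaf curl", "septoria leaf spot",
        "verticillium wilt", "healthy",
    ],
}

# Flatten into "crop/condition" labels (hypothetical format).
CLASS_NAMES = [
    f"{crop}/{condition}"
    for crop, conditions in CLASSES_BY_CROP.items()
    for condition in conditions
]

healthy = [name for name in CLASS_NAMES if name.endswith("/healthy")]
print(len(CLASS_NAMES), "classes,", len(healthy), "of them healthy")
```

Note that the 22 classes include one healthy class per crop, so 18 of them describe a disease or pest.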

How it was built

This is Track 2 of a multi-track entry in the lablab.ai AMD Developer Hackathon:

  • Track 2: Fine-tune DINOv2-L on CCMT for plant disease classification (this Space)
  • Track 3: Fine-tune Llama 3.2 11B Vision (LoRA) on the same data for conversational diagnosis (adapter on HF)
  • Build in Public: Documented the journey end-to-end on social

Both models were trained on the same MI300X droplet, showing that a single AMD GPU can handle both a 304M-param classifier and an 11B-param vision-language model in the same workflow.
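
A rough back-of-the-envelope check of that claim, counting weights only. The 16-bit (2 bytes per parameter) precision is an assumption, since the training precision is not stated here, and the estimate ignores activations, gradients, and optimizer state, which add to the real footprint.

```python
HBM_GB = 192  # MI300X HBM3 capacity, as listed in the table above


def weight_gb(num_params, bytes_per_param=2):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption, not a
    documented setting of the original training runs).
    """
    return num_params * bytes_per_param / 1e9


classifier_gb = weight_gb(304e6)  # DINOv2-L
vlm_gb = weight_gb(11e9)          # Llama 3.2 11B Vision
total_gb = classifier_gb + vlm_gb

print(f"classifier: {classifier_gb:.3f} GB")
print(f"VLM:        {vlm_gb:.1f} GB")
print(f"share of HBM: {total_gb / HBM_GB:.1%}")
```

Under these assumptions the weights of both models together occupy a small fraction of the card's memory, which leaves headroom for activations and batch size during fine-tuning.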

Limitations

  • Trained on a single dataset (CCMT), so performance is likely to degrade on field photos with very different lighting or angles, and on unseen crops.
  • The treatment guidance is informational only and not a substitute for advice from a qualified agronomist or extension officer.
  • CPU inference is slow (about 5 to 10 s per image); this is a deliberate trade-off for accessibility. The original GPU pipeline runs in milliseconds.

License

Apache 2.0. Model weights and code are open. CCMT dataset licensing applies to the training data only.

Acknowledgements

  • AMD for the Developer Cloud credits and MI300X access
  • Meta for DINOv2
  • The CCMT dataset authors
  • lablab.ai for organizing the hackathon