# Agri-llama: Official Vision-Language Model by Aqib Mehedi
Agri-llama is a powerful multimodal model designed for advanced agricultural analysis, combining visual perception with sophisticated language reasoning. It is optimized for both precision vision tasks and interactive agricultural consultation.
## Key Features
- Vision-Language Integration: Analyze agricultural imagery (crops, pests, soil) alongside textual queries.
- High Performance: Optimized for efficiency and accuracy in specialized domains.
- Multimodal Chat: Interactive dialogue support for complex agricultural problem-solving.
- Flexible Deployment: Available in both Hugging Face Safetensors (FP16) and GGUF (Quantized) formats.
## System Configuration (Development Environment)
This model was developed and verified on the following configuration:
- OS: Microsoft Windows 11 Pro (Build 26200)
- GPU: NVIDIA GeForce RTX 3060 (12GB VRAM)
- CUDA: 13.0
- Driver: 581.29
- Python: 3.11.9
### Recommended Hardware for Inference
- GGUF (Quantized): 8GB+ VRAM (Full GPU offloading) or 16GB+ System RAM (CPU only).
- Safetensors (FP16): 12GB+ VRAM (NVIDIA RTX 3060 or better recommended).
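The VRAM figures above follow from simple arithmetic on the 4B parameter count. A quick sketch (the bits-per-weight values are approximate, and real usage adds activations, KV cache, and runtime overhead on top of the weights):

```python
# Rough weight-memory estimate for a 4B-parameter model.
# Bytes-per-weight figures are approximate; activations, KV cache,
# and CUDA context overhead are not included.
params = 4e9

fp16_gb = params * 2 / 2**30          # FP16: 2 bytes per weight
q4_km_gb = params * 4.5 / 8 / 2**30   # Q4_K_M: ~4.5 bits per weight

print(f"FP16 weights:   ~{fp16_gb:.1f} GiB")   # fits a 12 GB card with some headroom
print(f"Q4_K_M weights: ~{q4_km_gb:.1f} GiB")  # comfortably within 8 GB VRAM
```

This is why the quantized GGUF build is the recommended path on smaller GPUs or CPU-only machines.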
## Beginner's Quick Start Guide
Follow these steps to get Agri-llama running on your local machine.
### 1. Prerequisites
Ensure you have Python 3.10+ and Git installed. If you have an NVIDIA GPU, install the CUDA Toolkit.
### 2. Setup Environment
Open your terminal (PowerShell or Command Prompt) and run:
```bash
# Clone the repository
git clone https://huggingface.co/aqibcareer007/agri-llama
cd agri-llama

# Create a virtual environment (recommended)
python -m venv venv
venv\Scripts\activate

# Install requirements
pip install -r requirements.txt
```
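Before choosing between the GGUF and FP16 paths, it helps to check whether an NVIDIA driver is visible at all. This stdlib-only sketch shells out to `nvidia-smi` when it is on PATH (the query flags shown are standard `nvidia-smi` options):

```python
import shutil
import subprocess

# If the NVIDIA driver is installed, nvidia-smi is on PATH
smi = shutil.which("nvidia-smi")
if smi:
    # Prints GPU name and total memory, e.g. "NVIDIA GeForce RTX 3060, 12288 MiB"
    subprocess.run([smi, "--query-gpu=name,memory.total", "--format=csv,noheader"])
else:
    print("No NVIDIA driver detected; use the GGUF model on CPU.")
```

If no GPU is found, Option A below (the quantized GGUF chat) is the sensible default.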
### 3. Running the Model

#### Option A: Interactive Chat (Easiest for Beginners)

This uses the quantized GGUF model, which is fast and memory-efficient.

```bash
python scripts/run_chat.py
```

Wait for the "Model Loaded!" message, then start typing your agricultural questions.
#### Option B: Vision Analysis (Python API)
Use this for programmatically analyzing images.
```python
import torch
from PIL import Image
from transformers import AutoProcessor, AgriLlamaForConditionalGeneration

model_id = "aqibcareer007/agri-llama"

# Load the model in FP16; device_map="auto" places it on the best available device
model = AgriLlamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Example: analyze a crop image
image = Image.open("path_to_your_crop_image.jpg")
prompt = "<bos><start_of_turn>user\n<image>\nWhat is wrong with this leaf?<end_of_turn>\n<start_of_turn>model\n"

# Move inputs to wherever the model was placed (GPU if available, else CPU)
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(output[0], skip_special_tokens=True))
```
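The prompt string above follows a Gemma-style turn template. If you query the model repeatedly, a small helper keeps the template in one place (the `build_prompt` function is ours for illustration, not part of the repository):

```python
def build_prompt(question: str) -> str:
    """Wrap a user question in the chat turn template used above,
    with an <image> placeholder for the attached picture."""
    return (
        "<bos><start_of_turn>user\n<image>\n"
        f"{question}<end_of_turn>\n<start_of_turn>model\n"
    )

prompt = build_prompt("What is wrong with this leaf?")
```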
## Technical Specifications
- Model Type: Multimodal (Vision + Text)
- Model Size: 4B parameters
- Context Window: 131,072 tokens
- Quantization: GGUF Q4_K_M (Included for efficiency)
## License
[Insert License Here - e.g., Apache 2.0]
## Acknowledgments
Developed by Aqib Mehedi, Senior AI Engineer at Kamal-Paterson Ltd. For support or inquiries, please visit the official Hugging Face repository.