Urban Expansion Detector — Qwen2.5-VL-72B
Fine-tuned Qwen2.5-VL-72B-Instruct on AMD MI300X for satellite imagery urban expansion detection. Analyzes Sentinel-2 overhead tiles, identifies built-up areas with normalized bounding boxes, estimates urban coverage fraction, and generates plain-language corridor reports for transit infrastructure monitoring.
Demonstrated on the Delhi-Meerut RRTS corridor — India's first operational regional rapid transit system.
Model Description
- Developed by: MohitML10
- Model type: Vision-Language Model (LoRA fine-tune)
- Language: English
- License: Apache 2.0
- Finetuned from: Qwen/Qwen2.5-VL-72B-Instruct
Model Sources
- Demo: HuggingFace Space
- Dataset: NuTonic/sat-bbox-metadata-sft-v1
Direct Use
Upload any overhead satellite image (Sentinel-2 or equivalent). The model returns:
- Built area fraction as a decimal (0.0 to 1.0)
- Bounding box coordinates for detected urban clusters in [x1, y1, x2, y2] normalized format
- Plain-language spatial analysis describing urban development patterns
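Since the model returns these values inside free-form text, downstream code has to extract them. A minimal, illustrative parser is sketched below; the exact phrasing of responses is not guaranteed, so `parse_response` and its regexes are assumptions, not part of the released pipeline.

```python
import re

def parse_response(text: str):
    """Pull normalized bounding boxes and a built-area fraction out of the
    model's free-text reply. Illustrative only: the exact response phrasing
    varies, so treat this as a starting point rather than a robust parser."""
    # Boxes like [0.10, 0.20, 0.45, 0.60]
    boxes = [
        [float(v) for v in m]
        for m in re.findall(
            r"\[\s*(0?\.\d+|[01])\s*,\s*(0?\.\d+|[01])\s*,"
            r"\s*(0?\.\d+|[01])\s*,\s*(0?\.\d+|[01])\s*\]",
            text,
        )
    ]
    # First decimal following the words "built area fraction"
    frac_match = re.search(
        r"built\s+area\s+fraction[^0-9]*([01]?\.\d+)", text, re.IGNORECASE
    )
    fraction = float(frac_match.group(1)) if frac_match else None
    return boxes, fraction

sample = (
    "Built area fraction: 0.34. Urban clusters: "
    "[0.10, 0.20, 0.45, 0.60] and [0.55, 0.15, 0.90, 0.50]."
)
boxes, frac = parse_response(sample)
```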
Downstream Use
Multi-tile corridor analysis — feed sequential tiles along a transit route and the model synthesizes a corridor-level urban development summary with PDF export. Intended for urban planners, policy researchers, and transit infrastructure teams.
Out-of-Scope Use
- High-resolution aerial imagery (trained on Sentinel-2 resolution)
- Non-overhead ground-level photography
- Precise cadastral or property-level boundary detection
How to Get Started with the Model
```python
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch
from PIL import Image

# Load the base model in bfloat16 and attach the LoRA adapter
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-72B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "MohitML10/urban-expansion-detector-72b-v3")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-72B-Instruct")

image = Image.open("your_satellite_tile.png").convert("RGB")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this satellite image for urban expansion. "
                                 "Provide bounding boxes [x1,y1,x2,y2] normalized 0-1 "
                                 "for each urban cluster and estimate the built area fraction."},
    ],
}]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
Training Data
8,000 curated examples from NuTonic/sat-bbox-metadata-sft-v1 by Joseph Pollack, filtered to high built-fraction tiles (built_fraction >= 0.15). Dataset contains Sentinel-2 satellite imagery paired with TiM-style land cover analytics JSON and expert geospatial analysis text.
Includes 437 India-specific tiles covering urbanizing regions across the subcontinent.
Training Hyperparameters
- Training regime: bfloat16
- LoRA rank: 16
- LoRA alpha: 32
- Target modules: q_proj, v_proj
- LoRA dropout: 0.05
- Trainable parameters: 32,768,000 (0.0446% of total)
- Total parameters: 73,443,545,344
- Learning rate: 2e-4
- Optimizer: AdamW
- Steps: 3,000
- Gradient clipping: 1.0
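The reported trainable-parameter count can be sanity-checked from the LoRA configuration. The backbone dimensions below (80 decoder layers, hidden size 8192, a 1024-dim `v_proj` output from grouped-query attention) are assumptions based on the Qwen2.5-72B language-model architecture, not stated on this card:

```python
# Back-of-envelope check of the reported LoRA trainable-parameter count.
# Assumed backbone dimensions (Qwen2.5-72B language model): 80 decoder
# layers, hidden size 8192, grouped-query attention with a 1024-dim v_proj.
layers = 80
hidden = 8192
kv_dim = 1024  # 8 KV heads x 128 head dim
rank = 16

# A LoRA adapter on a (d_in -> d_out) linear layer adds r * (d_in + d_out) weights.
q_proj = rank * (hidden + hidden)   # 8192 -> 8192
v_proj = rank * (hidden + kv_dim)   # 8192 -> 1024
trainable = layers * (q_proj + v_proj)

total = 73_443_545_344
pct = round(trainable / total * 100, 4)
print(trainable, pct)  # 32768000 0.0446
```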
Speeds, Sizes, Times
- Hardware: AMD MI300X (192GB HBM3) via AMD Developer Cloud
- Framework: ROCm 6.2, PyTorch 2.5.1+rocm6.2
- Inference time: ~30 seconds per tile on MI300X
- Adapter size: 131MB
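The 131 MB adapter size is consistent with the 32,768,000 trainable parameters being stored in float32 (4 bytes each), a common default when saving LoRA weights:

```python
# Adapter size check: trainable LoRA parameters saved as float32.
params = 32_768_000
size_mb = params * 4 / 1e6  # 4 bytes per float32 weight
print(size_mb)  # 131.072
```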
Results
Qualitative evaluation on Delhi-Meerut RRTS corridor tiles:
| Station | Built Fraction | Clusters Detected |
|---|---|---|
| Meerut South | 0.34 | 2 |
| Muradnagar | 0.36 | 2 |
| Sarai Kale Khan | 0.36 | 1 |
Model correctly identifies urban cluster concentration patterns and produces coherent corridor-level synthesis describing transit-induced urbanization gradients.
Environmental Impact
- Hardware Type: AMD MI300X
- Cloud Provider: AMD Developer Cloud (DigitalOcean)
- Compute Region: US East
- Hours used: ~29 hours total (training + inference testing)
Full Pipeline Capability
- Input: one or more Sentinel-2 satellite tiles
- Output: annotated images with bounding boxes drawn over detected urban clusters, built area fraction per tile, plain-language spatial analysis, and a multi-page PDF corridor report synthesizing findings across all tiles.
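The corridor-level aggregation step can be sketched as below. `TileResult` and `corridor_summary` are illustrative names, not the shipped pipeline; the sample values come from the Results table on this card:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class TileResult:
    station: str
    built_fraction: float
    clusters: int

def corridor_summary(tiles: list[TileResult]) -> dict:
    """Aggregate per-tile detections into corridor-level statistics."""
    return {
        "tiles": len(tiles),
        "mean_built_fraction": round(mean(t.built_fraction for t in tiles), 4),
        "total_clusters": sum(t.clusters for t in tiles),
        "densest": max(tiles, key=lambda t: t.built_fraction).station,
    }

# Per-tile figures from the Results table on this card
tiles = [
    TileResult("Meerut South", 0.34, 2),
    TileResult("Muradnagar", 0.36, 2),
    TileResult("Sarai Kale Khan", 0.36, 1),
]
summary = corridor_summary(tiles)
```

A real pipeline would feed each tile through the model and the response parser before aggregation; this sketch only covers the synthesis step.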
Model Architecture
LoRA adapters applied to q_proj and v_proj layers of Qwen2.5-VL-72B-Instruct. Base model handles multimodal vision-language understanding; adapters steer output toward geospatial analytical format with normalized bounding box coordinates and built fraction estimates.
Compute Infrastructure
AMD MI300X — 192GB HBM3 unified memory. Required for loading 72B parameter model at bfloat16 (144GB) with headroom for LoRA adapter states and activations.
Citation
```bibtex
@misc{urban-expansion-detector-2026,
  author    = {MohitML10},
  title     = {Urban Expansion Detector: Fine-tuned Qwen2.5-VL-72B for Satellite Urban Expansion Detection},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/MohitML10/urban-expansion-detector-72b-v3}
}
```
Dataset citation:
```bibtex
@dataset{nutonic-sat-bbox-2024,
  author    = {Pollack, Joseph},
  title     = {sat-bbox-metadata-sft-v1},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/datasets/NuTonic/sat-bbox-metadata-sft-v1}
}
```