mlx-community/rfdetr-base-fp32

This model was converted to MLX format from RF-DETR (ICLR 2026) using mlx-vlm version 0.4.3.

RF-DETR is a real-time detection transformer achieving state-of-the-art performance on COCO.

Use with mlx

pip install -U mlx-vlm

from pathlib import Path
from PIL import Image
from mlx_vlm.utils import load_model
from mlx_vlm.models.rfdetr.processing_rfdetr import RFDETRProcessor
from mlx_vlm.models.rfdetr.generate import RFDETRPredictor

# Load the converted weights and the matching image processor.
model = load_model(Path("mlx-community/rfdetr-base-fp32"))
processor = RFDETRProcessor.from_pretrained("mlx-community/rfdetr-base-fp32")
# Build the predictor with a score threshold and an NMS threshold.
predictor = RFDETRPredictor(model, processor, score_threshold=0.3, nms_threshold=0.5)

# Run detection on one image, then print each class name, score, and box.
result = predictor.predict(Image.open("image.jpg"))
for name, score, box in zip(result.class_names, result.scores, result.boxes):
    print(f"{name}: {score:.2f} [{box[0]:.0f}, {box[1]:.0f}, {box[2]:.0f}, {box[3]:.0f}]")
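
The snippet above passes score_threshold and nms_threshold without saying what they do. The sketch below shows what these two parameters conventionally mean in detection post-processing: drop low-scoring boxes, then apply greedy non-maximum suppression by IoU. It is an illustration under assumptions, not the mlx-vlm implementation. The helper names iou and filter_detections are hypothetical, boxes are assumed to be [x1, y1, x2, y2] in pixels, and suppression here is class-agnostic.

```python
def iou(a, b):
    """Intersection over union of two boxes, each assumed [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_threshold=0.3, nms_threshold=0.5):
    """Return indices kept after score filtering and greedy NMS (hypothetical helper)."""
    # Keep only candidates at or above the score threshold, best score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # A box survives only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30], [0, 0, 5, 5]]
scores = [0.9, 0.8, 0.7, 0.1]
print(filter_detections(boxes, scores))  # [0, 2]
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the fourth falls below the score threshold. Raising nms_threshold keeps more overlapping boxes; raising score_threshold keeps fewer detections overall.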

CLI

# Image
python -m mlx_vlm.models.rfdetr.generate --image photo.jpg --model mlx-community/rfdetr-base-fp32

# Video
python -m mlx_vlm.models.rfdetr.generate --video input.mp4 --model mlx-community/rfdetr-base-fp32

# Realtime camera
python -m mlx_vlm.models.rfdetr.generate --task realtime --model mlx-community/rfdetr-base-fp32
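
To run the image command over a whole folder, one option is a small wrapper that builds one invocation per file. The sketch below uses only the --image and --model flags shown above. The function names build_commands and run_all, and the set of file suffixes, are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path

MODEL_ID = "mlx-community/rfdetr-base-fp32"
# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_commands(image_dir, model=MODEL_ID):
    """Build one CLI invocation per image file in image_dir (hypothetical helper)."""
    commands = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            commands.append([
                sys.executable, "-m", "mlx_vlm.models.rfdetr.generate",
                "--image", str(path),
                "--model", model,
            ])
    return commands


def run_all(image_dir):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in build_commands(image_dir):
        subprocess.run(cmd, check=True)
```

Calling run_all("photos") would process every matching file in a photos directory. Each invocation starts a new process and loads the model again, so for large batches the Python API with a single predictor is likely the better fit.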

Model Details

Architecture: DINOv2-small backbone + C2f projector + Deformable DETR decoder
Task: Object detection (COCO, 80 classes)
Parameters: ~32M
Input resolution: 560x560
Dtype: float32
Inference (M4 Max): 32 ms per image (~31 FPS)
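
The frames-per-second figure follows directly from the per-image latency. The short sketch below shows that arithmetic and a simple check against a target frame rate. The function names are hypothetical, and the check assumes inference is the only per-frame cost, which ignores decoding and drawing.

```python
def fps_from_latency_ms(latency_ms):
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def fits_frame_budget(latency_ms, target_fps):
    """True if one inference fits inside the time budget of a single frame."""
    return latency_ms <= 1000.0 / target_fps


print(fps_from_latency_ms(32.0))    # 31.25
print(fits_frame_budget(32.0, 30))  # True: the 30 FPS budget is about 33.3 ms
print(fits_frame_budget(32.0, 60))  # False: the 60 FPS budget is about 16.7 ms
```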

Reference
