mlx-community/rfdetr-seg-small-fp32

This model was converted to MLX format from RF-DETR (ICLR 2026) using mlx-vlm version 0.4.3.

RF-DETR Seg-Small performs both object detection and instance segmentation across the 80 COCO classes.

Use with mlx

pip install -U mlx-vlm

from pathlib import Path
from PIL import Image
from mlx_vlm.utils import load_model
from mlx_vlm.models.rfdetr.processing_rfdetr import RFDETRProcessor
from mlx_vlm.models.rfdetr.generate import RFDETRPredictor

model = load_model(Path("mlx-community/rfdetr-seg-small-fp32"))
processor = RFDETRProcessor.from_pretrained("mlx-community/rfdetr-seg-small-fp32")
predictor = RFDETRPredictor(model, processor, score_threshold=0.3, nms_threshold=0.5)

result = predictor.predict(Image.open("image.jpg"))
# result.boxes   - (N, 4) xyxy pixel coordinates
# result.scores  - (N,) confidence scores
# result.masks   - (N, H, W) binary instance masks
# result.class_names - list of class names
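The fields listed above can be post-processed with ordinary array code. The following is a minimal sketch, not part of mlx-vlm: `summarize_detections` and `min_score` are hypothetical names, and it assumes the four fields have the shapes documented above and can be converted with `np.asarray`. Synthetic arrays stand in for a real prediction so the snippet runs without the model.

```python
import numpy as np


def summarize_detections(boxes, scores, masks, class_names, min_score=0.5):
    """Return one dict per detection whose score is at least min_score.

    Assumes boxes is (N, 4) in xyxy pixels, scores is (N,),
    masks is (N, H, W) and class_names has N entries.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    masks = np.asarray(masks).astype(bool)

    summary = []
    for i in np.flatnonzero(scores >= min_score):
        x1, y1, x2, y2 = boxes[i]
        summary.append(
            {
                "class_name": class_names[i],
                "score": float(scores[i]),
                "box_area": float((x2 - x1) * (y2 - y1)),  # pixels^2
                "mask_area": int(masks[i].sum()),  # number of mask pixels
            }
        )
    return summary


# Synthetic stand-in for a prediction: two detections on a 4x4 image.
boxes = np.array([[0, 0, 2, 2], [1, 1, 4, 4]])
scores = np.array([0.9, 0.4])
masks = np.zeros((2, 4, 4), dtype=bool)
masks[0, :2, :2] = True
masks[1, 1:, 1:] = True
class_names = ["cat", "dog"]

for row in summarize_detections(boxes, scores, masks, class_names):
    print(row)
```

With a real prediction, the same call would presumably be `summarize_detections(result.boxes, result.scores, result.masks, result.class_names)`.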

CLI

# Image segmentation
python -m mlx_vlm.models.rfdetr.generate --image photo.jpg --model mlx-community/rfdetr-seg-small-fp32

# Video segmentation
python -m mlx_vlm.models.rfdetr.generate --video input.mp4 --model mlx-community/rfdetr-seg-small-fp32

# Realtime camera
python -m mlx_vlm.models.rfdetr.generate --task realtime --model mlx-community/rfdetr-seg-small-fp32

# Blur background (focus on detections)
python -m mlx_vlm.models.rfdetr.generate --image photo.jpg --model mlx-community/rfdetr-seg-small-fp32 --annotator blur+bg

# Pixelate detections
python -m mlx_vlm.models.rfdetr.generate --image photo.jpg --model mlx-community/rfdetr-seg-small-fp32 --annotator pixelate
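To run the image command over many files, the invocation can be assembled from Python. This is a sketch under stated assumptions: it uses only the `--image`, `--model` and `--annotator` flags shown in this card, `build_command` and `run_on_folder` are hypothetical helpers, and the `photos` folder name is a placeholder.

```python
import subprocess
import sys
from pathlib import Path

MODEL = "mlx-community/rfdetr-seg-small-fp32"

# Annotator preset names as documented in this model card.
PRESETS = {"mask+box", "blur", "blur+bg", "pixelate", "pixelate+bg", "halo+box", "box"}


def build_command(image_path, annotator=None, model=MODEL):
    """Build the argv list for one image, using only flags shown in this card."""
    if annotator is not None and annotator not in PRESETS:
        raise ValueError(f"unknown annotator preset: {annotator!r}")
    cmd = [
        sys.executable, "-m", "mlx_vlm.models.rfdetr.generate",
        "--image", str(image_path),
        "--model", model,
    ]
    if annotator is not None:
        cmd += ["--annotator", annotator]
    return cmd


def run_on_folder(folder, annotator=None):
    """Run the CLI once per .jpg file in folder (hypothetical batch helper)."""
    for path in sorted(Path(folder).glob("*.jpg")):
        subprocess.run(build_command(path, annotator), check=True)


# Show the command that would run for a single image.
print(" ".join(build_command("photo.jpg", "blur+bg")))
```

Calling `run_on_folder("photos", "pixelate")` would then launch one process per image; each process reloads the model, so the Python API is likely the better fit for large batches.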

Annotator presets

Preset	Effect
`mask+box`	Mask overlay + boxes + labels (default)
`blur`	Blur detections
`blur+bg`	Blur background
`pixelate`	Pixelate detections
`pixelate+bg`	Pixelate background
`halo+box`	Halo effect + boxes
`box`	Boxes + labels only
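To illustrate what a mask-driven preset does, the sketch below pixelates either the detected regions or the background using the union of the instance masks. It is not the mlx-vlm annotator code: `pixelate_masked`, `block` and `background` are hypothetical names, and the behaviour is inferred only from the preset descriptions in the table.

```python
import numpy as np


def pixelate_masked(image, masks, block=8, background=False):
    """Pixelate pixels covered by any mask (or everything else if background=True).

    image: (H, W, C) array; masks: (N, H, W) array of binary instance masks.
    """
    image = np.asarray(image)
    h, w = image.shape[:2]

    # Union of all instance masks, shape (H, W).
    region = np.asarray(masks).astype(bool).any(axis=0)
    if background:
        region = ~region

    # Coarse copy: every block x block tile takes the value of its top-left pixel.
    ys = (np.arange(h) // block) * block
    xs = (np.arange(w) // block) * block
    coarse = image[ys[:, None], xs[None, :]]

    out = image.copy()
    out[region] = coarse[region]
    return out


# Synthetic 4x4 RGB image with one mask covering the top-left 2x2 corner.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
masks = np.zeros((1, 4, 4), dtype=bool)
masks[0, :2, :2] = True

detections_pixelated = pixelate_masked(image, masks, block=2)
background_pixelated = pixelate_masked(image, masks, block=2, background=True)
```

Taking the top-left pixel of each tile keeps the sketch short; averaging each tile would give a smoother result at the cost of a little more code.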

Model Details


Architecture	DINOv2-small backbone + C2f projector + Deformable DETR decoder + Segmentation head
Task	Object detection + instance segmentation (COCO 80 classes)
Parameters	~34M
Input resolution	384x384
Dtype	float32
Inference (M4 Max)	~26 ms per frame (~38 FPS)
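The latency and FPS figures are related by FPS = 1000 / latency in milliseconds. The sketch below shows that arithmetic and one way to measure per-call latency on other hardware. `measure_latency` and `fps_from_latency` are hypothetical helpers, not part of mlx-vlm, and the timed callable here is a stand-in rather than the model.

```python
import time


def fps_from_latency(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def measure_latency(fn, warmup=3, iters=20):
    """Return (mean latency in ms, FPS) for calling fn() repeatedly."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = 1000.0 * elapsed / iters
    return latency_ms, fps_from_latency(latency_ms)


# The figure quoted in the table: about 26 ms per frame.
print(round(fps_from_latency(26.0), 1))  # 38.5

# Stand-in workload; replace with a call into the predictor to time the model.
latency_ms, fps = measure_latency(lambda: time.sleep(0.001), warmup=1, iters=5)
print(f"{latency_ms:.2f} ms per call, {fps:.1f} calls/s")
```

Because MLX evaluates arrays lazily, a callable that wraps the model should force its outputs (for example by converting them to NumPy) before returning; otherwise the measurement may not include the actual compute.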

Reference

RF-DETR: Real-Time Detection Transformer (ICLR 2026)
Roboflow RF-DETR

Paper: RF-DETR: Neural Architecture Search for Real-Time Detection Transformers (arXiv:2511.09554, published Nov 12, 2025)