mlx-community/rfdetr-seg-large-fp32

This model was converted to MLX format from RF-DETR (ICLR 2026) using mlx-vlm version 0.4.3.

RF-DETR Seg-Large produces higher-quality instance masks than Seg-Small, owing to a higher input resolution (504 px vs. 384 px), more decoder layers (5 vs. 4), and more segmentation blocks (5 vs. 4).

Use with mlx

# Install (shell)
pip install -U mlx-vlm

# Usage (Python)
from pathlib import Path
from PIL import Image
from mlx_vlm.utils import load_model
from mlx_vlm.models.rfdetr.processing_rfdetr import RFDETRProcessor
from mlx_vlm.models.rfdetr.generate import RFDETRPredictor

model = load_model(Path("mlx-community/rfdetr-seg-large-fp32"))
processor = RFDETRProcessor.from_pretrained("mlx-community/rfdetr-seg-large-fp32")
predictor = RFDETRPredictor(model, processor, score_threshold=0.3, nms_threshold=0.5)

result = predictor.predict(Image.open("image.jpg"))
# result.boxes   - (N, 4) xyxy pixel coordinates
# result.scores  - (N,) confidence scores
# result.masks   - (N, H, W) binary instance masks
# result.class_names - list of class names
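
The fields listed above can be post-processed with ordinary array code. The following is a minimal sketch, not part of mlx-vlm: `summarize_detections` and the synthetic `fake_result` are hypothetical names introduced here, and the sketch assumes the result fields can be converted with `np.asarray` and follow the shapes given in the comments above.

```python
from types import SimpleNamespace

import numpy as np


def summarize_detections(result, min_score=0.5):
    """Return one dict per detection whose score is at least min_score.

    Hypothetical helper: assumes result.boxes is (N, 4) xyxy,
    result.scores is (N,), result.masks is (N, H, W), and
    result.class_names is a list of N strings.
    """
    boxes = np.asarray(result.boxes, dtype=np.float32)
    scores = np.asarray(result.scores, dtype=np.float32)
    masks = np.asarray(result.masks).astype(bool)

    summaries = []
    for i in np.flatnonzero(scores >= min_score):
        x1, y1, x2, y2 = boxes[i]
        summaries.append(
            {
                "class_name": result.class_names[i],
                "score": float(scores[i]),
                "box_area": float((x2 - x1) * (y2 - y1)),
                "mask_area": int(masks[i].sum()),  # number of foreground pixels
            }
        )
    return summaries


# Synthetic stand-in for a predictor result, so the sketch runs without the model.
masks = np.zeros((2, 8, 8), dtype=np.uint8)
masks[0, 2:6, 2:6] = 1  # 4x4 square
masks[1, 0:2, 0:3] = 1  # 2x3 rectangle
fake_result = SimpleNamespace(
    boxes=np.array([[2, 2, 6, 6], [0, 0, 3, 2]], dtype=np.float32),
    scores=np.array([0.9, 0.35], dtype=np.float32),
    masks=masks,
    class_names=["cat", "dog"],
)

summary = summarize_detections(fake_result, min_score=0.5)
print(summary)
```

With a real prediction, `fake_result` would be replaced by the object returned from `predictor.predict(...)`.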

CLI

# Image segmentation
python -m mlx_vlm.models.rfdetr.generate --task segment --image photo.jpg --model mlx-community/rfdetr-seg-large-fp32

# Video processing
python -m mlx_vlm.models.rfdetr.generate --task track --video input.mp4 --model mlx-community/rfdetr-seg-large-fp32

# Realtime camera
python -m mlx_vlm.models.rfdetr.generate --task realtime --model mlx-community/rfdetr-seg-large-fp32

# Blur background
python -m mlx_vlm.models.rfdetr.generate --task segment --image photo.jpg --model mlx-community/rfdetr-seg-large-fp32 --annotator blur+bg
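
How the `blur+bg` annotator is implemented is not documented here. As a rough illustration of the idea, the sketch below composites the sharp foreground (the union of all instance masks) over a blurred copy of the image using Pillow. `blur_background` and its `radius` parameter are hypothetical and are not the mlx-vlm implementation.

```python
import numpy as np
from PIL import Image, ImageFilter


def blur_background(image, masks, radius=8):
    """Keep pixels covered by any instance mask sharp and blur the rest.

    Hypothetical helper: `image` is a PIL image and `masks` is an
    (N, H, W) array of binary masks at the image's resolution.
    """
    masks = np.asarray(masks).astype(bool)
    width, height = image.size
    assert masks.shape[1:] == (height, width), "masks must match the image size"

    # Union of all instances: True wherever any object was segmented.
    foreground = np.any(masks, axis=0)
    mask_img = Image.fromarray(foreground.astype(np.uint8) * 255)  # mode "L"

    blurred = image.filter(ImageFilter.GaussianBlur(radius))
    # Image.composite takes the first image where the mask is 255, the second where it is 0.
    return Image.composite(image, blurred, mask_img)


# Synthetic example: a checkerboard image with one square "instance" in the middle.
yy, xx = np.indices((32, 32))
checker = (((yy + xx) % 2) * 255).astype(np.uint8)
image = Image.fromarray(np.stack([checker] * 3, axis=-1))  # 32x32 RGB

masks = np.zeros((1, 32, 32), dtype=bool)
masks[0, 8:24, 8:24] = True

output = blur_background(image, masks, radius=4)
```

In practice `masks` would come from `result.masks` and `image` from the same file passed to the predictor.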

Model Details

| Property | Value |
| --- | --- |
| Architecture | DINOv2-small backbone + C2f projector + 5-layer Deformable DETR decoder + 5-block segmentation head |
| Task | Object detection + instance segmentation (COCO 80 classes) |
| Parameters | ~36M |
| Input resolution | 504x504 |
| Mask resolution | 126x126 |
| Dtype | float32 |
| Inference (M4 Max) | ~89 ms per image (11 FPS) |
| Peak memory | ~1.2 GB |

Comparison

| Model | Resolution | Mask quality | Inference | Memory |
| --- | --- | --- | --- | --- |
| rfdetr-seg-small-fp32 | 384px (96x96 masks) | Good | ~26 ms (38 FPS) | 683 MB |
| rfdetr-seg-large-fp32 | 504px (126x126 masks) | Better | ~89 ms (11 FPS) | 1.2 GB |
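
The trade-off in the comparison can be made concrete with a little arithmetic. The sketch below only re-derives numbers from the figures listed above. The `specs` dictionary is a hand-copied transcription, and 1.2 GB is assumed to mean 1200 MB.

```python
# Figures transcribed from the comparison above (memory for Seg-Large assumed to be 1200 MB).
specs = {
    "rfdetr-seg-small-fp32": {"input_px": 384, "mask_px": 96, "latency_ms": 26, "memory_mb": 683},
    "rfdetr-seg-large-fp32": {"input_px": 504, "mask_px": 126, "latency_ms": 89, "memory_mb": 1200},
}

derived = {}
for name, s in specs.items():
    derived[name] = {
        # Frames per second is the reciprocal of per-image latency.
        "fps": 1000 / s["latency_ms"],
        # Ratio of input resolution to mask resolution along one axis.
        "mask_stride": s["input_px"] / s["mask_px"],
    }

small = specs["rfdetr-seg-small-fp32"]
large = specs["rfdetr-seg-large-fp32"]

slowdown = large["latency_ms"] / small["latency_ms"]         # latency ratio, Large vs. Small
pixel_ratio = (large["input_px"] / small["input_px"]) ** 2   # input pixels, Large vs. Small
memory_ratio = large["memory_mb"] / small["memory_mb"]       # peak memory, Large vs. Small

for name, d in derived.items():
    print(f"{name}: {d['fps']:.1f} FPS, mask stride {d['mask_stride']:.0f}")
print(f"slowdown x{slowdown:.2f}, input pixels x{pixel_ratio:.2f}, memory x{memory_ratio:.2f}")
```

Both models produce masks at one quarter of the input resolution along each axis (384/96 = 504/126 = 4), so the larger masks of Seg-Large follow directly from its larger input.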
