Falcon-OCR (MLX, bf16)

MLX-converted weights of tiiuae/Falcon-OCR for inference on Apple Silicon via mlx-vlm.

Source

  • Base model: tiiuae/Falcon-OCR — Falcon Perception Team, Technology Innovation Institute (TII).
  • License: Apache 2.0 (matches the upstream base model).
  • Architecture: early-fusion vision-language model, 300M parameters.

Conversion details

  • Tool: mlx_vlm.convert (mlx-vlm 0.4.4).
  • Dtype: bfloat16.
  • Source revision: 3a4d95a8b0008f7430df30a82cf35e6c3b6bcb66.
  • trust_remote_code=True — the repository ships a custom FalconOCRProcessor / FalconOCRForCausalLM that is loaded via dynamic module import.
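The conversion above can be reproduced with the mlx-vlm command-line converter. A sketch, assuming the flag names of mlx-vlm 0.4.x (the output path is arbitrary; check `python -m mlx_vlm.convert --help` for your installed version):

```shell
# Convert the upstream checkpoint to MLX bf16 weights.
# --hf-path: source repo on the Hugging Face Hub
# --mlx-path: local output directory for the converted artifacts
python -m mlx_vlm.convert \
    --hf-path tiiuae/Falcon-OCR \
    --mlx-path Falcon-OCR-bf16 \
    --dtype bfloat16
```

The command downloads the source checkpoint on first run, so it needs network access and disk space for both copies of the weights.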

Known caveat

mlx_vlm.convert raises AttributeError: 'FalconOCRProcessor' object has no attribute 'save_pretrained' at the very end of conversion. The error occurs after the weights and tokenizer have already been written, so the artifacts uploaded here are complete and can be consumed by mlx_vlm.load(...) or docling's MlxVlmEngine.

Tracked upstream: https://github.com/Blaizzy/mlx-vlm/issues.
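Because the error fires after the files are written, a converted directory can be sanity-checked directly. A minimal sketch; the helper name and the expected-file list are assumptions, not part of mlx-vlm:

```python
from pathlib import Path

# Quick structural check for a converted model directory. The file names
# below are the artifacts mlx_vlm.load typically expects; the exact set
# is an assumption and can vary by model.
EXPECTED = ["config.json", "tokenizer_config.json"]

def missing_files(model_dir: str) -> list[str]:
    """Return the expected artifacts that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED if not (root / name).exists()]
    # Weights may be a single file or sharded (model-*-of-*.safetensors).
    if not list(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing
```

If missing_files(...) returns an empty list, the output directory is structurally complete despite the save_pretrained error.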

Usage

from mlx_vlm import load, generate

# trust_remote_code=True is required: the repo ships a custom FalconOCRProcessor.
model, processor = load("mlx-community/Falcon-OCR-bf16", trust_remote_code=True)

# Empty prompt: the page image is the model's sole input.
output = generate(model, processor, prompt="", image=["path/to/page.png"])
print(output)
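mlx-vlm also ships a command-line entry point; a hedged equivalent of the Python snippet above, assuming the flag names of mlx-vlm 0.4.x:

```shell
# CLI equivalent of the Python example; fetches the model on first run.
python -m mlx_vlm.generate \
    --model mlx-community/Falcon-OCR-bf16 \
    --image path/to/page.png \
    --prompt ""
```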

Attribution

All credit for the underlying model goes to the Falcon Perception Team at TII. For academic references, cite the upstream model card.
