Instructions to use mlx-community/HiDream-O1-Image-Dev-mlx-bf16 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/HiDream-O1-Image-Dev-mlx-bf16 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir HiDream-O1-Image-Dev-mlx-bf16 mlx-community/HiDream-O1-Image-Dev-mlx-bf16
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
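The Hub download step listed under MLX can also be run from Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed. `snapshot_download` is the library's function for fetching a whole repository, while `REPO_ID`, `default_local_dir`, and `download_model` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# Repository name taken from the page above.
REPO_ID = "mlx-community/HiDream-O1-Image-Dev-mlx-bf16"


def default_local_dir(repo_id: str) -> Path:
    """Use the repo name without the organization prefix as the target folder,
    mirroring the --local-dir value in the CLI command."""
    return Path(repo_id.split("/")[-1])


def download_model(repo_id: str = REPO_ID, local_dir=None) -> Path:
    """Download every file in the repo and return the local folder path."""
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = Path(local_dir) if local_dir is not None else default_local_dir(repo_id)
    return Path(snapshot_download(repo_id=repo_id, local_dir=target))
```

Calling `download_model()` fetches the weights into `HiDream-O1-Image-Dev-mlx-bf16` in the current directory. Nothing is downloaded until the function is called.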
README: add Mrbizarro byline + authors YAML
README.md
CHANGED
@@ -13,10 +13,14 @@ language:
 pipeline_tag: text-to-image
 library_name: mlx
 inference: false
+authors:
+- Mrbizarro
 ---
 
 # HiDream-O1-Image-Dev — MLX port for Apple Silicon
 
+> Ported by **[Mrbizarro](https://huggingface.co/Mrbizarro)** · MIT licensed · published to mlx-community
+
 A native MLX port of [HiDream-ai/HiDream-O1-Image-Dev](https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev) for fast local image generation on Apple Silicon Macs. **No PyTorch, no CUDA, no flash-attn required at inference time.**
 
 HiDream-O1 is an 8B Qwen3-VL-based **unified pixel-patch transformer** — it predicts raw 32×32 RGB patches directly through the same backbone that handles text, with no separate VAE. The Dev variant is a 28-step distillation of the 50-step Full model, released under the MIT license.
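To make the README's pixel-patch description concrete, the sketch below counts how many 32×32 RGB patches tile an image and how many raw values each patch holds. The 1024×1024 resolution is an assumption chosen for illustration, not a documented output size of the model, and `patch_grid` is a hypothetical helper.

```python
PATCH = 32      # patch edge length in pixels, as stated in the README
CHANNELS = 3    # RGB


def patch_grid(width: int, height: int, patch: int = PATCH):
    """Return (rows, cols, total) for non-overlapping patches tiling the image."""
    if width % patch or height % patch:
        raise ValueError("image dimensions must be multiples of the patch size")
    cols, rows = width // patch, height // patch
    return rows, cols, rows * cols


# Raw values in one patch: 32 * 32 pixels, 3 channels each.
values_per_patch = PATCH * PATCH * CHANNELS

# Assumed example resolution (not taken from the README).
rows, cols, total = patch_grid(1024, 1024)
print(rows, cols, total, values_per_patch)
```

Under that assumed resolution the image splits into a 32 by 32 grid of patches, which is the sequence the backbone would have to predict without a VAE in between.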