	---
	license: mit
	base_model: HiDream-ai/HiDream-O1-Image-Dev
	tags:
	- mlx
	- mlx-vlm
	- hidream
	- text-to-image
	- apple-silicon
	- quantized
	- q6
	language:
	- en
	pipeline_tag: text-to-image
	library_name: mlx
	inference: false
	authors:
	- Mrbizarro
	---

	# HiDream-O1-Image-Dev — MLX Q6 (Apple Silicon)

	> Ported by [Mrbizarro](https://huggingface.co/Mrbizarro) · MIT licensed · published to mlx-community

	## 🎛️ Run it one-click in [Phosphene](https://github.com/mrbizarro/phosphene)

	Phosphene is a free local generative-video panel for Apple Silicon. HiDream is wired into its Image Studio. [Install Pinokio](https://pinokio.computer), then in Pinokio install [Phosphene](https://github.com/mrbizarro/phosphene). Note: Phosphene's HiDream integration uses BF16 by default because editing requires BF16. This Q6 repo is for text-to-image-only workflows on RAM-constrained machines.

	---

	A 6-bit quantized MLX port of [HiDream-ai/HiDream-O1-Image-Dev](https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev).

	⚠ Q6 does NOT support edit / multi-ref. Per-group dequantization noise compounds against reference-image features in attention and produces degenerate output. For edit / multi-reference workflows use the [BF16 sibling repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) instead.

	## Sibling repos

	- 🟢 [BF16 (full precision)](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) — 17.5 GB, ~16 GB RAM, clean across all dimensions. Use this when in doubt.
	- 🟡 Q6 (this repo) — 8 GB, ~8.5 GB RAM, fast. Best at square 2048×2048 or 1024×1024. Visible 32-pixel patch grid in flat regions at non-square dims.
	- 🟡 [Q8](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-q8) — 10 GB, ~11.5 GB RAM, same artifact behaviour as Q6 at non-square dims.

	## When to use Q6

	- ✅ Square 1024×1024 or 2048×2048 — clean output, half the time of BF16
	- ✅ RAM-constrained — fits 16 GB Macs alongside other apps
	- ❌ Non-square dims (1440×2560, 3104×1312, etc) — visible 32-pixel patch grid in skies, walls, water → use BF16
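
	The guidance above can be folded into a small pre-flight check. The following is a minimal sketch, not part of either repo: `recommend_variant` is a hypothetical helper name, and treating every size other than the two documented squares as a BF16 case is a conservative assumption, since other square sizes are not covered here.

	```bash
	# Hypothetical helper (not shipped in either repo): suggest a sibling repo
	# from the guidance in this README. Prints "q6" or "bf16".
	recommend_variant() {
		local width="$1" height="$2" needs_edit="${3:-no}"

		# Edit / multi-reference workflows are not supported on Q6.
		if [ "$needs_edit" = "yes" ]; then
			echo "bf16"
			return
		fi

		# Only the two square sizes documented as clean are treated as safe.
		# Assumption: anything else defers to BF16.
		if [ "$width" -eq "$height" ] && { [ "$width" -eq 1024 ] || [ "$width" -eq 2048 ]; }; then
			echo "q6"
		else
			echo "bf16"
		fi
	}

	recommend_variant 2048 2048      # prints: q6
	recommend_variant 1440 2560      # prints: bf16
	recommend_variant 1024 1024 yes  # prints: bf16
	```

	The third argument is optional and defaults to `no`, so text-to-image callers only need to pass the width and height.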

	## What's in this repo

	- `model.safetensors` — Q6 quantized backbone (8 GB)
	- `extras/custom_heads.safetensors` — diffusion-side heads (75 MB, BF16)
	- `config.json` (with `quantization: {bits: 6, group_size: 64}` so mlx-vlm wraps `Linear → QuantizedLinear` correctly)
	- Tokenizer + processor configs
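
	Because the loader relies on the `quantization` block in `config.json`, it can be worth confirming that a downloaded folder really is the Q6 variant before loading it. The sketch below is an assumption-laden illustration: `check_q6_config` is a hypothetical helper, and it assumes the `quantization` object is flat (no nested objects) and uses the `bits` and `group_size` keys listed above.

	```bash
	# Hypothetical helper (not shipped in the repo): succeeds only when the given
	# config.json declares quantization with bits = 6 and group_size = 64.
	check_q6_config() {
		local config="$1" flat q
		[ -f "$config" ] || return 2

		# Strip whitespace so the match does not depend on JSON formatting.
		flat=$(tr -d ' \t\r\n' < "$config")

		# Require a "quantization": { ... } object somewhere in the file.
		case "$flat" in
			*'"quantization":{'*) ;;
			*) return 1 ;;
		esac

		# Keep only the text between the opening brace and the first closing brace.
		q=${flat#*'"quantization":{'}
		q=${q%%\}*}

		# Match exact values, so 6 does not also match 64.
		case ",$q," in
			*',"bits":6,'*) ;;
			*) return 1 ;;
		esac
		case ",$q," in
			*',"group_size":64,'*) ;;
			*) return 1 ;;
		esac
	}

	# Path taken from the quick start; adjust if the weights live elsewhere.
	if check_q6_config mlx_models/hidream-o1-dev-q6/config.json; then
		echo "config.json declares Q6 (bits=6, group_size=64)"
	fi
	```

	The function returns 2 when the file is missing and 1 when the quantization settings differ, so a calling script can tell the two cases apart.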

	## Code

	The inference scripts are not in this repo. They live in the [BF16 sibling repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) under `scripts/hidream_o1/`. Download that repo for the code and this one for the weights only.

	## Quick start

	```bash
	# Get the code
	hf download mlx-community/HiDream-O1-Image-Dev-mlx-bf16 --local-dir hidream-o1-mlx \
	--include "scripts/*" --include "*.md" --include "*.txt" --include ".gitattributes"
	cd hidream-o1-mlx
	uv venv --python 3.11 && uv pip install -r requirements.txt

	# Get the Q6 weights
	hf download mlx-community/HiDream-O1-Image-Dev-mlx-q6 --local-dir mlx_models/hidream-o1-dev-q6

	# Run (square dims only for clean output)
	.venv/bin/python scripts/hidream_o1/generate_hidream_o1_mlx.py \
	--model-path mlx_models/hidream-o1-dev-q6 \
	--prompt "your prompt here" \
	--width 2048 --height 2048 \
	--output out.png
	```

	## Performance

	| Resolution | Per step | Total (28 steps) | Peak RAM | Quality |
	|---|---|---|---|---|
	| 1024×1024 | 1.30 s | 36 s | 8.5 GB | ✅ clean |
	| 2048×2048 | 5.51 s | 154 s | 9 GB | ✅ clean |
	| 1440×2560 (non-square) | 4.50 s | 127 s | 8.5 GB | ⚠ patch grid visible |
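
	The totals in the table are consistent, to within rounding, with the per-step time multiplied by the 28 sampling steps. As a rough sketch for other step counts, the hypothetical helper below does that multiplication. It assumes per-step time stays constant across steps and ignores model load time, so treat the result as approximate.

	```bash
	# Hypothetical helper: estimate sampling time in whole seconds from a
	# per-step latency and a step count (model load time is not included).
	estimate_total_seconds() {
		local per_step="$1" steps="${2:-28}"
		awk -v t="$per_step" -v n="$steps" 'BEGIN { printf "%.0f\n", t * n }'
	}

	estimate_total_seconds 1.30      # prints: 36
	estimate_total_seconds 5.51 28   # prints: 154
	```

	The step count defaults to 28 to match the table. Pass a second argument to estimate a different schedule.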

	## License

	MIT — see the [BF16 repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) for the full LICENSE file and acknowledgements.