	---
	license: mit
	base_model: HiDream-ai/HiDream-O1-Image-Dev
	tags:
	- mlx
	- mlx-vlm
	- hidream
	- text-to-image
	- apple-silicon
	- quantized
	- q8
	language:
	- en
	pipeline_tag: text-to-image
	library_name: mlx
	inference: false
	authors:
	- Mrbizarro
	---

	# HiDream-O1-Image-Dev — MLX Q8 (Apple Silicon)

	> Ported by [Mrbizarro](https://huggingface.co/Mrbizarro) · MIT licensed · published to mlx-community

	## 🎛️ Run it one-click in [Phosphene](https://github.com/mrbizarro/phosphene)

	Phosphene is a free, local generative-video panel for Apple Silicon, and HiDream is wired into its Image Studio. To set it up, [install Pinokio](https://pinokio.computer), then install [Phosphene](https://github.com/mrbizarro/phosphene) from within Pinokio.

	Note: Phosphene's HiDream integration defaults to BF16 because editing requires BF16. This Q8 repo is for text-to-image-only workflows that want a deterministic memory upper bound.

	---

	An 8-bit quantized MLX port of [HiDream-ai/HiDream-O1-Image-Dev](https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev).

	⚠ Q8 does NOT support edit / multi-ref. Per-group dequantization noise compounds against reference-image features in attention and produces degenerate output. For edit / multi-reference workflows use the [BF16 sibling repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) instead.
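
To make "per-group dequantization noise" concrete, here is a minimal pure-Python sketch of group-wise affine quantization: each group of 64 weights gets its own scale and bias, and every weight is rounded to one of 256 levels. This illustrates the general idea only. It is not MLX's kernel, and the function names are hypothetical.

```python
def quantize_group(values, bits=8):
    """Affine-quantize one group so that value ~= code * scale + bias."""
    levels = (1 << bits) - 1               # 255 steps for 8-bit codes
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float values."""
    return [c * scale + bias for c in codes]


def roundtrip(weights, group_size=64, bits=8):
    """Quantize and dequantize a flat weight list, one group at a time."""
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, bias = quantize_group(group, bits)
        out.extend(dequantize_group(codes, scale, bias))
    return out


if __name__ == "__main__":
    import random

    random.seed(0)
    w = [random.gauss(0.0, 0.02) for _ in range(256)]
    w_hat = roundtrip(w)
    worst = max(abs(a - b) for a, b in zip(w, w_hat))
    print(f"max abs round-trip error: {worst:.2e}")
```

The rounding error of each weight is at most half of its group's scale, so groups with a wide value range carry more noise. That per-weight error is small in isolation; the warning above concerns how it accumulates once reference-image features enter attention.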

	## Sibling repos

	- 🟢 [BF16 (full precision)](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) — 17.5 GB, ~16 GB RAM, clean across all dimensions. Use this when in doubt.
	- 🟡 [Q6](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-q6) — 8 GB, ~8.5 GB RAM, fastest. Same artifact behaviour as Q8.
	- 🟡 Q8 (this repo) — 10 GB, ~11.5 GB RAM, balanced. Best at square 2048×2048 or 1024×1024. Visible 32-pixel patch grid in flat regions at non-square dims.

	## When to use Q8

	- ✅ Square 1024×1024 or 2048×2048 — clean output, less RAM than BF16
	- ✅ When you want a deterministic memory upper bound (Q8 doesn't depend on activation distribution)
	- ❌ Non-square dims (1440×2560, 3104×1312, etc) — visible 32-pixel patch grid in skies, walls, water → use BF16
	- ⚠ Q8 is not faster than Q6 here — both are bandwidth-bound on this hardware. Pick Q8 only if you need its slightly tighter quality margin at square dims.
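
The rules above can be folded into a small selection helper. This is a sketch under stated assumptions: `pick_variant` is a hypothetical function, not part of the inference scripts, and it conservatively falls back to BF16 for any size this README does not explicitly report as clean.

```python
# Square side lengths this README reports as clean for Q8.
CLEAN_Q8_SIDES = {1024, 2048}


def pick_variant(width, height, needs_edit=False):
    """Return "q8" or "bf16" following the guidance in this README."""
    if needs_edit:
        # Edit and multi-reference workflows are not supported on Q8.
        return "bf16"
    if width == height and width in CLEAN_Q8_SIDES:
        return "q8"
    # Non-square (or untested) sizes may show the 32-pixel patch grid.
    return "bf16"


if __name__ == "__main__":
    for size in [(1024, 1024), (2048, 2048), (1440, 2560)]:
        print(size, "->", pick_variant(*size))
```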

	## What's in this repo

	- `model.safetensors` — Q8 quantized backbone (10 GB)
	- `extras/custom_heads.safetensors` — diffusion-side heads (75 MB, BF16)
	- `config.json` (with `quantization: {bits: 8, group_size: 64}` so mlx-vlm wraps `Linear → QuantizedLinear` correctly)
	- Tokenizer + processor configs
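
After downloading, you may want to confirm that `config.json` carries the quantization block. The sketch below reads it and estimates the effective storage cost per weight. Both helper names are hypothetical, and the 32 bits of per-group overhead assumes one 16-bit scale and one 16-bit bias per group, which this README does not state.

```python
import json
from pathlib import Path


def read_quantization(model_dir):
    """Return the `quantization` block from config.json, or None if absent."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return config.get("quantization")


def effective_bits_per_weight(bits, group_size, overhead_bits=32):
    """Code bits plus per-group overhead (assumed 16-bit scale + 16-bit bias)."""
    return bits + overhead_bits / group_size


if __name__ == "__main__":
    model_dir = Path("mlx_models/hidream-o1-dev-q8")
    if (model_dir / "config.json").exists():
        q = read_quantization(model_dir)
        print("quantization:", q)
        if q:
            print("bits/weight:", effective_bits_per_weight(q["bits"], q["group_size"]))
```

Under that assumption, `bits: 8` with `group_size: 64` costs 8.5 bits per quantized weight instead of 8.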

	## Code

	Inference scripts live in the [BF16 sibling repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) under `scripts/hidream_o1/`. Download that repo for the code; this repo contains weights only.

	## Quick start

	```bash
	# Get the code from the BF16 sibling
	hf download mlx-community/HiDream-O1-Image-Dev-mlx-bf16 --local-dir hidream-o1-mlx \
	  --include "scripts/*" --include "*.md" --include "*.txt" --include ".gitattributes"
	cd hidream-o1-mlx
	uv venv --python 3.11 && uv pip install -r requirements.txt

	# Get the Q8 weights
	hf download mlx-community/HiDream-O1-Image-Dev-mlx-q8 --local-dir mlx_models/hidream-o1-dev-q8

	# Run (square dims only for clean output)
	.venv/bin/python scripts/hidream_o1/generate_hidream_o1_mlx.py \
	--model-path mlx_models/hidream-o1-dev-q8 \
	--prompt "your prompt here" \
	--width 2048 --height 2048 \
	--output out.png
	```
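
To render several prompts in one go, you can wrap the command above in a small Python driver. This is a sketch: it uses only the flags shown in the quick start, and `build_command` and `run_batch` are hypothetical helpers, not part of the sibling repo's scripts.

```python
import subprocess

SCRIPT = "scripts/hidream_o1/generate_hidream_o1_mlx.py"


def build_command(prompt, output, side=2048,
                  model_path="mlx_models/hidream-o1-dev-q8",
                  python=".venv/bin/python"):
    """Build the argv list for one square text-to-image run."""
    return [
        python, SCRIPT,
        "--model-path", model_path,
        "--prompt", prompt,
        "--width", str(side), "--height", str(side),
        "--output", output,
    ]


def run_batch(prompts, side=2048):
    """Run the generator once per prompt, writing out_000.png, out_001.png, ..."""
    for i, prompt in enumerate(prompts):
        cmd = build_command(prompt, f"out_{i:03d}.png", side)
        subprocess.run(cmd, check=True)  # raises if the script exits non-zero


# Example (run from the hidream-o1-mlx directory):
# run_batch(["a lighthouse at dusk", "a bowl of ramen, studio lighting"], side=1024)
```

Each prompt starts a fresh process, so the weights are reloaded on every run. That keeps the sketch simple at the cost of some startup time per image.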

	## Performance

	| Resolution | Per step | Total (28 steps) | Peak RAM | Quality |
	|---|---|---|---|---|
	| 1024×1024 | 2.36 s | 67 s | 11.5 GB | ✅ clean |
	| 2048×2048 | 6.68 s | 187 s | 11.5 GB | ✅ clean |
	| 1440×2560 (non-square) | ~4.5 s | ~127 s | ~10 GB | ⚠ patch grid visible |
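
The totals in the table are close to the per-step time multiplied by 28 steps. The sketch below reproduces that arithmetic so you can estimate other step counts. The `overhead_s` parameter is an assumption for whatever fixed cost accounts for the small remainder; this README does not break it down.

```python
def estimate_total_seconds(per_step_s, steps=28, overhead_s=0.0):
    """Estimate wall-clock time as per-step time times steps, plus fixed overhead."""
    return per_step_s * steps + overhead_s


if __name__ == "__main__":
    for label, per_step in [("1024x1024", 2.36), ("2048x2048", 6.68)]:
        total = estimate_total_seconds(per_step)
        print(f"{label}: about {total:.1f} s for 28 steps")
```

For 1024×1024 this gives about 66.1 s against the measured 67 s, and for 2048×2048 about 187.0 s against the measured 187 s.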

	## License

	MIT — see the [BF16 repo](https://huggingface.co/mlx-community/HiDream-O1-Image-Dev-mlx-bf16) for the full LICENSE file and acknowledgements.