# HiDream-O1: Phosphene integration plan
**Status:** plan only. No edits to Phosphene yet. Show this to Salo for approval first.
## Where it slots in
Phosphene's `agent/image_engine.py` already abstracts image generation behind
`generate(prompt, n, output_dir, ..., config)` with a `kind` discriminator.
Three kinds exist today: `mock`, `mflux`, `bfl`. We add a fourth: `hidream`.
Pattern matches `mflux`: subprocess invocation of an external Python that owns
its own venv. Phosphene stays clean, dependencies stay isolated.
## Files touched (3)
### 1. `agent/image_engine.py`: add config fields, dispatch, generator
```python
# Inside ImageEngineConfig (after mflux_quantize):
hidream_python: str = "" # path to lab venv python; empty = autodetect
hidream_model_path: str = "" # path to converted MLX model dir; empty = autodetect
hidream_steps: int = 28
hidream_noise_scale: float = 7.5 # Dev recipe default; do not change
hidream_noise_clip_std: float = 2.5
```
```python
# Inside generate():
if config.kind == "hidream":
    return _generate_hidream(prompt, n, width, height, output_dir, base_seed, config, on_log=on_log)
```
```python
# Inside health_check():
if config.kind == "hidream":
    py = _resolve_hidream_python(config)
    model = _resolve_hidream_model(config)
    if not py:
        return False, "HiDream python not found. Install lab at /Users/salo/HIDREAM-O1-MLX-LAB-active/"
    if not model:
        return False, f"HiDream model dir not found at {config.hidream_model_path or 'autodetect'}"
    return True, f"HiDream ready: {py} + {model}"
```
```python
# New module-level imports (skip any already present), constants, and helpers:
import os
import random
import subprocess
import time
from pathlib import Path

HIDREAM_LAB_DIR = Path("/Users/salo/HIDREAM-O1-MLX-LAB-active")
HIDREAM_DEFAULT_PY = HIDREAM_LAB_DIR / ".venv" / "bin" / "python"
HIDREAM_DEFAULT_MODEL = HIDREAM_LAB_DIR / "mlx_models" / "hidream-o1-dev-q8"
HIDREAM_GENERATE_SCRIPT = HIDREAM_LAB_DIR / "scripts" / "hidream_o1" / "generate_hidream_o1_mlx.py"


def _resolve_hidream_python(config) -> str | None:
    """Return the lab interpreter path if it exists and is executable."""
    p = Path(config.hidream_python) if config.hidream_python else HIDREAM_DEFAULT_PY
    return str(p) if p.is_file() and os.access(p, os.X_OK) else None


def _resolve_hidream_model(config) -> str | None:
    """Return the model dir if it contains the converted weights file."""
    p = Path(config.hidream_model_path) if config.hidream_model_path else HIDREAM_DEFAULT_MODEL
    return str(p) if (p / "model.safetensors").exists() else None


def _generate_hidream(prompt, n, width, height, output_dir, base_seed, config, on_log=None):
    """Subprocess pattern matching _generate_mflux. One PNG per call to the
    generator script, n calls total. Each candidate uses base_seed+i."""
    # Raise instead of calling sys.exit(): this runs inside Phosphene, and
    # exiting here would take the host process down with it.
    py = _resolve_hidream_python(config)
    if not py:
        raise RuntimeError("HiDream python missing")
    model = _resolve_hidream_model(config)
    if not model:
        raise RuntimeError("HiDream model missing")
    script = str(HIDREAM_GENERATE_SCRIPT)
    output_dir = Path(output_dir)  # accept either str or Path
    out: list[dict] = []
    for i in range(n):
        seed = (base_seed + i) if base_seed is not None else random.randint(0, 2**31 - 1)
        png = output_dir / f"hidream_{int(time.time()*1000)}_{i:02d}.png"
        cmd = [
            py, script,
            "--model-path", model,
            "--prompt", prompt,
            "--width", str(width),
            "--height", str(height),
            "--output", str(png),
            "--seed", str(seed),
            "--num-inference-steps", str(config.hidream_steps),
            # Start and end are pinned to the same value (constant noise scale).
            "--noise-scale-start", str(config.hidream_noise_scale),
            "--noise-scale-end", str(config.hidream_noise_scale),
            "--noise-clip-std", str(config.hidream_noise_clip_std),
        ]
        # Merge stderr into stdout so on_log sees one ordered stream.
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
        for line in proc.stdout:
            if on_log:
                on_log(line.rstrip())
        rc = proc.wait()
        if rc != 0 or not png.exists():
            raise RuntimeError(f"hidream gen failed (rc={rc})")
        out.append({
            "png_path": str(png),
            "seed": seed,
            "engine": "hidream-o1-dev-q8",
            "width": width,
            "height": height,
        })
    return out
```
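The argument list is the part of `_generate_hidream` most likely to drift from the lab script's CLI. As a minimal sketch, assuming the flags shown above are the ones the lab script accepts, command construction could be factored into a pure helper so it can be checked without launching a subprocess. `build_hidream_cmd` and `candidate_seeds` are hypothetical names, not existing Phosphene functions.

```python
from pathlib import Path


def build_hidream_cmd(py, script, model, prompt, width, height, png, seed,
                      steps=28, noise_scale=7.5, noise_clip_std=2.5):
    """Return the argv list for one HiDream render (hypothetical helper).

    Flag names are copied from the plan above; defaults mirror the
    proposed ImageEngineConfig fields.
    """
    return [
        str(py), str(script),
        "--model-path", str(model),
        "--prompt", prompt,
        "--width", str(width),
        "--height", str(height),
        "--output", str(png),
        "--seed", str(seed),
        "--num-inference-steps", str(steps),
        "--noise-scale-start", str(noise_scale),
        "--noise-scale-end", str(noise_scale),
        "--noise-clip-std", str(noise_clip_std),
    ]


def candidate_seeds(base_seed, n):
    """Seeds for n candidates when a base seed is given: base_seed + i."""
    return [base_seed + i for i in range(n)]


cmd = build_hidream_cmd("/lab/.venv/bin/python", "generate.py", "/lab/model",
                        "a lighthouse at dusk", 1024, 1024,
                        Path("out") / "shot_00.png", seed=42)
```

Keeping this pure means a unit test can assert on the flags and seeds while the real generator stays untouched.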
### 2. `mlx_ltx_panel.py`: settings UI option (one dropdown entry)
`update_settings()` and `_load_agent_image_config()` already accept `kind`
strings. Just add `"hidream"` to whatever validation lists exist (likely a
single line). The panel already shows config.kind in the agent settings card.
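The panel's actual validation code is not shown in this plan, so the following is only a sketch of what the one-line change might look like. `VALID_IMAGE_KINDS` and `validate_kind` are hypothetical names; the real list may live under a different name or be inlined.

```python
# Hypothetical allow-list; the three existing kinds come from image_engine.py.
VALID_IMAGE_KINDS = ("mock", "mflux", "bfl", "hidream")  # "hidream" is the new entry


def validate_kind(kind: str) -> str:
    """Normalise a user-supplied kind and reject anything not in the allow-list."""
    normalized = kind.strip().lower()
    if normalized not in VALID_IMAGE_KINDS:
        allowed = ", ".join(VALID_IMAGE_KINDS)
        raise ValueError(f"unknown image engine kind {kind!r}; expected one of: {allowed}")
    return normalized
```

Rejecting unknown kinds at the settings boundary keeps a typo in the config from surfacing later as a silent fall-through in `generate()`.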
### 3. `docs/IMAGE_GEN_RESEARCH_2026-05.md`: note the new option
Add a row to the engine comparison table:
| Engine | Local | Speed (1024) | RAM | Quality | License |
|---|---|---|---|---|---|
| FLUX.2 klein 4B / mflux | yes | ~50 s | ~16 GB | great | Apache 2.0 |
| Z-Image-Turbo / mflux | yes | ~30 s | ~6 GB | good | Apache 2.0 |
| **HiDream-O1-Image-Dev / Q8** | **yes** | **~67 s** | **~11 GB** | **great** | **MIT** |
## What does NOT need to change
- `start.js` / `install.js` / `pinokio.js`: HiDream's lab is **outside**
  Pinokio; Phosphene just shells out to the lab's python. No new install step.
- `mlx_warm_helper.py`: that's LTX-only. HiDream is sub-minute, no warm
  helper needed for now (could add one later if we go to a long session of
  many shots).
- Phosphene's venv (`ltx-2-mlx/env`): untouched. mlx-vlm is in the lab's
  separate `.venv`.
## Risks & mitigations
| Risk | Mitigation |
|---|---|
| Lab path is hard-coded, so moving the lab breaks it | Configurable via `hidream_python` / `hidream_model_path`. Defaults are absolute; users can override in `state/agent_image_config.json`. |
| HiDream + LTX run at the same time (both want GPU) | Already a problem with mflux + LTX; Phosphene queue serialises shot generation. No new mitigation needed. |
| Lab dir gets nuked again | `README.md` marker is in place; user is aware. If the directory disappears, Phosphene's `health_check` returns a clear failure message and the panel surfaces it. |
| Quality-tier defaults: most users won't have a 64 GB Mac | Mark HiDream as **Comfortable+ (32 GB+)** tier in the docs. Don't make it the default: keep mflux Z-Image-Turbo as default for the compact tier and FLUX.2 klein as default for comfortable. |
## Cost / size
- Disk: ~10 GB additional in lab (already there)
- RAM at 1024×1024: ~11.5 GB (Q8). Same RAM tier as FLUX.2 klein.
- One-time setup: lab venv install (~1.5 GB, already done).
## Roll-out
1. Patch `image_engine.py` (above).
2. Add `"hidream"` to settings validation in `mlx_ltx_panel.py`.
3. Switch agent_image_config.json kind to `"hidream"` in a single test session.
4. Generate one shot through the agent UI; confirm PNG lands.
5. Compare to the same prompt through `mflux qwen-image-edit`.
6. If quality wins on at least 3 prompts, make it a real option in docs.
7. Don't switch the default until we have ≥5 prompts where HiDream is clearly better than mflux Z-Image-Turbo, AND the dark-aesthetic concern is fully ruled out.
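The gates in steps 6 and 7 can be written down as a small decision rule. This is one reading of the thresholds above, not an agreed policy; `rollout_decision` and its return labels are hypothetical.

```python
def rollout_decision(quality_wins: int,
                     clear_wins_vs_z_image_turbo: int,
                     dark_aesthetic_ruled_out: bool) -> str:
    """Map side-by-side tallies to a roll-out stage.

    quality_wins: prompts where HiDream won the step-5 comparison.
    clear_wins_vs_z_image_turbo: prompts where HiDream was clearly better
        than mflux Z-Image-Turbo.
    dark_aesthetic_ruled_out: True only once that concern is fully cleared.
    """
    if clear_wins_vs_z_image_turbo >= 5 and dark_aesthetic_ruled_out:
        return "default-candidate"   # step 7: both conditions must hold
    if quality_wins >= 3:
        return "documented-option"   # step 6
    return "experimental"
```

Note that five clear wins alone are not enough: with the dark-aesthetic concern still open, the rule falls back to the step-6 check.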
## What I'd want before merging this
1. ✅ Q8 conversion of HiDream-O1-Image-Dev (DONE)
2. ✅ Stable single-shot text-to-image (DONE; sample images in `sample_outputs/`)
3. 🟡 Showcase pass to characterise quality across genres (RUNNING)
4. ❌ Side-by-side vs Phosphene's existing mflux engines on ≥5 matched prompts (NOT YET; needs the showcase to finish + a parallel run on mflux)
5. ❌ One real agent-flow render that uses HiDream as the anchor engine and
   feeds the result into LTX 2.3 (NOT YET; easy once health_check passes)