---
license: apache-2.0
language:
- en
- pl
- multilingual
base_model:
- huihui-ai/Huihui4-48B-A4B-abliterated
library_name: mlx
pipeline_tag: image-text-to-text
tags:
- mlx
- apple-silicon
- gemma
- gemma4
- gemma-4
- abliterated
- uncensored
- moe
- multimodal
- vision
- image-text-to-text
- vmlx
- nvfp4
- 4bit
- quantized
- huihui
inference: false
---
# Huihui4-48B-A4B-vmlx-nvfp4
`Huihui4-48B-A4B-vmlx-nvfp4` is an MLX vision-language checkpoint derived from `huihui-ai/Huihui4-48B-A4B-abliterated`, packaged for local multimodal prompting on Apple Silicon.
## Intended use
- Local image-and-text reasoning on Apple Silicon
- Document, screenshot, chart, and visual question answering experiments
- Operator-controlled multimodal prototyping where hosted inference is not desired
## Out of scope
- Safety-critical decisions without domain expert review
- Claims of benchmark superiority not backed by published evaluation data
- Non-MLX runtime guarantees; this card documents the shipped HF checkpoint, not every possible serving stack
- High-stakes visual interpretation without human review
## Training and conversion metadata
| Parameter | Value |
|---|---|
| Repository | `LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4` |
| Base model | `huihui-ai/Huihui4-48B-A4B-abliterated` |
| Task | `image-text-to-text` |
| Library | `mlx` |
| Format | MLX / Apple Silicon checkpoint |
| Quantization | NVFP4 |
| Architecture | Gemma4ForConditionalGeneration |
| Model files | 6 |
| Config model_type | `gemma4` |
This card only reports metadata present in the Hugging Face repository, existing card frontmatter, or public config files. Missing benchmark, dataset, or training-run details are left explicit rather than reconstructed.
## Tested inference path
> **Inference for this checkpoint has been tested with [`LibraxisAI/mlx-batch-server`](https://github.com/LibraxisAI/mlx-batch-server).**\
> This is the recommended tested path for operator-controlled local inference on Apple Silicon.
| Aspect | Status |
|---|---|
| Tested runtime | `LibraxisAI/mlx-batch-server` |
| Target hardware | Apple Silicon |
| Inference mode | Local / self-hosted |
| Hugging Face Hosted Inference | Disabled for this repository (`inference: false`) |
This does not claim compatibility with every possible serving stack. It documents the path that has been exercised for this published checkpoint.
## Usage
### CLI
```bash
pip install mlx-vlm
python -m mlx_vlm.generate \
--model LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4 \
--image image.jpg \
--prompt "Summarize the key signals in this document and list the next action items." \
--max-tokens 256
```
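For scripting several images through the same CLI, a small helper that assembles the `mlx_vlm.generate` invocation can be handy. This is a minimal sketch that only reuses the flags shown above; the helper name is illustrative, not part of `mlx-vlm`.

```python
import sys

MODEL = "LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4"

def build_generate_cmd(image_path, prompt, max_tokens=256):
    """Assemble the mlx_vlm.generate CLI call using the flags shown above."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", MODEL,
        "--image", image_path,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]

cmd = build_generate_cmd("image.jpg", "Describe this chart.")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```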
### Python
```python
from mlx_vlm import generate, load
model, processor = load("LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4")
response = generate(
model,
processor,
prompt="Summarize the key signals in this document and list the next action items.",
image="image.jpg",
max_tokens=256,
)
print(response)
```
## Example output
No public sample output is currently declared for this checkpoint.
## Quantization notes
| Aspect | Original/base checkpoint | This checkpoint |
|---|---|---|
| Lineage | `huihui-ai/Huihui4-48B-A4B-abliterated` | `LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4` |
| Runtime target | Upstream runtime format | MLX on Apple Silicon |
| Quantization | Base precision or upstream-declared format | NVFP4 |
| Published quality delta | Not declared in public metadata | Not declared in public metadata |
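For intuition about what NVFP4 means here: it is a block-scaled 4-bit floating-point format, where weights are snapped to a small set of FP4 (E2M1) magnitudes and each block carries its own scale. The toy round-trip below illustrates the idea in pure Python; it is a conceptual sketch, not the exact packing or scale encoding used by this checkpoint.

```python
# Toy illustration of block-scaled 4-bit float (FP4 E2M1) quantization.
# Conceptual only: the real NVFP4 packing and scale format may differ.
E2M1_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # positive FP4 magnitudes

def quantize_block(block):
    """Scale the block so its max magnitude maps to 6.0, then snap to FP4 levels."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 6.0
    quantized = []
    for x in block:
        mag = min(E2M1_LEVELS, key=lambda lvl: abs(abs(x) / scale - lvl))
        quantized.append(mag if x >= 0 else -mag)
    return quantized, scale

def dequantize_block(quantized, scale):
    return [v * scale for v in quantized]

weights = [0.12, -0.07, 0.30, 0.01]
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)
err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, round(err, 4))
```

The per-block scale is why 4-bit formats like this tend to lose much less accuracy than a single global scale would: outliers in one block do not crush the resolution of every other block.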
## Limitations
- No public benchmarks for this checkpoint are declared in the model metadata, and this card makes no benchmark claims beyond what the frontmatter lists.
- Validate outputs on your own domain data before relying on this checkpoint.
- Memory use and speed depend heavily on the exact Apple Silicon generation, unified-memory size, and prompt length.
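As a rough back-of-envelope for the last point: a 48B-parameter model at about 4 bits per weight needs on the order of 25 GiB for weights alone, before KV cache and activations. The sketch below assumes ~4.5 bits per weight on average to account for quantization scales; the actual figure depends on which tensors stay in higher precision. Note that the "A4B" (about 4B active parameters per token) reduces compute, not resident memory, since all experts must stay loaded.

```python
# Rough weight-memory estimate for a 48B-parameter NVFP4 checkpoint.
# Assumption: ~4.5 bits/weight on average (4-bit values plus block scales);
# real usage varies with which tensors are kept in higher precision.
total_params = 48e9          # all experts count toward resident memory,
                             # even though only ~4B params are active per token
bits_per_weight = 4.5
weight_gib = total_params * bits_per_weight / 8 / 2**30
print(f"~{weight_gib:.1f} GiB for weights alone")  # → ~25.1 GiB
```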
## License
`apache-2.0`. Also review the license of the declared base model, `huihui-ai/Huihui4-48B-A4B-abliterated`, before redistribution.
## Citation
```bibtex
@misc{libraxisai-huihui4-48b-a4b-vmlx-nvfp4,
title = {Huihui4-48B-A4B-vmlx-nvfp4},
author = {LibraxisAI},
year = {2026},
howpublished = {\url{https://huggingface.co/LibraxisAI/Huihui4-48B-A4B-vmlx-nvfp4}},
note = {MLX checkpoint published by LibraxisAI}
}
```
---
πš…πš’πš‹πšŽπšŒπš›πšŠπšπšπšŽπš. with AI Agents by VetCoders (c)2024-2026 LibraxisAI