qwen3.5-0.8B-JANGTQ4

This repository contains a JANGTQ4-converted version of qwen3.5-0.8B intended for local inference with vMLX / MLX-style runtimes.

The conversion targets local text-generation workloads. It aims to reduce memory usage and improve practical inference performance on Apple Silicon and compatible local setups.

Model details

  • Base model: qwen3.5-0.8B
  • Format / variant: JANGTQ4
  • Primary use: Local text generation and assistant-style inference
  • Target runtime: vMLX / MLX-compatible local inference stacks
  • Quantization: TQ4-style JANG conversion
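
To give a rough sense of the memory savings a 4-bit conversion is aiming for, the sketch below estimates weight storage for a nominal 0.8B-parameter model. The exact JANGTQ4 layout is not described here, so the group size of 64 and the 16-bit per-group scale are illustrative assumptions, and `estimate_weight_bytes` is a hypothetical helper, not part of any runtime.

```python
def estimate_weight_bytes(n_params: int,
                          bits_per_weight: int = 4,
                          group_size: int = 64,   # assumption: not taken from the JANGTQ4 spec
                          scale_bits: int = 16    # assumption: one fp16 scale per group
                          ) -> int:
    """Rough size of quantized weights: packed values plus one scale per group."""
    groups = -(-n_params // group_size)           # ceiling division
    total_bits = n_params * bits_per_weight + groups * scale_bits
    return (total_bits + 7) // 8                  # round up to whole bytes


if __name__ == "__main__":
    nominal_params = 800_000_000                  # "0.8B" taken from the model name
    quantized = estimate_weight_bytes(nominal_params)
    fp16_baseline = nominal_params * 2            # 2 bytes per weight in FP16
    print(f"4-bit estimate: {quantized / 1e6:.0f} MB")
    print(f"FP16 baseline:  {fp16_baseline / 1e6:.0f} MB")
```

This counts weights only. Activations, the KV cache, and any layers left unquantized add to the real footprint, so treat the numbers as an order-of-magnitude guide.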

Intended use

This model is intended for local experimentation, development, and inference workflows where a compact Qwen-family model is useful. It may be suitable for:

  • Lightweight assistant tasks
  • Local coding-agent experiments
  • Prompt-format and cache-behavior testing
  • Low-memory local inference
  • Fast iteration on Apple Silicon systems

Limitations

This is a converted / quantized model and may differ from the original base model in quality, numerical behavior, formatting behavior, and edge-case reliability. Small models may be more prone to instruction-following mistakes, hallucinations, malformed tool calls, or repetitive output than larger models.

Please test carefully for your own use case.
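
As a starting point for that testing, here is a minimal, runtime-agnostic sketch of two checks for the failure modes mentioned above: repetitive output and malformed tool calls. Both helpers are hypothetical. The tool-call shape (a JSON object with a string `name` and an object `arguments`) is an assumption, so adapt it to whatever schema your agent actually uses.

```python
import json
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that occur more than once (0.0 = no repetition)."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


def is_valid_tool_call(raw: str) -> bool:
    """Check that raw output parses as JSON with the assumed name/arguments shape."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(obj, dict)
        and isinstance(obj.get("name"), str)
        and isinstance(obj.get("arguments"), dict)
    )
```

Running these over a batch of generations from your own prompts gives a quick signal on whether the quantized model is degrading in ways that matter for your workload. A sensible threshold for the repetition ratio depends on the task and has to be chosen empirically.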

License

This repository contains a converted model derived from the upstream Qwen model. Use of this model is subject to the license terms of the original base model and any applicable restrictions from the upstream provider. Please review the upstream model license before using or redistributing this conversion.

Safetensors · Model size: 0.3B params · Tensor types: U32, F16 · MLX