Qwen3.5-4B-q4f16_1-MLC

MLC/WebLLM artifacts compiled from the exact Qwen/Qwen3.5-4B checkpoint.

  • Quantization: q4f16_1
  • Context window: 4096
  • Prefill chunk size: 1024
  • Target runtime: WebLLM / WebGPU

This package contains the MLC chat config, tokenizer assets, and quantized parameter shards for browser-side chat inference.
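A minimal loading sketch using WebLLM's `CreateMLCEngine` API. This assumes the model id resolves through WebLLM's app config (for a self-hosted copy of these artifacts you would pass a custom `appConfig` whose `model_list` entry points `model` at this repo and `model_lib` at the compiled WASM); it must run in a browser with WebGPU enabled.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Load the quantized weights, tokenizer, and chat config into a WebGPU engine.
// "Qwen3.5-4B-q4f16_1-MLC" is assumed to be registered in the app config in use.
const engine = await CreateMLCEngine("Qwen3.5-4B-q4f16_1-MLC");

// OpenAI-style chat completion against the in-browser engine.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```

Prompts longer than the 4096-token context window above will be truncated by the runtime, and prefill proceeds in chunks of 1024 tokens.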
