# Qwen3.5-4B-q4f16_1-MLC
MLC/WebLLM artifacts compiled directly from the Qwen/Qwen3.5-4B checkpoint.
- Quantization: q4f16_1
- Context window: 4096
- Prefill chunk size: 1024
- Target runtime: WebLLM / WebGPU
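The values above are typically recorded in the package's `mlc-chat-config.json`. A minimal sketch of the relevant fields, following MLC naming conventions (the actual file contains additional fields such as the conversation template and tokenizer list):

```json
{
  "quantization": "q4f16_1",
  "context_window_size": 4096,
  "prefill_chunk_size": 1024
}
```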
This package contains the MLC chat config, tokenizer assets, and quantized parameter shards for browser-side chat inference.
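A minimal usage sketch with the WebLLM JavaScript API. This assumes a WebGPU-capable browser and that the model ID is resolvable by the engine (either via WebLLM's prebuilt model list or a custom `appConfig` pointing at this repository); it is illustrative, not a tested deployment recipe:

```javascript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Load the quantized shards, tokenizer, and chat config into the browser.
// Assumption: "Qwen3.5-4B-q4f16_1-MLC" is registered in the engine's model list.
const engine = await CreateMLCEngine("Qwen3.5-4B-q4f16_1-MLC", {
  initProgressCallback: (report) => console.log(report.text),
});

// OpenAI-style chat completion, executed entirely on the client via WebGPU.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```

Because inference runs client-side, the first load downloads the parameter shards; subsequent loads are served from the browser cache.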