Upload README.md with huggingface_hub
Made possible by [Lambda.ai](https://huggingface.co/lambda) ❤️
DeepSeek-V4-Flash-2bit-DQ uses a dynamic mixed-precision quantization policy: most routed MoE expert weights are packed to 2-bit, while sensitive layers and projections are kept at higher-precision 4-bit, 6-bit, or 8-bit. This keeps memory use well below that of the baseline 4-bit checkpoint.
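The policy above can be sketched as a per-layer bit-width rule. The following is a minimal illustration only — the layer-name patterns and bit assignments are assumptions for the sake of example, not the exact rules used to produce this checkpoint:

```python
def quant_bits(layer_path: str) -> int:
    """Pick a quantization bit width from a weight's layer path (illustrative)."""
    # Embeddings and the output head are quality-sensitive: keep 8-bit.
    if "embed" in layer_path or "lm_head" in layer_path:
        return 8
    # Attention projections stay at higher-quality 6-bit.
    if "self_attn" in layer_path:
        return 6
    # Shared experts keep the 4-bit baseline (checked before routed experts,
    # since "shared_experts" also contains the substring "experts").
    if "shared_expert" in layer_path:
        return 4
    # Routed MoE expert weights are packed down to 2-bit.
    if "experts" in layer_path:
        return 2
    # Everything else falls back to the 4-bit baseline.
    return 4

for p in ["model.embed_tokens",
          "layers.3.self_attn.q_proj",
          "layers.7.mlp.experts.12.up_proj"]:
    print(p, "->", quant_bits(p), "bits")
```

In mlx-lm, a predicate of roughly this shape can be supplied when converting a model (via the `quant_predicate` hook of `mlx_lm.convert`, assuming the current API), so different layers end up at different bit widths in one pass.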
## Use with mlx
```bash