# NLLB-200-Distilled-600M Unsloth GGUF

GGUF-quantized version of facebook/nllb-200-distilled-600M (No Language Left Behind), produced with Unsloth for efficient inference.
## Model

- Base: facebook/nllb-200-distilled-600M
- Format: GGUF (e.g. `fast_quantized`, ~4-bit)
- Use case: multilingual translation (200+ languages), including English ↔ Hausa (`eng_Latn` ↔ `hau_Latn`)
## How it was created

- Loaded the seq2seq model with Unsloth using `AutoModelForSeq2SeqLM`.
- Saved a merged 16-bit HF-format checkpoint.
- Converted that checkpoint to GGUF with Unsloth's `save_pretrained_gguf` (e.g. `fast_quantized`).
- Uploaded the GGUF file(s) to this repo.
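The steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact script used: the output directory names are made up, and the assumption that importing Unsloth first attaches `save_pretrained_gguf` to the loaded model follows Unsloth's usual patching pattern (check Unsloth's docs for the exact wiring with seq2seq models).

```python
BASE_MODEL = "facebook/nllb-200-distilled-600M"
QUANT_METHOD = "fast_quantized"  # ~4-bit, as stated above


def convert_to_gguf(output_dir: str = "nllb-200-distilled-600M-gguf") -> str:
    """Load the base seq2seq model, save a merged 16-bit checkpoint, export GGUF."""
    # Unsloth is imported first so it can patch model classes with its
    # GGUF export helper (assumed wiring; see Unsloth's documentation).
    import unsloth  # noqa: F401
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL, torch_dtype="float16")

    # Step 2: merged 16-bit HF-format checkpoint (hypothetical path)
    model.save_pretrained("nllb-merged-16bit")
    tokenizer.save_pretrained("nllb-merged-16bit")

    # Step 3: GGUF conversion via Unsloth's export helper
    model.save_pretrained_gguf(output_dir, tokenizer, quantization_method=QUANT_METHOD)
    return output_dir
```

The heavy work is deferred into the function body, so importing this module is cheap; running `convert_to_gguf()` downloads the 600M-parameter base model.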
## How to use

- GGUF runtimes: use llama.cpp or any GGUF-compatible runtime that supports this architecture. Download the `.gguf` file(s) from this repo and run inference there.
- Hugging Face Transformers: for 16-bit inference, use the base model facebook/nllb-200-distilled-600M with the standard NLLB pipeline; for translation, set `src_lang` and `tgt_lang` (e.g. `eng_Latn` → `hau_Latn`).
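A minimal sketch of the 16-bit Transformers path (the base model, not the GGUF file), following the standard NLLB recipe; the example sentence is illustrative.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

src_lang, tgt_lang = "eng_Latn", "hau_Latn"  # FLORES-200 language codes

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M", src_lang=src_lang
)
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

inputs = tokenizer("Good morning, my friend.", return_tensors="pt")
tokens = model.generate(
    **inputs,
    # NLLB selects the target language by forcing its code as the first
    # generated token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
    max_length=64,
)
out = tokenizer.batch_decode(tokens, skip_special_tokens=True)[0]
print(out)  # Hausa translation
```

Swap `src_lang`/`tgt_lang` for any other pair of FLORES-200 codes to translate between the other supported languages.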
## License

Same as the base model (see facebook/nllb-200-distilled-600M).