NLLB-200-Distilled-600M Unsloth GGUF

GGUF-quantized version of facebook/nllb-200-distilled-600M (No Language Left Behind) produced with Unsloth for efficient inference.

Model

  • Base: facebook/nllb-200-distilled-600M
  • Format: GGUF (quantized with Unsloth’s fast_quantized method, roughly 4-bit)
  • Use case: Multilingual translation (200+ languages), including English ↔ Hausa (eng_Latn ↔ hau_Latn)

How it was created

  1. Loaded the seq2seq model with Unsloth using AutoModelForSeq2SeqLM.
  2. Saved a merged 16-bit HF-format checkpoint.
  3. Converted that checkpoint to GGUF with Unsloth’s save_pretrained_gguf (e.g. fast_quantized).
  4. Uploaded the GGUF file(s) to this repo.
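The steps above could be sketched roughly as follows. This is a hedged outline, not the exact script used: the directory names are placeholders, and `save_pretrained_gguf` with `quantization_method="fast_quantized"` is an Unsloth API whose seq2seq support may vary by version, so that step is shown commented out.

```python
# Sketch of the conversion workflow described above (assumed names/paths).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base = "facebook/nllb-200-distilled-600M"

# Step 1: load the seq2seq model and its tokenizer.
model = AutoModelForSeq2SeqLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Step 2: save a merged 16-bit checkpoint in HF format.
merged_dir = "nllb-600m-merged-16bit"  # placeholder directory name
model.save_pretrained(merged_dir)
tokenizer.save_pretrained(merged_dir)

# Step 3 (Unsloth API, requires an Unsloth-patched model; shown for shape only):
# model.save_pretrained_gguf("nllb-600m-gguf", tokenizer,
#                            quantization_method="fast_quantized")
```

Step 4 is then a plain upload of the resulting `.gguf` file(s) to the repo.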

How to use

  • GGUF runtimes: Use llama.cpp or any GGUF-compatible runtime that supports this architecture. Download the .gguf file(s) from this repo and run inference there.
  • Hugging Face Transformers: For 16-bit inference, use the base model facebook/nllb-200-distilled-600M with the standard NLLB pipeline; for translation, set src_lang and tgt_lang (e.g. src_lang="eng_Latn", tgt_lang="hau_Latn").
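A minimal Transformers translation sketch for the English → Hausa direction, following the standard NLLB recipe (the source language is set on the tokenizer, and the target language is selected by forcing the decoder’s first token); the input sentence and `max_length` are illustrative:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"
# src_lang tells the tokenizer to prepend the English (Latin script) language tag.
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Good morning, my friend.", return_tensors="pt")

# Force the decoder to start with the Hausa (Latin script) language token.
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("hau_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
```

Any other FLORES-200 language code pair can be substituted for `eng_Latn`/`hau_Latn`.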

License

Same as the base model (see facebook/nllb-200-distilled-600M).
