NLLB-200-Distilled-600M Unsloth GGUF

GGUF-quantized version of facebook/nllb-200-distilled-600M (No Language Left Behind) produced with Unsloth for efficient inference.

Model

  • Base: facebook/nllb-200-distilled-600M
  • Format: GGUF (quantized with Unsloth’s fast_quantized method, roughly 4-bit)
  • Use case: Multilingual translation (200+ languages), including English ↔ Hausa (eng_Latn ↔ hau_Latn)

How it was created

  1. Loaded the seq2seq model with Unsloth using AutoModelForSeq2SeqLM.
  2. Saved a merged 16-bit HF-format checkpoint.
  3. Converted that checkpoint to GGUF with Unsloth’s save_pretrained_gguf (e.g. fast_quantized).
  4. Uploaded the GGUF file(s) to this repo.
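The steps above could be sketched roughly as follows. This is a hedged outline, not the exact script used: the directory names are placeholders, and `save_pretrained_gguf` with `quantization_method="fast_quantized"` is an Unsloth API whose seq2seq support may vary by version, so that step is shown commented out.

```python
# Sketch of the conversion workflow described above (assumed names/paths).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base = "facebook/nllb-200-distilled-600M"

# Step 1: load the seq2seq model and its tokenizer.
model = AutoModelForSeq2SeqLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Step 2: save a merged 16-bit checkpoint in HF format.
merged_dir = "nllb-600m-merged-16bit"  # placeholder directory name
model.save_pretrained(merged_dir)
tokenizer.save_pretrained(merged_dir)

# Step 3 (Unsloth API, requires an Unsloth-patched model; shown for shape only):
# model.save_pretrained_gguf("nllb-600m-gguf", tokenizer,
#                            quantization_method="fast_quantized")
```

Step 4 is then a plain upload of the resulting `.gguf` file(s) to the repo.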

How to use

  • GGUF runtimes: Use llama.cpp or any GGUF-compatible runtime that supports this architecture. Download the .gguf file(s) from this repo and run inference there.
  • Hugging Face Transformers: For 16-bit inference, use the base model facebook/nllb-200-distilled-600M with the standard NLLB pipeline; for translation, set src_lang and tgt_lang (e.g. src_lang="eng_Latn", tgt_lang="hau_Latn").
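A minimal Transformers translation sketch for the English → Hausa direction, following the standard NLLB recipe (the source language is set on the tokenizer, and the target language is selected by forcing the decoder’s first token); the input sentence and `max_length` are illustrative:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"
# src_lang tells the tokenizer to prepend the English (Latin script) language tag.
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Good morning, my friend.", return_tensors="pt")

# Force the decoder to start with the Hausa (Latin script) language token.
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("hau_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
```

Any other FLORES-200 language code pair can be substituted for `eng_Latn`/`hau_Latn`.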

License

Same as the base model (see facebook/nllb-200-distilled-600M).
