Desta 1B Question-Answering v4552 Rosa

This model is a full fine-tuned variant of mewaeltsegay/desta_1b for Tigrinya question answering using the TiQuAD dataset.

Model Details

Model Description

  • Model type: Causal language model (LLaMA-style, decoder-only)
  • Base model: mewaeltsegay/desta_1b
  • Fine-tuning type: Full-parameter fine-tuning (no LoRA adapters)
  • Primary task: Text generation for QA-style prompts
  • Primary language: Tigrinya

Intended Use

  • Tigrinya question answering research and prototyping
  • Educational and low-resource NLP experimentation
  • Baseline model for further domain adaptation

Out-of-Scope Use

  • Medical, legal, financial, or other high-stakes decisions
  • Fully autonomous use without human verification
  • Harmful, abusive, or disinformation content generation

Training Details

Training Data

  • Dataset: TiQuAD (Tigrinya Question Answering Dataset)
  • Train split size: 8,857
  • Validation split size: 1,115
  • Test split size: 1,317

Training Procedure

  • Epochs: 10
  • Learning rate: 5e-5
  • Batch size: 16
  • Max sequence length: 1024
  • Precision: bf16
  • Hardware: NVIDIA H100 NVL
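The preprocessing code is not included in this card; as a hedged sketch, each TiQuAD record was presumably serialized into the same prompt template used at inference (the field names `context`, `question`, and `answers` follow the example record shown in "How to Use"; the trailing-newline target is an assumption consistent with the newline stop token used there):

```python
def build_example(record: dict) -> dict:
    # Prompt layout: context ("ጽሑፍ"), question ("ሕቶ"), then the answer cue "መልሲ:" ("Answer:").
    prompt = f"ጽሑፍ: {record['context']}\n\nሕቶ: {record['question']}\n\nመልሲ:"
    # Target: the answer plus a trailing newline, so the model learns to stop
    # after emitting the answer span.
    return {"prompt": prompt, "target": f" {record['answers']}\n"}
```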

Framework Versions

  • Transformers: 4.57.1
  • Architecture in config: LlamaForCausalLM

Evaluation

Metrics were computed from the evaluation outputs saved during this training run.

Split        EM       F1       N
Validation   42.3690  50.2434  1317
Test         42.4450  49.4997  1317

Metric Notes

  • EM (Exact Match): Strict string-level answer match.
  • F1: Token-overlap F1 between prediction and reference answer.
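The scoring script is not shipped with this card; a minimal SQuAD-style sketch of the two metrics, using whitespace tokenization and simple normalization (the actual evaluation may normalize answers differently):

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    # Lowercase, trim, and collapse internal whitespace before comparing.
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_match(pred: str, ref: str) -> float:
    # 1.0 only when the normalized strings are identical.
    return float(normalize(pred) == normalize(ref))


def token_f1(pred: str, ref: str) -> float:
    # Harmonic mean of token-level precision and recall.
    pred_tokens = normalize(pred).split()
    ref_tokens = normalize(ref).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```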

Bias, Risks, and Limitations

  • The model may hallucinate or produce incorrect factual answers.
  • It may inherit social, cultural, or topical biases from training data.
  • Performance can degrade on inputs outside TiQuAD domains and styles.
  • Prompt sensitivity can lead to output variability for semantically similar questions.

Recommendations

  • Use retrieval or source-grounding for factual QA applications.
  • Add moderation/safety filters before production deployment.
  • Keep a human-in-the-loop for sensitive use cases.
  • Evaluate with domain-specific test sets before real-world rollout.

How to Use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mewaeltsegay/desta_1b_QA_v4552_Rosa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # places the model on GPU automatically when one is available
)
model.eval()

if torch.cuda.is_available():
    print(f"VRAM: {torch.cuda.memory_allocated()/1e9:.1f} GB")
print("✓ Ready")

example = {
    "question": "ሌስተር ሲቲ ብኣርሰናል ብኽንደይ ተሳዒራ?",  # "By how much was Leicester City beaten by Arsenal?"
    "context": "ቅንጥብጣብ\nብቕድሚ ትማሊ ኣብ ዝተኻየደ ግጥም ፕሪመር ሊግ እንግሊዝ፡ ኪውፒኣር ንኣስቶን ቪላ 2ብ0 ረቲዓ። እቲ ውጽኢት፡ ንኪውፒኣር ኣብዚ ዓመተ ስፖርት’ዚ ናይ ፈለማ ነጥቢ ካብ ሜዳኣ ወጻኢ ኮይኑ ተሰኒዱ ኣሎ። = ኣብ ዝሓለፈ መስኮት ምስግጋር ካብ ፊዮረንቲና ናብ ቸልሲ ዝተሰጋገረ ኳድራዶ ደጋፊ ማን ዩናይትድ ምዃኑ ተኣሚኑ። እቲ ኮሎምብያዊ ተጻዋታይ፡ ካብ ወዲ 10 ዓመት ጀሚሩ ብፍቕሪ ናይታ ማንቸስተራዊት ክለብ ከምዝተሓመሰን ሕጂ እውን ነታ ጋንታ ብልቡ ከምዝድግፍ ዓላሚ ሃገራዊት ጋንታ ኮሎምብያ ሓቢሩ። = ማርክ ሩይስ ምስ ቦሩስያ ዶርትመንድ ዘለዎ ውዕል ኣናዊሑ። ምንዋሕ ውዕል ናይቲ ተጻዋታይ ንሓያለ ሃደንቱ ክለባት ሕማቕ ዜና ኮይኑ ኣሎ። = ኣጥቃዒ ኒውካስል ሴም ደ ዮንግ ብሰንኪ ሕማም ሳምቡእ ንኣስታት ሸሞንተ ሳምንታት ካብ ጸወታ ከምዝርሕቕ ተሓቢሩ። = ኣርሰናል ንሌስተር ሲቲ 2ብ1 ኣብ ዝሰዓረትሉ ግጥም፡ ኣከፋፋሊኣ ኣሮን ራምሲ ማህሰይቲ ከምዝገጠሞ ኣሰልጣኒ ኣርሰን ቨንገር ኣፍሊጡ።",
    "answers": "2ብ1",  # i.e. 2-1
    "source": "original",
}

max_length = 1024
newline_ids = tokenizer.encode("\n", add_special_tokens=False)
stop_token_ids = [tokenizer.eos_token_id] + newline_ids



def answer_question(context: str, question: str, max_new_tokens: int = 30) -> str:
    prompt = f"ጽሑፍ: {context}\n\nሕቶ: {question}\n\nመልሲ:"
    inputs = tokenizer(
        prompt, return_tensors="pt",
        truncation=True, max_length=max_length
    )
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_beams=4,
            repetition_penalty=1.3,
            no_repeat_ngram_size=3,
            # LLaMA-style tokenizers often define no pad token; fall back to EOS.
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
            eos_token_id=stop_token_ids,
            early_stopping=True,
        )
    new = out[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new, skip_special_tokens=True).strip()


print("Inference ready.")

# Question: "By how much was Leicester City beaten by Arsenal?" Expected answer: 2ብ1 (2-1)

print(f"ሕቶ: {example['question']} መልሲ: {answer_question(example['context'], example['question'])}")

License

This model is released under Apache-2.0.
Make sure your usage also complies with the license and terms of the base model and TiQuAD data sources.

Citation

If you use this model in your research, please cite:

@misc{desta-1b-2026,
  title={DESTA-1B: Dedicated Eritrean Semitic Text Autoregressor},
  author={Mewael Tsegay Desta},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/mewaeltsegay/desta_1b}}
}

Model Card Contact

For questions, issues, or contributions, please open an issue on the model repository.
