Desta 1B Question-Answering v4552 Rosa

This model is a full fine-tuned variant of mewaeltsegay/desta_1b for Tigrinya question answering using the TiQuAD dataset.

Model Details

Model Description

  • Model type: Causal language model (LLaMA-style, decoder-only)
  • Base model: mewaeltsegay/desta_1b
  • Fine-tuning type: Full-parameter fine-tuning (no LoRA adapters)
  • Primary task: Text generation for QA-style prompts
  • Primary language: Tigrinya

Intended Use

  • Tigrinya question answering research and prototyping
  • Educational and low-resource NLP experimentation
  • Baseline model for further domain adaptation

Out-of-Scope Use

  • Medical, legal, financial, or other high-stakes decisions
  • Fully autonomous use without human verification
  • Harmful, abusive, or disinformation content generation

Training Details

Training Data

  • Dataset: TiQuAD (Tigrinya Question Answering Dataset)
  • Train split size: 8,857
  • Validation split size: 1,115
  • Test split size: 1,317

Training Procedure

  • Epochs: 10
  • Learning rate: 5e-5
  • Batch size: 16
  • Max sequence length: 1024
  • Precision: bf16
  • Hardware: NVIDIA H100 NVL
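The preprocessing code is not included in this card; as a hedged sketch, each TiQuAD record was presumably serialized into the same prompt template used at inference (the field names `context`, `question`, and `answers` follow the example record shown in "How to Use"; the trailing-newline target is an assumption consistent with the newline stop token used there):

```python
def build_example(record: dict) -> dict:
    # Prompt layout: context ("ጽሑፍ"), question ("ሕቶ"), then the answer cue "መልሲ:" ("Answer:").
    prompt = f"ጽሑፍ: {record['context']}\n\nሕቶ: {record['question']}\n\nመልሲ:"
    # Target: the answer plus a trailing newline, so the model learns to stop
    # after emitting the answer span.
    return {"prompt": prompt, "target": f" {record['answers']}\n"}
```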

Framework Versions

  • Transformers: 4.57.1
  • Architecture in config: LlamaForCausalLM

Evaluation

Metrics were computed from the evaluation outputs saved during this training run.

Split        EM       F1       N
Validation   42.3690  50.2434  1317
Test         42.4450  49.4997  1317

Metric Notes

  • EM (Exact Match): Strict string-level answer match.
  • F1: Token-overlap F1 between prediction and reference answer.
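The scoring script is not shipped with this card; a minimal SQuAD-style sketch of the two metrics, using whitespace tokenization and simple normalization (the actual evaluation may normalize answers differently):

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    # Lowercase, trim, and collapse internal whitespace before comparing.
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_match(pred: str, ref: str) -> float:
    # 1.0 only when the normalized strings are identical.
    return float(normalize(pred) == normalize(ref))


def token_f1(pred: str, ref: str) -> float:
    # Harmonic mean of token-level precision and recall.
    pred_tokens = normalize(pred).split()
    ref_tokens = normalize(ref).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```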

Bias, Risks, and Limitations

  • The model may hallucinate or produce incorrect factual answers.
  • It may inherit social, cultural, or topical biases from training data.
  • Performance can degrade on inputs outside TiQuAD domains and styles.
  • Prompt sensitivity can lead to output variability for semantically similar questions.

Recommendations

  • Use retrieval or source-grounding for factual QA applications.
  • Add moderation/safety filters before production deployment.
  • Keep a human-in-the-loop for sensitive use cases.
  • Evaluate with domain-specific test sets before real-world rollout.

How to Use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mewaeltsegay/desta_1b_QA_v4552_Rosa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # places the model on GPU automatically when one is available
)
model.eval()

if torch.cuda.is_available():
    print(f"VRAM: {torch.cuda.memory_allocated()/1e9:.1f} GB")
print("✓ Ready")

example = {
    "question": "ሌስተር ሲቲ ብኣርሰናል ብኽንደይ ተሳዒራ?",  # "By how much was Leicester City beaten by Arsenal?"
    "context": "ቅንጥብጣብ\nብቕድሚ ትማሊ ኣብ ዝተኻየደ ግጥም ፕሪመር ሊግ እንግሊዝ፡ ኪውፒኣር ንኣስቶን ቪላ 2ብ0 ረቲዓ። እቲ ውጽኢት፡ ንኪውፒኣር ኣብዚ ዓመተ ስፖርት’ዚ ናይ ፈለማ ነጥቢ ካብ ሜዳኣ ወጻኢ ኮይኑ ተሰኒዱ ኣሎ። = ኣብ ዝሓለፈ መስኮት ምስግጋር ካብ ፊዮረንቲና ናብ ቸልሲ ዝተሰጋገረ ኳድራዶ ደጋፊ ማን ዩናይትድ ምዃኑ ተኣሚኑ። እቲ ኮሎምብያዊ ተጻዋታይ፡ ካብ ወዲ 10 ዓመት ጀሚሩ ብፍቕሪ ናይታ ማንቸስተራዊት ክለብ ከምዝተሓመሰን ሕጂ እውን ነታ ጋንታ ብልቡ ከምዝድግፍ ዓላሚ ሃገራዊት ጋንታ ኮሎምብያ ሓቢሩ። = ማርክ ሩይስ ምስ ቦሩስያ ዶርትመንድ ዘለዎ ውዕል ኣናዊሑ። ምንዋሕ ውዕል ናይቲ ተጻዋታይ ንሓያለ ሃደንቱ ክለባት ሕማቕ ዜና ኮይኑ ኣሎ። = ኣጥቃዒ ኒውካስል ሴም ደ ዮንግ ብሰንኪ ሕማም ሳምቡእ ንኣስታት ሸሞንተ ሳምንታት ካብ ጸወታ ከምዝርሕቕ ተሓቢሩ። = ኣርሰናል ንሌስተር ሲቲ 2ብ1 ኣብ ዝሰዓረትሉ ግጥም፡ ኣከፋፋሊኣ ኣሮን ራምሲ ማህሰይቲ ከምዝገጠሞ ኣሰልጣኒ ኣርሰን ቨንገር ኣፍሊጡ።",
    "answers": "2ብ1",  # i.e. 2-1
    "source": "original",
}

max_length = 1024
newline_ids = tokenizer.encode("\n", add_special_tokens=False)
stop_token_ids = [tokenizer.eos_token_id] + newline_ids



def answer_question(context: str, question: str, max_new_tokens: int = 30) -> str:
    prompt = f"ጽሑፍ: {context}\n\nሕቶ: {question}\n\nመልሲ:"
    inputs = tokenizer(
        prompt, return_tensors="pt",
        truncation=True, max_length=max_length
    )
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_beams=4,
            repetition_penalty=1.3,
            no_repeat_ngram_size=3,
            # LLaMA-style tokenizers often define no pad token; fall back to EOS.
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
            eos_token_id=stop_token_ids,
            early_stopping=True,
        )
    new = out[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new, skip_special_tokens=True).strip()


print("Inference ready.")

# Question: "By how much was Leicester City beaten by Arsenal?" Expected answer: 2ብ1 (2-1)

print(f"ሕቶ: {example['question']} መልሲ: {answer_question(example['context'], example['question'])}")

License

This model is released under Apache-2.0.
Make sure your usage also complies with the license and terms of the base model and TiQuAD data sources.

Citation

If you use this model in your research, please cite:

@misc{desta-1b-2026,
  title={DESTA-1B: Dedicated Eritrean Semitic Text Autoregressor},
  author={Mewael Tsegay Desta},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/mewaeltsegay/desta_1b}}
}

Model Card Contact

For questions, issues, or contributions, please open an issue on the model repository.
