# Desta 1B Question-Answering v4552 Rosa
This model is a fully fine-tuned variant of mewaeltsegay/desta_1b for Tigrinya question answering on the TiQuAD dataset.
## Model Details

### Model Description

- Model type: Causal language model (LLaMA-style, decoder-only)
- Base model: mewaeltsegay/desta_1b
- Fine-tuning type: Full-parameter fine-tuning (no LoRA adapters)
- Primary task: Text generation for QA-style prompts
- Primary language: Tigrinya
## Intended Use
- Tigrinya question answering research and prototyping
- Educational and low-resource NLP experimentation
- Baseline model for further domain adaptation
## Out-of-Scope Use
- Medical, legal, financial, or other high-stakes decisions
- Fully autonomous use without human verification
- Harmful, abusive, or disinformation content generation
## Training Details

### Training Data
- Dataset: TiQuAD (Tigrinya Question Answering Dataset)
- Train split size: 8,857
- Validation split size: 1,115
- Test split size: 1,317
### Training Procedure
- Epochs: 10
- Learning rate: 5e-5
- Batch size: 16
- Max sequence length: 1024
- Precision: bf16
- Hardware: NVIDIA H100 NVL
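The hyperparameters above map onto a Hugging Face `TrainingArguments` configuration roughly as follows. This is a sketch, not the actual training script: `output_dir` is a placeholder, and the max sequence length of 1024 is enforced at tokenization time rather than here.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters; output_dir and
# save/eval cadence are placeholders, not taken from the training run.
args = TrainingArguments(
    output_dir="desta_1b_tiquad",       # placeholder
    num_train_epochs=10,
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    bf16=True,
)
```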
### Framework Versions

- Transformers: 4.57.1
- Architecture in config: `LlamaForCausalLM`
## Evaluation
Metrics were computed from the saved evaluation outputs in this training run.
| Split | EM | F1 | N |
|---|---|---|---|
| Validation | 42.3690 | 50.2434 | 1317 |
| Test | 42.4450 | 49.4997 | 1317 |
### Metric Notes
- EM (Exact Match): Strict string-level answer match.
- F1: Token-overlap F1 between prediction and reference answer.
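Both metrics follow the standard SQuAD-style computation. A minimal sketch is below; it may differ in detail from the exact script used to produce the numbers above, and the English-article removal step from the original SQuAD evaluator is omitted since it does not apply to Tigrinya.

```python
import re
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase, strip punctuation, and collapse whitespace before comparing.
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def exact_match(pred: str, ref: str) -> float:
    # EM: 1.0 only if the normalized strings are identical.
    return float(normalize(pred) == normalize(ref))

def token_f1(pred: str, ref: str) -> float:
    # F1 over the multiset of overlapping tokens.
    pred_toks = normalize(pred).split()
    ref_toks = normalize(ref).split()
    if not pred_toks or not ref_toks:
        return float(pred_toks == ref_toks)
    overlap = sum((Counter(pred_toks) & Counter(ref_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(ref_toks)
    return 2 * precision * recall / (precision + recall)
```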
## Bias, Risks, and Limitations
- The model may hallucinate or produce incorrect factual answers.
- It may inherit social, cultural, or topical biases from training data.
- Performance can degrade on inputs outside TiQuAD domains and styles.
- Prompt sensitivity can lead to output variability for semantically similar questions.
### Recommendations
- Use retrieval or source-grounding for factual QA applications.
- Add moderation/safety filters before production deployment.
- Keep a human-in-the-loop for sensitive use cases.
- Evaluate with domain-specific test sets before real-world rollout.
## How to Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mewaeltsegay/desta_1b_QA_v4552_Rosa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" already places the weights on the available device(s),
# so no extra .to(device) call is needed (and calling .to() on a dispatched
# model can raise an error).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
if torch.cuda.is_available():
    print(f"VRAM: {torch.cuda.memory_allocated()/1e9:.1f} GB")
print("✓ Ready")

example = {"question": "ሌስተር ሲቲ ብኣርሰናል ብኽንደይ ተሳዒራ?", "context": "ቅንጥብጣብ\nብቕድሚ ትማሊ ኣብ ዝተኻየደ ግጥም ፕሪመር ሊግ እንግሊዝ፡ ኪውፒኣር ንኣስቶን ቪላ 2ብ0 ረቲዓ። እቲ ውጽኢት፡ ንኪውፒኣር ኣብዚ ዓመተ ስፖርት’ዚ ናይ ፈለማ ነጥቢ ካብ ሜዳኣ ወጻኢ ኮይኑ ተሰኒዱ ኣሎ። = ኣብ ዝሓለፈ መስኮት ምስግጋር ካብ ፊዮረንቲና ናብ ቸልሲ ዝተሰጋገረ ኳድራዶ ደጋፊ ማን ዩናይትድ ምዃኑ ተኣሚኑ። እቲ ኮሎምብያዊ ተጻዋታይ፡ ካብ ወዲ 10 ዓመት ጀሚሩ ብፍቕሪ ናይታ ማንቸስተራዊት ክለብ ከምዝተሓመሰን ሕጂ እውን ነታ ጋንታ ብልቡ ከምዝድግፍ ዓላሚ ሃገራዊት ጋንታ ኮሎምብያ ሓቢሩ። = ማርክ ሩይስ ምስ ቦሩስያ ዶርትመንድ ዘለዎ ውዕል ኣናዊሑ። ምንዋሕ ውዕል ናይቲ ተጻዋታይ ንሓያለ ሃደንቱ ክለባት ሕማቕ ዜና ኮይኑ ኣሎ። = ኣጥቃዒ ኒውካስል ሴም ደ ዮንግ ብሰንኪ ሕማም ሳምቡእ ንኣስታት ሸሞንተ ሳምንታት ካብ ጸወታ ከምዝርሕቕ ተሓቢሩ። = ኣርሰናል ንሌስተር ሲቲ 2ብ1 ኣብ ዝሰዓረትሉ ግጥም፡ ኣከፋፋሊኣ ኣሮን ራምሲ ማህሰይቲ ከምዝገጠሞ ኣሰልጣኒ ኣርሰን ቨንገር ኣፍሊጡ።", "answers": "2ብ1", "source": "original"}

max_length = 1024
# Stop generation at EOS or at a newline, since answers are single-line spans.
newline_ids = tokenizer.encode("\n", add_special_tokens=False)
stop_token_ids = [tokenizer.eos_token_id] + newline_ids

def answer_question(context: str, question: str, max_new_tokens: int = 30) -> str:
    # Prompt labels: ጽሑፍ = "passage", ሕቶ = "question", መልሲ = "answer".
    prompt = f"ጽሑፍ: {context}\n\nሕቶ: {question}\n\nመልሲ:"
    inputs = tokenizer(
        prompt, return_tensors="pt",
        truncation=True, max_length=max_length,
    )
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_beams=4,
            repetition_penalty=1.3,
            no_repeat_ngram_size=3,
            # Fall back to EOS for padding if the tokenizer defines no pad token.
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
            eos_token_id=stop_token_ids,
            early_stopping=True,
        )
    # Decode only the newly generated tokens, not the echoed prompt.
    new = out[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new, skip_special_tokens=True).strip()

print("Inference ready.")

# Expected: ሕቶ: ሌስተር ሲቲ ብኣርሰናል ብኽንደይ ተሳዒራ? መልሲ: 2ብ1
print(f"ሕቶ: {example['question']} መልሲ: {answer_question(example['context'], example['question'])}")
```
## License
This model is released under Apache-2.0.
Make sure your usage also complies with the license and terms of the base model and TiQuAD data sources.
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{desta-1b-2026,
  title={DESTA-1B: Dedicated Eritrean Semitic Text Autoregressor},
  author={Mewael Tsegay Desta},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/mewaeltsegay/desta_1b}}
}
```
## Acknowledgments
- Base model: TinyLlama
- Tokenizer: mewaeltsegay/tokenizer_tigrinya
- Training dataset: fgaim/tiquad
## Model Card Contact
For questions, issues, or contributions, please open an issue on the model repository.