NagaNLP-NER is a Named Entity Recognition model for Nagamese (Naga Pidgin). It is based on XLM-RoBERTa and fine-tuned to identify four entity types: Persons, Locations, Organizations, and Miscellaneous.
This model is part of the NagaNLP project, which aims to provide foundational NLP resources for the low-resource languages of Nagaland.
The model was fine-tuned on a manually annotated corpus containing 214 sentences (approx. 4,800 tokens).
This model is intended for:
You can use this model with the Hugging Face pipeline:
from transformers import pipeline
# Load the pipeline
ner_pipeline = pipeline("ner", model="agnivamaiti/naganlp-ner", aggregation_strategy="simple")
# Inference
text = "Etu retreating monsoon normally October mahina start hoi."
results = ner_pipeline(text)
# Print results
for entity in results:
    print(entity)
# Expected Output: {'entity_group': 'MISC', 'word': 'monsoon', ...}, {'entity_group': 'MISC', 'word': 'October', ...}
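With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with `entity_group`, `word`, `score`, `start`, and `end` keys. The sketch below shows one way to post-process that list by dropping low-confidence spans and grouping the remaining words by entity type. The helper name `group_entities`, the `min_score` threshold, and the sample values (including the `LOC` label and all scores) are illustrative assumptions, not output from the model.

```python
from collections import defaultdict


def group_entities(results, min_score=0.5):
    """Group aggregated NER pipeline output by entity type.

    Spans whose score is below `min_score` are dropped.
    """
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Illustrative input only: labels, scores, and offsets are made up
# to mimic the pipeline's output shape; they are not real predictions.
sample_results = [
    {"entity_group": "MISC", "word": "monsoon", "score": 0.91, "start": 15, "end": 22},
    {"entity_group": "MISC", "word": "October", "score": 0.88, "start": 32, "end": 39},
    {"entity_group": "LOC", "word": "Etu", "score": 0.21, "start": 0, "end": 3},
]

grouped = group_entities(sample_results, min_score=0.5)
print(grouped)
```

In practice you would pass `ner_pipeline(text)` in place of `sample_results`. A suitable threshold depends on the application and would need to be tuned on held-out data.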
If you use this model, please cite the associated NagaNLP research paper. Citation details will be added upon publication.
Base model: FacebookAI/xlm-roberta-base