# Karakalpak (kaa) GPT-2 Model

A GPT-2 causal language model for the Karakalpak language.
## Model Details

- Architecture: GPT-2
- Parameters: 97.0M
- Layers: 8
- Embedding dimension: 768
- Vocab Size: 52,000
- Context Length: 512 tokens
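
As a rough illustration (not the original training script), the hyperparameters above correspond to a `GPT2Config` like the one below. `n_head=12` is an assumption, since the card does not state the head count; it is the standard GPT-2 value for a 768-dimensional model, and with it the parameter count comes out to roughly 97M:

```python
# Sketch only: reconstructs a config matching the hyperparameters listed above.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=52000,  # Vocab Size
    n_positions=512,   # Context Length
    n_embd=768,        # Embedding dimension
    n_layer=8,         # Layers
    n_head=12,         # assumption: not stated on the card
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")  # ~97.0M with these settings
```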
## Training

- Dataset:
- Training Time: 11:00:02 (hh:mm:ss)
- Final Loss: 4.1364
- Hardware: Tesla T4 GPU
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("nickoo004/karakalpak-gpt2-v3")
tokenizer = AutoTokenizer.from_pretrained("nickoo004/karakalpak-gpt2-v3")

# Encode a Karakalpak prompt and sample a continuation
# (do_sample=True makes the output stochastic, so it varies between runs)
prompt = "Qaraqalpaqstan"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50, do_sample=True,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
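
For quick experiments, the high-level `pipeline` API should also work with this checkpoint; the following is a minimal sketch, not part of the original card:

```python
from transformers import pipeline

# Loads model and tokenizer in one call; generation kwargs pass through to generate()
generator = pipeline("text-generation", model="nickoo004/karakalpak-gpt2-v3")
print(generator("Qaraqalpaqstan", max_new_tokens=50, do_sample=True)[0]["generated_text"])
```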
## License
MIT License