Karakalpak (kaa) GPT-2 Model

A GPT-2 causal language model for the Karakalpak language (ISO 639-3: kaa).

Model Details

  • Architecture: GPT-2 (decoder-only causal LM)
  • Parameters: 97.0M
  • Layers: 8
  • Embedding Size: 768
  • Vocabulary Size: 52,000
  • Context Length: 512 tokens
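
As a sanity check, these hyperparameters can be read back from the published config. The snippet below assumes the checkpoint uses the standard GPT2Config field names (n_layer, n_embd, vocab_size, n_positions) and recomputes the parameter count directly from the loaded weights.

from transformers import AutoConfig, AutoModelForCausalLM

# Inspect the architecture hyperparameters stored with the checkpoint
config = AutoConfig.from_pretrained("nickoo004/karakalpak-gpt2-v3")
print(config.n_layer, config.n_embd, config.vocab_size, config.n_positions)

# Recompute the parameter count (~97M) from the loaded weights
model = AutoModelForCausalLM.from_pretrained("nickoo004/karakalpak-gpt2-v3")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M params")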

Training

  • Dataset:
  • Training Time: 11:00:02 (hh:mm:ss)
  • Final Loss: 4.1364
  • Hardware: Tesla T4 GPU
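
The training script and dataset are not published, so the following is only a hypothetical sketch of how a GPT-2 model with these hyperparameters could be trained with the Hugging Face Trainer API; every value marked as an assumption is illustrative, not taken from the model card.

from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("nickoo004/karakalpak-gpt2-v3")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Architecture hyperparameters from the Model Details section above
config = GPT2Config(vocab_size=52000, n_positions=512, n_embd=768, n_layer=8)
model = GPT2LMHeadModel(config)

args = TrainingArguments(
    output_dir="karakalpak-gpt2",
    per_device_train_batch_size=8,  # assumption: sized to fit a Tesla T4
    fp16=True,                      # mixed precision, supported on the T4
    num_train_epochs=3,             # assumption: epoch count not reported
)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM

# The dataset is not specified in the card, so it is left as a placeholder:
# trainer = Trainer(model=model, args=args, train_dataset=...,
#                   data_collator=collator)
# trainer.train()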

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its matching tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("nickoo004/karakalpak-gpt2-v3")
tokenizer = AutoTokenizer.from_pretrained("nickoo004/karakalpak-gpt2-v3")

# Encode a Karakalpak prompt and sample a continuation
prompt = "Qaraqalpaqstan"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
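
Sampling behavior can be tuned with the standard generate() parameters; the values below are illustrative, not recommendations from the model authors.

# Nucleus sampling with a temperature; all values here are illustrative
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))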

License

MIT License
