Model Card for Kona2-Bidzer Georgian Definitions

WARNING: This model may occasionally generate highly INAPPROPRIATE content, including offensive words present in its training data. Use with caution.

Model Details

Model Description

This model is a full fine-tune of Kona2-small-3.8B (based on Microsoft Phi-3.5) on the Bidzer.ge Georgian dictionary dataset.

It is designed to act as an intelligent Georgian dictionary assistant. Given a word, it generates a concise definition and a contextual usage example. It is capable of handling standard Georgian vocabulary as well as slang and specific terminology found in the training corpus.

  • Developed by: Antony

  • Model type: Causal Language Model (AutoModelForCausalLM)

  • Language(s) (NLP): Georgian (ka)

  • License: Apache 2.0

  • Finetuned from model: tbilisi-ai-lab/kona2-small-3.8B

Uses

Direct Use

The model is intended to be used for generating dictionary-style entries.

Input format:

<|user|>
განმარტე სიტყვა: [WORD]<|end|>
<|assistant|>

Output format:

[DEFINITION]

მაგალითი: [USAGE EXAMPLE]
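A generated entry can be split back into its two parts by looking for the "მაგალითი:" ("example:") label shown in the output format above. A minimal sketch (the helper name is illustrative, not part of this repository):

```python
def parse_entry(text: str):
    """Split a generated entry into (definition, usage example).

    Assumes the output format above: a definition followed by a segment
    introduced by the label "მაგალითი:" ("example:"). Returns None for
    the example when the label is absent.
    """
    marker = "მაგალითი:"
    if marker in text:
        definition, example = text.split(marker, 1)
        return definition.strip(), example.strip()
    return text.strip(), None

# "A machine that processes information. / Example: The computer stands on the desk."
definition, example = parse_entry(
    "მანქანა, რომელიც ამუშავებს ინფორმაციას.\n\nმაგალითი: კომპიუტერი მაგიდაზე დგას."
)
```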

Downstream Use

This model can be integrated into:

  • Georgian educational tools.
  • Dictionary apps requiring generative explanations.
  • NLP pipelines for semantic analysis of Georgian slang.

Out-of-Scope Use

The model is not designed for:

  • General-purpose chat or open-ended creative writing outside of definitions.
  • Mathematical or coding tasks.

How to Get Started with the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Antony-X/kona2-bidzer-georgian-definitions"

# Load the tokenizer and model; bfloat16 matches the training precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build the prompt in the Phi-3.5 chat format used during fine-tuning.
word = "კომპიუტერი"  # "computer"
prompt = f"<|user|>\nგანმარტე სიტყვა: {word}<|end|>\n<|assistant|>\n"  # "Define the word: ..."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.6)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was trained on the Bidzer Dictionary Dataset, a collection of Georgian words, definitions, and usage examples. The data includes diverse vocabulary ranging from formal terms to slang.

Data Format:

  • Word: Target term.
  • Definition: Explanation of the term.
  • Usage: A sentence demonstrating the word in context.
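Each row can be rendered into the same chat template used at inference time, with the definition and usage example as the assistant turn. A sketch, assuming the field names above map directly onto dataset keys (the actual schema is not published here):

```python
def format_example(row: dict) -> str:
    """Render one dictionary row into a training sample in the
    Phi-3.5 chat format. Field names ("word", "definition", "usage")
    are assumptions based on the data format described above.
    """
    return (
        f"<|user|>\nგანმარტე სიტყვა: {row['word']}<|end|>\n"
        f"<|assistant|>\n{row['definition']}\n\nმაგალითი: {row['usage']}<|end|>"
    )

# "book": "A bound collection of printed pages." / "The book lies on the table."
sample = {
    "word": "წიგნი",
    "definition": "დაბეჭდილი ფურცლების კრებული.",
    "usage": "წიგნი მაგიდაზე დევს.",
}
text = format_example(sample)
```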

Training Procedure

The model was trained using Full Fine-Tuning (updating all weights) rather than LoRA, ensuring maximum adaptation to the dictionary structure.

  • Framework: transformers, trl (SFTTrainer)
  • Precision: bfloat16 (native on H100)
  • Optimizer: adamw_torch_fused

Training Hyperparameters

  • Training regime: bf16 non-mixed precision
  • Batch Size: 16 per device
  • Epochs: 3
  • Learning Rate: 2e-4
  • Max Sequence Length: 512 tokens
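The hyperparameters above map onto trl/transformers training arguments roughly as follows. This is a sketch collected as plain keyword arguments; the exact trainer wiring and any arguments not listed here are not published:

```python
# Hyperparameters from the table above, as they would be passed to
# trl's SFTConfig / transformers TrainingArguments (argument names
# follow those libraries; values come from the card).
training_kwargs = dict(
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-4,
    max_seq_length=512,
    bf16=True,                     # non-mixed bf16 regime
    optim="adamw_torch_fused",
)
```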

Environmental Impact

  • Hardware Type: NVIDIA H100 (80GB VRAM)
  • Cloud Provider: Kaggle
  • Compute Region: Cloud (GPU)
  • Carbon Emitted: Negligible (Short training run)

Technical Specifications

Model Architecture and Objective

The model uses the Phi-3.5-mini architecture (3.8B parameters), optimized for high performance with a smaller parameter count. It was trained with a Causal Language Modeling (CLM) objective tailored for instruction following.
