# Model Card for Kona2-Bidzer Georgian Definitions
> **Warning:** This model may occasionally generate highly inappropriate content, including offensive words present in its training data.
## Model Details

### Model Description
This model is a full fine-tune of Kona2-small-3.8B (based on Microsoft Phi-3.5) on the Bidzer.ge Georgian dictionary dataset.
It is designed to act as an intelligent Georgian dictionary assistant. Given a word, it generates a concise definition and a contextual usage example. It is capable of handling standard Georgian vocabulary as well as slang and specific terminology found in the training corpus.
- **Developed by:** Antony
- **Model type:** Causal language model (`AutoModelForCausalLM`)
- **Language(s) (NLP):** Georgian (ka)
- **License:** Apache 2.0
- **Finetuned from model:** tbilisi-ai-lab/kona2-small-3.8B
## Uses

### Direct Use
The model is intended to be used for generating dictionary-style entries.
Input format:

```
<|user|>
განმარტე სიტყვა: [WORD]<|end|>
<|assistant|>
```

Output format:

```
[DEFINITION]
მაგალითი: [USAGE EXAMPLE]
```
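The input format above can be produced with a small helper. This is a sketch; the function name `build_prompt` is not part of the model's API, only the prompt template itself comes from this card:

```python
def build_prompt(word: str) -> str:
    """Wrap a Georgian word in the Phi-3.5 chat markers this model was fine-tuned on."""
    # "განმარტე სიტყვა:" means "define the word:"
    return f"<|user|>\nგანმარტე სიტყვა: {word}<|end|>\n<|assistant|>\n"
```

The returned string can be passed directly to the tokenizer, as in the getting-started snippet below.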
### Downstream Use
This model can be integrated into:
- Georgian educational tools.
- Dictionary apps requiring generative explanations.
- NLP pipelines for semantic analysis of Georgian slang.
### Out-of-Scope Use
The model is not designed for:
- General-purpose chat or open-ended creative writing outside of definitions.
- Mathematical or coding tasks.
## How to Get Started with the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Antony-X/kona2-bidzer-georgian-definitions"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build the prompt in the exact chat format the model was fine-tuned on
word = "კომპიუტერი"
prompt = f"<|user|>\nგანმარტე სიტყვა: {word}<|end|>\n<|assistant|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Details

### Training Data
The model was trained on the Bidzer Dictionary Dataset, a collection of Georgian words, definitions, and usage examples. The data includes diverse vocabulary ranging from formal terms to slang.
Data format:
- **Word:** Target term.
- **Definition:** Explanation of the term.
- **Usage:** A sentence demonstrating the word in context.
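To illustrate how a record with these three fields could be rendered into the chat format shown above, here is a minimal sketch. The field names `word`, `definition`, and `usage` and the function `format_example` are assumptions for illustration; the actual preprocessing script is not published:

```python
def format_example(record: dict) -> str:
    """Render one dictionary record into the chat format used for fine-tuning.

    Assumed record fields: "word", "definition", "usage".
    """
    return (
        f"<|user|>\nგანმარტე სიტყვა: {record['word']}<|end|>\n"
        f"<|assistant|>\n{record['definition']}\nმაგალითი: {record['usage']}<|end|>"
    )

sample = {
    "word": "წიგნი",  # "book"
    "definition": "დაბეჭდილი ან ხელნაწერი ფურცლების კრებული.",
    "usage": "მან საინტერესო წიგნი წაიკითხა.",
}
print(format_example(sample))
```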
### Training Procedure
The model was trained with full fine-tuning (updating all weights) rather than LoRA, for maximum adaptation to the dictionary structure.
- Framework: transformers, trl (SFTTrainer)
- Precision: bfloat16 (Native H100 optimization)
- Optimizer: adamw_torch_fused
#### Training Hyperparameters
- Training regime: bf16 non-mixed precision
- Batch Size: 16 per device
- Epochs: 3
- Learning Rate: 2e-4
- Max Sequence Length: 512 tokens
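As a minimal sketch, the hyperparameters above might map onto `trl`'s `SFTConfig` roughly as follows. The actual training script is not published, and `output_dir` is a placeholder:

```python
from trl import SFTConfig

# Hypothetical reconstruction of the configuration described above
config = SFTConfig(
    output_dir="kona2-bidzer-georgian-definitions",  # placeholder
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-4,
    max_seq_length=512,
    bf16=True,                    # bf16 non-mixed precision on H100
    optim="adamw_torch_fused",
)
```

This config would then be passed to `SFTTrainer` together with the model, tokenizer, and formatted dataset.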
## Environmental Impact
- Hardware Type: NVIDIA H100 (80GB VRAM)
- Cloud Provider: Kaggle
- Compute Region: Cloud (GPU)
- Carbon Emitted: Negligible (Short training run)
## Technical Specifications

### Model Architecture and Objective
The model uses the Phi-3.5-mini architecture (3.8B parameters), optimized for high performance with a smaller parameter count. It was trained with a Causal Language Modeling (CLM) objective tailored for instruction following.