# Occitan Gemma-3-4B-IT (LoRA Merged)
This repository contains a fine-tuned version of Google's Gemma-3-4B-IT specifically optimized for the Occitan language.
The model was trained using LoRA (Low-Rank Adaptation) on a curated dataset of Occitan text and instructions, then merged back into the base weights for ease of use.
## Repository Structure
This repo is a "one-stop-shop" for different use cases:
- **Root directory**: Full merged Safetensors weights (compatible with `transformers`, `accelerate`, etc.).
- **`/gguf` folder**: Quantized versions (Q4_K_M, Q5_K_M, Q8_0, etc.) for local inference via LM Studio, Ollama, or llama.cpp.
- **`/adapter` folder**: The raw LoRA adapter files, for researchers who wish to inspect the weights or perform their own merges.
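For local inference with the quantized weights, you can fetch a single GGUF file and run it directly with llama.cpp. The following is a sketch: the exact GGUF filename is an assumption, so check the repo's file listing before downloading.

```shell
# Download one quantized file from the /gguf folder.
# NOTE: the filename below is illustrative — use the actual name shown in the repo.
huggingface-cli download julienp79/occitan-gemma-3-4b-it-lora \
  gguf/occitan-gemma-3-4b-it-Q4_K_M.gguf --local-dir .

# Chat with the model using llama.cpp's CLI in conversation mode.
llama-cli -m gguf/occitan-gemma-3-4b-it-Q4_K_M.gguf -cnv
```

Q4_K_M is a reasonable default trade-off between size and quality on consumer hardware; pick Q8_0 if you have the memory and want output closer to the full-precision weights.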
## How to Use

### Using Transformers (Python)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "julienp79/occitan-gemma-3-4b-it-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Occitan: "Can you help me write a short text in Occitan?"
messages = [
    {"role": "user", "content": "Pòdes m'ajudar a escriure un pichon tèxt en occitan?"},
]

# Format the conversation with Gemma's chat template and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
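If you prefer to work from the raw LoRA adapter in the `/adapter` folder rather than the merged weights, you can apply it to the base model with `peft`. This is a sketch: it assumes the adapter subfolder is named `adapter` as described above and that you have access to the base `google/gemma-3-4b-it` checkpoint.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the original base model first.
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it", device_map="auto")

# Apply the raw LoRA adapter from this repo's /adapter folder on top of it.
model = PeftModel.from_pretrained(
    base,
    "julienp79/occitan-gemma-3-4b-it-lora",
    subfolder="adapter",
)

# Optionally fold the adapter back into the base weights, reproducing the
# merged checkpoint distributed in the root of this repo.
model = model.merge_and_unload()
```

This route is mainly useful for inspecting the low-rank deltas, re-merging at a different scale, or stacking the adapter with others; for plain inference, the pre-merged weights in the root directory are simpler.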