Occitan Gemma-3-4B-IT (LoRA Merged)

This repository contains a fine-tuned version of Google's Gemma-3-4B-IT specifically optimized for the Occitan language.

The model was trained using LoRA (Low-Rank Adaptation) on a curated dataset of Occitan text and instructions, then merged back into the base weights for ease of use.
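For readers unfamiliar with what "merged" means here: LoRA trains a pair of low-rank matrices A and B per adapted weight matrix, and merging folds the update back into the frozen base weight as W' = W + (α/r)·B·A, so inference needs no adapter machinery. A minimal NumPy sketch with toy shapes (illustrative only, not the actual Gemma weights or rank):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 8, 2, 16   # toy dimensions; real ranks/alphas vary

W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # LoRA down-projection
B = np.zeros((d_out, r))                 # LoRA up-projection (zero-initialized)
B[0, 0] = 0.5                            # pretend training updated the adapter

# Merging folds the scaled low-rank update into the base weight:
W_merged = W + (alpha / r) * (B @ A)

# A plain forward pass through the merged weight equals the base
# forward pass plus the adapter path.
x = rng.standard_normal(d_in)
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

This is why the merged Safetensors weights in the root directory load like any ordinary model: the adapter has been absorbed into the dense matrices.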

πŸ“ Repository Structure

This repo is a one-stop shop for several use cases:

  • Root Directory: Full merged Safetensors weights (compatible with transformers, accelerate, etc.).
  • /gguf Folder: Quantized versions (Q4_K_M, Q5_K_M, Q8_0, etc.) for local inference via LM Studio, Ollama, or llama.cpp.
  • /adapter Folder: The raw LoRA adapter files for researchers who wish to inspect the weights or perform their own merges.
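The GGUF files trade precision for file size via quantization. The sketch below shows the basic idea with simple symmetric block-wise 4-bit absmax quantization; this is illustrative only and is not llama.cpp's actual Q4_K_M scheme, which uses a more elaborate K-quant layout:

```python
import numpy as np

def quantize_q4_blockwise(w, block=32):
    """Illustrative symmetric 4-bit block quantization (not the real Q4_K_M)."""
    w = w.reshape(-1, block)
    # One scale per block, mapping the block's max magnitude to the int4 range -7..7.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return (q * scale).astype(np.float32)

w = np.random.default_rng(1).standard_normal(64).astype(np.float32)
q, s = quantize_q4_blockwise(w)
w_hat = dequantize(q, s).reshape(-1)

# 4 bits per weight plus one scale per 32-weight block instead of 16 bits
# per weight, at the cost of a small per-block rounding error.
print(float(np.abs(w - w_hat).max()))
```

Higher-bit variants like Q5_K_M and Q8_0 shrink that rounding error at the cost of larger files, which is the trade-off the /gguf folder lets you choose.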

🚀 How to Use

Using Transformers (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "julienp79/occitan-gemma-3-4b-it-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    # "Can you help me write a short text in Occitan?"
    {"role": "user", "content": "Pòdes m'ajudar a escriure un pichon tèxt en occitan?"},
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
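Under the hood, apply_chat_template wraps the conversation in Gemma's turn markers. The function below is a rough sketch of that rendered format for Gemma-family models, written for illustration; the tokenizer's own chat_template attribute is the authoritative version:

```python
def render_gemma_prompt(messages):
    """Sketch of the Gemma-style turn format (illustrative; inspect
    tokenizer.chat_template for the exact template)."""
    parts = ["<bos>"]
    for m in messages:
        # Gemma uses the role name "model" for assistant turns.
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # generation prompt
    return "".join(parts)

prompt = render_gemma_prompt(
    [{"role": "user", "content": "Pòdes m'ajudar a escriure un pichon tèxt en occitan?"}]
)
print(prompt)
```

Getting these markers wrong is a common cause of degraded instruction-following, which is why the transformers example above relies on apply_chat_template rather than hand-built prompt strings.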