# QLoRA Adapter for Dutch Definition Expansion (Aya-101)
This repository contains a QLoRA adapter fine-tuned on top of CohereLabs/aya-101 for the task of sense-preserving definitional expansion in Dutch.
This work was developed as part of the Master's thesis, "Transformer-based Expansion of Dutch Dictionary Definitions", submitted for the degree of Master of Science in Artificial Intelligence at KU Leuven.
## About the Thesis
The research investigates the potential of transformer-based models to automate a significant bottleneck in contemporary lexicography: the manual expansion of concise, core-meaning definitions into comprehensive, formally structured dictionary entries. The study focuses on Dutch, a task requiring not only semantic accuracy but also strict adherence to lexicographical style and structure.
The thesis empirically compares two primary methodologies: in-context learning via few-shot prompting and adaptation via parameter-efficient fine-tuning (specifically, QLoRA). This comparison was conducted across a range of powerful multilingual and Dutch-specific models, including mT5-xl, GEITje Ultra, Aya-101, and Aya-23, to determine the most effective strategy for this high-precision domain.
## This Model's Role and Performance
This fine-tuned Aya-101 model represents the fine-tuning approach explored in the thesis. While the few-shot prompted version of Aya-101 achieved the highest quantitative scores, its performance was inconsistent and unreliable, exhibiting a "hit-or-miss" pattern entirely dependent on the quality of its prompt examples. In contrast, the fine-tuned models proved far more robust, consistently learning and generalizing the required lexicographical patterns. Although the fine-tuned Aya-23 model ultimately emerged as the most reliable, this fine-tuned Aya-101 model supports the study's central finding: in high-precision domains, task-specific fine-tuning is essential to instill the discipline required for generating reliable, domain-appropriate output.
## How to Use
To use this adapter, first load the base model (CohereLabs/aya-101) with 4-bit quantization, then apply the adapter on top of it.
```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "CohereLabs/aya-101"
adapter_id = "RobbedoesHF/aya-101-dutch-definition-expansion-qlora"  # The repo ID of this adapter

# Load the base model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSeq2SeqLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Apply the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)
print("Model loaded successfully!")
```
## Prompting Format
This adapter was fine-tuned on a specific instruction format. For best results, your input should match this structure.
```python
# Define the lemma and short definition you want to expand
lemma = "ecoroman"
short_def = "roman over milieuproblematiek"

# Define the prompt components, matching the training script
system_prompt = "Je bent een expert-lexicograaf die definities schrijft voor een Nederlands woordenboek."
instruction = f"Breid de volgende korte definitie voor het woord '{lemma}' uit tot een volledige definitie: '{short_def}'"
prompt = f"{system_prompt}\n\n{instruction}"

# Tokenize the prompt and move it to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the output tokens
print("\nGenerating definition...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=384,  # Chosen based on the longest full definition's token length for this model
        num_beams=4,         # Beam width used for the thesis experiments
        early_stopping=True,
    )

# Decode the tokens into a string
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("\n--- Prompt ---")
print(prompt)
print("\n--- Model Output ---")
print(decoded_output)
```
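If you want to expand many entries, the prompt assembly above can be factored into a small helper. This is a sketch: `build_prompt` is a name introduced here, not part of the training script, but the strings it produces match the format shown above.

```python
def build_prompt(lemma: str, short_def: str) -> str:
    """Assemble the instruction prompt used during fine-tuning."""
    system_prompt = (
        "Je bent een expert-lexicograaf die definities schrijft "
        "voor een Nederlands woordenboek."
    )
    instruction = (
        f"Breid de volgende korte definitie voor het woord '{lemma}' "
        f"uit tot een volledige definitie: '{short_def}'"
    )
    return f"{system_prompt}\n\n{instruction}"
```

With `model` and `tokenizer` loaded as above, you can then loop over `(lemma, short_def)` pairs, tokenize each `build_prompt(...)` result, and call `model.generate` as in the previous snippet.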