π§ Llama-3.2-1B-Instruct β IMB Urologia Fine-Tuned Model
This model is a fine-tuned version of unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit, optimized for Italian medical question answering, with a specific focus on Urologia.
The fine-tuning was performed using a subset of the IMB (Italian Medical Benchmark) dataset, specifically:
- Urologia category only
- ~10,000 training samples
The training was performed using the Unsloth library with LoRA fine-tuning, and the adapter weights were later merged into the base model to provide a standalone checkpoint.
This model relies on data from the IMB dataset. If you use this model in research or applications, you must cite the IMB paper (see Citation section below).
π Training Dataset β IMB (Italian Medical Benchmark)
IMB is an Italian benchmark for medical question answering, designed to evaluate and improve LLM performance in clinical-domain Italian language understanding and reasoning.
The full dataset includes:
- IMB-QA: 782,644 doctor-patient conversations collected from Italian online medical forums
- IMB-MCQA: 25,862 multiple-choice questions derived from Italian medical specialization exams
β οΈ Important: This model was trained only on the Urologia subset (~10,000 samples) of IMB, not on the full dataset.
Dataset repository: π https://github.com/PRAISELab-PicusLab/IMB
π§ͺ Usage Example
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("praiselab-picuslab/Llama-3.2-1B-Instruct-Urologia")
tokenizer = AutoTokenizer.from_pretrained("praiselab-picuslab/Llama-3.2-1B-Instruct-Urologia")
prompt = "[Example question in Italian about Urologia]"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
β οΈ Usage Restrictions
- Allowed use: Non-commercial research only
- Redistribution: Not allowed without explicit authorization
- Mandatory citation: The IMB dataset paper must be cited in any publication or derived work
π Citation
If you use this model, the IMB dataset, or derived outputs in research, please cite:
@inproceedings{DBLP:conf/clic-it/RomanoRBPM25,
author = {Antonio Romano and
Giuseppe Riccio and
Mariano Barone and
Marco Postiglione and
Vincenzo Moscato},
editor = {Cristina Bosco and
Elisabetta Jezek and
Marco Polignano and
Manuela Sanguinetti},
title = {{IMB:} An Italian Medical Benchmark for Question Answering},
booktitle = {Proceedings of the Eleventh Italian Conference on Computational Linguistics
(CLiC-it 2025), Cagliari, Italy, September 24-26, 2025},
series = {{CEUR} Workshop Proceedings},
volume = {4112},
publisher = {CEUR-WS.org},
year = {2025},
url = {https://ceur-ws.org/Vol-4112/92_main_long.pdf}
}
π Training Details
- Base model:
unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit - Fine-tuning method: LoRA (Unsloth)
- Quantization: 4-bit (BitsAndBytes)
- Adapter merging: Yes (Full merged model)
- Language: Italian
- Domain: Medical β Urologia
- Training size: ~10,000 samples
π License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.
π€ Acknowledgements
π¨βπ» This project was developed by Mariano Barone, Roberta Di Marino, Francesco Di Serio, Giovanni Dioguardi, Marco Postiglione, Antonio Romano, Giuseppe Riccio, and Vincenzo Moscato at University of Naples, Federico II
- Downloads last month
- -
Model tree for praiselab-picuslab/Llama-3.2-1B-Instruct-Urologia
Base model
meta-llama/Llama-3.2-1B-Instruct