Distil Expenses
A small language model (SLM) fine-tuned by Distil Labs to answer queries about personal finances through tool calls to pandas. It is optimized to run locally via Ollama and has strong tool-calling accuracy.
*********** GITHUB DEMO AND CODE ***********
Given a CSV file with personal expenses, the model answers queries about sums, counts, averages, or comparisons of time periods.
date,provider_name,amount,category
2024-01-05,Whole Foods,-145.32,shopping
2024-01-10,Netflix,-15.99,entertainment
2024-01-18,Shell Gas Station,-52.40,transportation
...
Example query and answer:
Count all my shopping under $100 in the first half of 2024
ANSWER: From 2024-01-01 to 2024-06-30 you spent 6 times under 100 on shopping.
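As a rough illustration of how such a query can resolve to a pandas operation, here is a minimal sketch. The tool names (`count_expenses`, `sum_expenses`), the argument schema, and the last two CSV rows are assumptions made for this example; the actual tool definitions live in the project repository and may differ.

```python
import io
import json

import pandas as pd

# First three rows come from the sample above; the last two are
# hypothetical rows added only so the filter has something to match.
CSV_TEXT = """date,provider_name,amount,category
2024-01-05,Whole Foods,-145.32,shopping
2024-01-10,Netflix,-15.99,entertainment
2024-01-18,Shell Gas Station,-52.40,transportation
2024-02-02,Corner Store,-23.10,shopping
2024-07-14,Book Shop,-40.00,shopping
"""

# Hypothetical tool call, shaped like what a tool-calling model might emit.
TOOL_CALL = json.dumps({
    "name": "count_expenses",
    "arguments": {
        "category": "shopping",
        "start": "2024-01-01",
        "end": "2024-06-30",
        "max_amount": 100,
    },
})


def select_expenses(df, category, start, end, max_amount=None):
    """Return positive spend amounts for a category within a date range."""
    spent = -df["amount"]  # expenses are stored as negative amounts
    mask = (
        (df["category"] == category)
        & df["date"].between(pd.Timestamp(start), pd.Timestamp(end))
        & (spent > 0)
    )
    if max_amount is not None:
        mask &= (spent < max_amount)
    return spent[mask]


# Map tool names to pandas-backed implementations.
TOOLS = {
    "count_expenses": lambda df, **kw: int(select_expenses(df, **kw).count()),
    "sum_expenses": lambda df, **kw: float(select_expenses(df, **kw).sum()),
}


def run_tool_call(df, raw_call):
    """Parse a JSON tool call and dispatch it to the matching function."""
    call = json.loads(raw_call)
    return TOOLS[call["name"]](df, **call["arguments"])


df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])
result = run_tool_call(df, TOOL_CALL)
print(f"Matching shopping expenses: {result}")
```

In this sketch the model only has to produce the structured call; the filtering and arithmetic are done deterministically by pandas.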
The tuned models were trained using knowledge distillation, with GPT-OSS 120B as the teacher model. We used 24 training examples and complemented them with 2500 synthetic examples.
We evaluated the models on 25 test examples. The tuned 1B model matches the teacher's accuracy (22 of 25 correct), and the tuned 3B model is one example behind (21 of 25).
| Model | Correct (25) | Tool call accuracy |
|---|---|---|
| GPT-OSS 120B (teacher) | 22 | 0.88 |
| Llama3.2 3B (tuned) | 21 | 0.84 |
| Llama3.2 1B (tuned) | 22 | 0.88 |
| Llama3.2 3B (base) | 6 | 0.24 |
| Llama3.2 1B (base) | 0 | 0.00 |
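The accuracy column is consistent with the number of correct answers divided by the 25 test examples. A quick check, assuming that is how the column was computed:

```python
TEST_SET_SIZE = 25

# Correct counts copied from the table above.
correct = {
    "GPT-OSS": 22,
    "Llama3.2 3B (tuned)": 21,
    "Llama3.2 1B (tuned)": 22,
    "Llama3.2 3B (base)": 6,
    "Llama3.2 1B (base)": 0,
}

# Assumed definition: accuracy = correct answers / test set size.
accuracy = {model: round(n / TEST_SET_SIZE, 2) for model, n in correct.items()}

for model, acc in accuracy.items():
    print(f"{model}: {acc:.2f}")
```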
Follow the instructions in the GitHub repository.
Base model: meta-llama/Llama-3.2-3B-Instruct