birgermoell/gemma-2-2b-it-4bit-eir-swedish

This is a public MLX-fused Swedish fine-tune of mlx-community/gemma-2-2b-it-4bit for the Eir app.

The model was fine-tuned to help users make sense of Swedish clinical notes in .eir-style records. The intended behavior is:

  • explain clinical notes in plain Swedish
  • surface the most relevant findings without overclaiming
  • suggest practical next steps or useful follow-up questions
  • cite source entries using <JOURNAL_ENTRY id="..."/> where possible
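As an illustration of the citation convention above, a client can pull the cited entry ids out of a model answer with a simple regular expression. The answer text and entry ids below are invented for this sketch:

```python
import re

# Hypothetical model answer following the <JOURNAL_ENTRY id="..."/> convention;
# the Swedish text and the entry ids are invented for this example.
answer = (
    "Blodtrycket var förhöjt vid senaste besöket "
    '(<JOURNAL_ENTRY id="2024-03-01-bt"/>) och läkaren planerade uppföljning '
    'om tre månader (<JOURNAL_ENTRY id="2024-03-01-plan"/>).'
)

# Extract the cited entry ids so an app can link back to the source notes.
CITATION_RE = re.compile(r'<JOURNAL_ENTRY id="([^"]+)"/>')
cited_ids = CITATION_RE.findall(answer)
print(cited_ids)  # → ['2024-03-01-bt', '2024-03-01-plan']
```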

Training summary

  • Base model: mlx-community/gemma-2-2b-it-4bit
  • Fine-tuning method: LoRA
  • Runtime: mlx-lm
  • Training data: synthetic but realistic Swedish primary-care .eir timelines and supervised answers
  • Current synthetic dataset size: 31 train / 5 valid / 6 test examples

This repository contains the fused model weights, so it can be loaded directly by MLX clients without a separate adapter step.
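The LoRA training and fusing workflow typically looks like the following with mlx-lm. The paths, iteration count, and data layout here are illustrative assumptions, not the exact settings used for this model:

```shell
# Train LoRA adapters on the synthetic .eir data; the data/ directory is
# assumed to contain train.jsonl and valid.jsonl in mlx-lm's chat format.
python -m mlx_lm.lora \
  --model mlx-community/gemma-2-2b-it-4bit \
  --train \
  --data data \
  --iters 600

# Fuse the trained adapters back into the base weights so MLX clients
# can load the model without a separate adapter step.
python -m mlx_lm.fuse \
  --model mlx-community/gemma-2-2b-it-4bit \
  --adapter-path adapters \
  --save-path gemma-2-2b-it-4bit-eir-swedish
```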

Evaluation snapshot

Held-out synthetic test split:

  • Base model: token_f1=0.2351, citation_f1=0.0000
  • Fine-tuned model: token_f1=0.2874, citation_f1=0.3333

These numbers come from a small, fully synthetic test set, but they were sufficient to confirm that the fine-tune learned the task-specific citation and note-explanation behavior.
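The exact metric definitions used for the snapshot above are not published here; a plausible set-based reading of token_f1 and citation_f1 can be sketched as follows (the tokenization and citation pattern are assumptions):

```python
import re

def set_f1(pred: set, gold: set) -> float:
    """F1 over two sets; 0.0 when either side is empty or there is no overlap."""
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def token_f1(pred: str, gold: str) -> float:
    # Assumes lowercased whitespace tokenization.
    return set_f1(set(pred.lower().split()), set(gold.lower().split()))

def citation_f1(pred: str, gold: str) -> float:
    # Compares the sets of cited entry ids in the two answers.
    pattern = r'<JOURNAL_ENTRY id="([^"]+)"/>'
    return set_f1(set(re.findall(pattern, pred)), set(re.findall(pattern, gold)))
```

Under this reading, an answer sharing two of three unique tokens with the reference scores token_f1 ≈ 0.67.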

Use with MLX

from mlx_lm import load, generate

# Download (if needed) and load the fused model from the Hugging Face Hub.
model, tokenizer = load("birgermoell/gemma-2-2b-it-4bit-eir-swedish")

messages = [
    {
        "role": "user",
        # "Explain the journal entry in plain Swedish and say what seems
        # most important to follow up."
        "content": "Förklara journalanteckningen på enkel svenska och säg vad som verkar viktigast att följa upp."
    }
]

# Format the chat turns with the model's chat template before generating.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=False)
print(response)

Use in the Eir app

In Eir Settings:

  1. Open Settings
  2. Go to the local models section
  3. Choose Add Model...
  4. Enter birgermoell/gemma-2-2b-it-4bit-eir-swedish

The app can then download and load this model as an on-device MLX model.

Limitations

  • This is an early research model, not a medical device
  • Training data is currently synthetic, not real de-identified patient data
  • It should be used for explanation and support, not diagnosis
  • Outputs still need stronger factuality and safety evaluation before real deployment
Model details

  • Model size: 0.4B params
  • Tensor types: F16, U32
  • Format: MLX, 4-bit quantized safetensors
