# Distil-Gemma3-270M-SHELLper

A fine-tuned Gemma 3 270M model for multi-turn bash function calling. Distilled from a GPT OSS 120B teacher, this 270M-parameter model achieves 96% tool-call accuracy on our multi-turn test set while remaining small enough to run locally on any machine.

## Results

| Metric | GPT OSS 120B (Teacher) | Gemma 3 270M (Base) | This Model |
|---|---|---|---|
| Tool-call accuracy | 97.03% | 9.90% | 96.04% |
| ROUGE-L | 94.42% | 49.73% | 96.55% |

## Quick Start

Follow the instructions in our demo repository: github

### Using Ollama

```shell
# Download the GGUF weights and Modelfile, then create the Ollama model
huggingface-cli download distil-labs/distil-gemma3-270m-SHELLper model.gguf Modelfile --local-dir distil_model
cd distil_model && ollama create distil_model -f Modelfile
cd ..
```
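Once the model is created, you can talk to it through Ollama's local REST API. A minimal sketch, assuming Ollama is serving on its default port 11434 and the model was created as `distil_model` above; the helper name `build_chat_payload` is our own, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_payload(model, user_message):
    """Build a non-streaming chat request body for Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def chat(model, user_message):
    """POST a chat request and return the assistant's reply text."""
    payload = build_chat_payload(model, user_message)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires a running Ollama server):
# print(chat("distil_model", "List all files in the current directory"))
```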

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-gemma3-270m-SHELLper")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-gemma3-270m-SHELLper")

tools = [
    {
        "type": "function",
        "function": {
            "name": "ls",
            "description": "List directory contents",
            "parameters": {
                "type": "object",
                "properties": {
                    "folder": {"type": "string", "description": "Path to the folder to list"}
                },
                "required": ["folder"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
    {"role": "user", "content": "List all files in the current directory"}
]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
# Greedy decoding; `generate` rejects temperature=0 when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
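The decoded text includes the chat-template wrapping around the tool call. A minimal sketch of extracting the call, assuming the model emits a single `{"name": ..., "arguments": {...}}` JSON object as described under Output Format; the scanning helper `parse_tool_call` is our own, not part of the model's tooling:

```python
import json

def parse_tool_call(generated_text):
    """Return the first balanced JSON object that looks like a tool call.

    Assumes a single {"name": ..., "arguments": {...}} object, possibly
    surrounded by chat-template markup; returns None if nothing matches.
    """
    start = generated_text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(generated_text)):
            if generated_text[i] == "{":
                depth += 1
            elif generated_text[i] == "}":
                depth -= 1
                if depth == 0:
                    # First balanced {...} span from this start position
                    try:
                        call = json.loads(generated_text[start : i + 1])
                        if isinstance(call, dict) and "name" in call and "arguments" in call:
                            return call
                    except json.JSONDecodeError:
                        pass
                    break
        start = generated_text.find("{", start + 1)
    return None
```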

## Model Details

| Property | Value |
|---|---|
| Base Model | google/gemma-3-270m-it |
| Parameters | 270 million |
| Architecture | Gemma3ForCausalLM |
| Context Length | 32,768 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples (expanded from 20 seeds) |
| Teacher Model | GPT OSS 120B |

## Training

This model was trained using the Distil Labs platform:

- **Seed Data:** 20 hand-validated multi-turn bash command conversations
- **Synthetic Generation:** expanded to ~10,000 examples using the GPT OSS 120B teacher with conversation-turn expansion
- **Fine-tuning:** 4 epochs with LoRA (r=128) on the synthetic dataset
- **Evaluation:** multi-turn accuracy testing across variable conversation lengths

### Training Hyperparameters

- Epochs: 4
- Learning rate: 2e-5 (linear schedule)
- Batch size: 1 (with gradient accumulation)
- LoRA rank: 128
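For intuition, the linear schedule above decays the learning rate from its peak to zero over training. A stdlib-only sketch; `HPARAMS` and `linear_lr` are our own illustrative names, not platform configuration keys:

```python
# Hyperparameters as listed above (illustrative names only)
HPARAMS = {"epochs": 4, "peak_lr": 2e-5, "batch_size": 1, "lora_rank": 128}

def linear_lr(step, total_steps, peak_lr=HPARAMS["peak_lr"]):
    """Linear decay: peak_lr at step 0, reaching 0 at total_steps."""
    return peak_lr * max(0.0, 1.0 - step / total_steps)
```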

## Task Format

### Input Format

A multi-turn conversation with tool definitions:

```json
[
  {"role": "user", "content": "List all files in the current directory"},
  {"role": "assistant", "tool_calls": [{"function": {"name": "ls", "arguments": {"folder": "."}}}]},
  {"role": "user", "content": "Now go to the src folder"}
]
```

### Output Format

A single tool call in JSON format:

```json
{"name": "cd", "arguments": {"folder": "src"}}
```
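To actually run a parsed call, it still has to be mapped onto a command line. A purely illustrative sketch, assuming `SUPPORTED` mirrors the Supported Tools table and that each argument value maps positionally onto the argv list (real commands would need per-flag handling); `to_argv` is a hypothetical helper, not part of the model:

```python
# Whitelist mirroring the Supported Tools table below
SUPPORTED = {"cat", "cd", "cp", "diff", "du", "echo", "find", "grep", "head",
             "ls", "mkdir", "mv", "pwd", "rm", "rmdir", "sort", "tail",
             "touch", "wc"}

def to_argv(call):
    """Map a parsed tool call to an argv list, rejecting unknown commands.

    Argument values are appended positionally; this is an assumption for
    illustration, not the model's documented argument layout.
    """
    name = call["name"]
    if name not in SUPPORTED:
        raise ValueError(f"unsupported command: {name}")
    return [name] + [str(v) for v in call["arguments"].values()]
```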

## Supported Tools

The model supports the following bash commands:

| Command | Description |
|---|---|
| cat | Display file contents |
| cd | Change directory |
| cp | Copy files or directories |
| diff | Compare files |
| du | Estimate file space usage |
| echo | Display text |
| find | Search for files |
| grep | Search file contents |
| head | Output first part of files |
| ls | List directory contents |
| mkdir | Create directories |
| mv | Move or rename files |
| pwd | Print working directory |
| rm | Remove files |
| rmdir | Remove empty directories |
| sort | Sort file contents |
| tail | Output last part of files |
| touch | Create empty files |
| wc | Word, line, and character count |

## Use Cases

- Natural language interfaces to file systems
- Command-line assistants and automation
- Developer productivity tools
- Educational tools for learning bash
- Local, privacy-preserving AI assistants

## Limitations

- Optimized for a single tool call per turn
- No support for pipes or combined commands
- Works best with up to 5 conversation turns
- Trained on English requests only
- Limited to the supported bash commands listed above

## License

This model is released under the Apache 2.0 license.

## Citation

```bibtex
@misc{distil-gemma3-270m-shellper,
  author = {Distil Labs},
  title = {Distil-Gemma3-270M-SHELLper: A Fine-tuned Model for Multi-turn Bash Function Calling},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/distil-labs/distil-gemma3-270m-SHELLper}
}
```