# Distil-Gemma3-270M-SHELLper

A fine-tuned Gemma 3 270M model for multi-turn bash function calling. Distilled from a GPT OSS 120B teacher, this 270M-parameter model achieves 96% tool-call accuracy on our multi-turn test set while remaining small enough to run locally on any machine.

## Results

| Metric | GPT OSS 120B (Teacher) | Gemma 3 270M (Base) | This Model |
|---|---|---|---|
| Tool-call accuracy | 97.03% | 9.90% | 96.04% |
| ROUGE-L | 94.42% | 49.73% | 96.55% |

## Quick Start

Follow the instructions in our demo repository: github

### Using Ollama

```shell
# Download the GGUF weights and Modelfile, then create the Ollama model
huggingface-cli download distil-labs/distil-gemma3-270m-SHELLper model.gguf Modelfile --local-dir distil_model
cd distil_model && ollama create distil_model -f Modelfile
cd ..
```
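Once the model is created, you can talk to it through Ollama's local REST API. A minimal sketch, assuming Ollama is serving on its default port 11434 and the model was created as `distil_model` above; the helper name `build_chat_payload` is our own, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_payload(model, user_message):
    """Build a non-streaming chat request body for Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def chat(model, user_message):
    """POST a chat request and return the assistant's reply text."""
    payload = build_chat_payload(model, user_message)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires a running Ollama server):
# print(chat("distil_model", "List all files in the current directory"))
```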

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-gemma3-270m-SHELLper")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-gemma3-270m-SHELLper")

tools = [
    {
        "type": "function",
        "function": {
            "name": "ls",
            "description": "List directory contents",
            "parameters": {
                "type": "object",
                "properties": {
                    "folder": {"type": "string", "description": "Path to the folder to list"}
                },
                "required": ["folder"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant that executes bash commands."},
    {"role": "user", "content": "List all files in the current directory"}
]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
# Greedy decoding; `generate` rejects temperature=0 when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
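The decoded text includes the chat-template wrapping around the tool call. A minimal sketch of extracting the call, assuming the model emits a single `{"name": ..., "arguments": {...}}` JSON object as described under Output Format; the scanning helper `parse_tool_call` is our own, not part of the model's tooling:

```python
import json

def parse_tool_call(generated_text):
    """Return the first balanced JSON object that looks like a tool call.

    Assumes a single {"name": ..., "arguments": {...}} object, possibly
    surrounded by chat-template markup; returns None if nothing matches.
    """
    start = generated_text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(generated_text)):
            if generated_text[i] == "{":
                depth += 1
            elif generated_text[i] == "}":
                depth -= 1
                if depth == 0:
                    # First balanced {...} span from this start position
                    try:
                        call = json.loads(generated_text[start : i + 1])
                        if isinstance(call, dict) and "name" in call and "arguments" in call:
                            return call
                    except json.JSONDecodeError:
                        pass
                    break
        start = generated_text.find("{", start + 1)
    return None
```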

## Model Details

| Property | Value |
|---|---|
| Base Model | google/gemma-3-270m-it |
| Parameters | 270 million |
| Architecture | Gemma3ForCausalLM |
| Context Length | 32,768 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples (expanded from 20 seeds) |
| Teacher Model | GPT OSS 120B |

## Training

This model was trained using the Distil Labs platform:

- **Seed Data:** 20 hand-validated multi-turn bash command conversations
- **Synthetic Generation:** expanded to ~10,000 examples using the GPT OSS 120B teacher with conversation-turn expansion
- **Fine-tuning:** 4 epochs with LoRA (r=128) on the synthetic dataset
- **Evaluation:** multi-turn accuracy testing across variable conversation lengths

### Training Hyperparameters

- Epochs: 4
- Learning rate: 2e-5 (linear schedule)
- Batch size: 1 (with gradient accumulation)
- LoRA rank: 128
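For intuition, the linear schedule above decays the learning rate from its peak to zero over training. A stdlib-only sketch; `HPARAMS` and `linear_lr` are our own illustrative names, not platform configuration keys:

```python
# Hyperparameters as listed above (illustrative names only)
HPARAMS = {"epochs": 4, "peak_lr": 2e-5, "batch_size": 1, "lora_rank": 128}

def linear_lr(step, total_steps, peak_lr=HPARAMS["peak_lr"]):
    """Linear decay: peak_lr at step 0, reaching 0 at total_steps."""
    return peak_lr * max(0.0, 1.0 - step / total_steps)
```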

## Task Format

### Input Format

A multi-turn conversation with tool definitions:

```json
[
  {"role": "user", "content": "List all files in the current directory"},
  {"role": "assistant", "tool_calls": [{"function": {"name": "ls", "arguments": {"folder": "."}}}]},
  {"role": "user", "content": "Now go to the src folder"}
]
```

### Output Format

A single tool call in JSON format:

```json
{"name": "cd", "arguments": {"folder": "src"}}
```
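To actually run a parsed call, it still has to be mapped onto a command line. A purely illustrative sketch, assuming `SUPPORTED` mirrors the Supported Tools table and that each argument value maps positionally onto the argv list (real commands would need per-flag handling); `to_argv` is a hypothetical helper, not part of the model:

```python
# Whitelist mirroring the Supported Tools table below
SUPPORTED = {"cat", "cd", "cp", "diff", "du", "echo", "find", "grep", "head",
             "ls", "mkdir", "mv", "pwd", "rm", "rmdir", "sort", "tail",
             "touch", "wc"}

def to_argv(call):
    """Map a parsed tool call to an argv list, rejecting unknown commands.

    Argument values are appended positionally; this is an assumption for
    illustration, not the model's documented argument layout.
    """
    name = call["name"]
    if name not in SUPPORTED:
        raise ValueError(f"unsupported command: {name}")
    return [name] + [str(v) for v in call["arguments"].values()]
```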

## Supported Tools

The model supports the following bash commands:

| Command | Description |
|---|---|
| cat | Display file contents |
| cd | Change directory |
| cp | Copy files or directories |
| diff | Compare files |
| du | Estimate file space usage |
| echo | Display text |
| find | Search for files |
| grep | Search file contents |
| head | Output first part of files |
| ls | List directory contents |
| mkdir | Create directories |
| mv | Move or rename files |
| pwd | Print working directory |
| rm | Remove files |
| rmdir | Remove empty directories |
| sort | Sort file contents |
| tail | Output last part of files |
| touch | Create empty files |
| wc | Word, line, and character count |

## Use Cases

- Natural language interfaces to file systems
- Command-line assistants and automation
- Developer productivity tools
- Educational tools for learning bash
- Local, privacy-preserving AI assistants

## Limitations

- Optimized for a single tool call per turn
- No support for pipes or combined commands
- Works best with up to 5 conversation turns
- Trained on English requests only
- Limited to the supported bash commands listed above

## License

This model is released under the Apache 2.0 license.

## Citation

```bibtex
@misc{distil-gemma3-270m-shellper,
  author = {Distil Labs},
  title = {Distil-Gemma3-270M-SHELLper: A Fine-tuned Model for Multi-turn Bash Function Calling},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/distil-labs/distil-gemma3-270m-SHELLper}
}
```