# BabyLlama 🦙
BabyLlama is a QLoRA fine-tune of Meta-Llama-3.1-8B-Instruct specialised for parenting assistance. It provides evidence-based advice on child nutrition, daily routines, and age-appropriate activities, and it generates child-friendly bedtime stories.
## Model Details

| Detail | Value |
|---|---|
| Base model | meta-llama/Meta-Llama-3.1-8B-Instruct |
| Fine-tuning method | QLoRA (4-bit quantisation + LoRA adapters) |
| LoRA rank / alpha | r=32, alpha=64, RSLoRA=True |
| LoRA targets | q, k, v, o, gate, up, down projections (all linear layers) |
| Training examples | 1,000,000 |
| Effective batch size | 512 |
| Training steps | 310 (1 epoch) |
| Learning rate | 4×10⁻⁴ (cosine schedule, 3% warmup) |
| Precision | bfloat16 + TF32 |
| Framework | Unsloth + HuggingFace TRL |
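For reproduction, the hyperparameters in the table above correspond to a peft `LoraConfig` along these lines. This is a hedged sketch, not the authors' actual training script: the dropout and bias settings are assumptions, and the field names assume a recent peft release with `use_rslora` support.

```python
from peft import LoraConfig

# Mirrors the table above: r=32, alpha=64, rank-stabilised LoRA,
# adapters on every linear projection in the attention and MLP blocks.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    use_rslora=True,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumption; not stated in the card
    bias="none",        # assumption; not stated in the card
    task_type="CAUSAL_LM",
)
```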
## What BabyLlama Can Do

- Infant & toddler nutrition – age-specific meal plans, portion sizes, food introduction timelines
- Daily routines – schedules tailored to the child's age and parent's working hours
- Age-appropriate activities – motor skill development, outdoor play, creative activities by age group
- Children's stories – bedtime stories, short tales, educational narratives for young children
- General parenting Q&A – sleep, tantrums, milestones, developmental guidance
## Quick Start

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
import torch

# Load the 4-bit quantised base model with the LoRA adapter applied
model = AutoPeftModelForCausalLM.from_pretrained(
    "buildrestart/babyllama",
    load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("buildrestart/babyllama")

messages = [
    {"role": "system", "content": (
        "You are a helpful, knowledgeable parenting assistant. "
        "Provide safe, evidence-based advice about child care, nutrition, "
        "activities, and daily routines. Always recommend consulting a "
        "pediatrician for medical concerns."
    )},
    {"role": "user", "content": "What solid foods should I introduce to my 8-month-old?"},
]

# Render the conversation with the Llama 3.1 chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=400,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
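The `temperature` and `top_p` settings above control sampling. As a toy illustration of what top-p (nucleus) filtering does conceptually — this is not the transformers implementation, just a pure-Python sketch over a four-token vocabulary:

```python
import math

def top_p_filter(logits, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    # Softmax over the logits
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Walk tokens in descending probability until the mass threshold is met
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return sorted(kept)

# One dominant token plus a long tail: only token 0 survives at top_p=0.9
print(top_p_filter([5.0, 2.0, 1.0, 0.5], top_p=0.9))  # → [0]
```

Lower `top_p` restricts generation to high-probability tokens (more focused prose); higher values admit more of the tail (more varied stories).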
## Using with Unsloth (faster inference)

```python
from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="buildrestart/babyllama",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimised inference path

messages = [
    {"role": "system", "content": "You are a helpful parenting assistant."},
    {"role": "user", "content": "Tell me a bedtime story for a 3-year-old about a bunny."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

output = model.generate(**inputs, max_new_tokens=400, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Training Data
Trained on 1M curated examples from multiple sources covering:
| Category | Examples |
|---|---|
| Children's stories & bedtime stories | ~995,000 |
| Infant/toddler/child nutrition Q&A and meal plans | ~250 |
| Daily routines and schedules | ~24 |
| Age-appropriate physical activities | ~25 |
The dataset emphasises narrative generation (stories) alongside structured parenting advice.
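The skew toward stories is easy to quantify from the approximate counts in the table above:

```python
# Approximate per-category example counts from the table above
counts = {
    "stories": 995_000,
    "nutrition": 250,
    "routines": 24,
    "activities": 25,
}

total = sum(counts.values())
story_share = counts["stories"] / total
print(f"{story_share:.2%} of examples are stories")  # → 99.97% of examples are stories
```

This distribution explains why the model is strongest at narrative generation, as noted under Limitations.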
## Limitations

- Not a substitute for medical advice – always consult a qualified paediatrician for health concerns.
- Story-heavy training distribution – the majority of the training data is children's stories, so the model excels at narrative tasks; nutrition and schedule advice is present but less well represented.
- English only – trained exclusively on English-language data.
- Age range – primarily covers 0–12 years; advice for older children and teens is limited.
- Single epoch – trained for one pass over the 1M examples; may benefit from further fine-tuning on specific parenting topics.
## Intended Use
This model is intended for:
- Parenting app prototypes and demos
- Educational tools for new parents
- Research into domain-specific fine-tuning of LLMs
- Generating child-safe story content
It is not intended for clinical or medical decision-making.
## License
This model inherits the Llama 3.1 Community License. Usage must comply with Meta's acceptable use policy.