---
license: apache-2.0
datasets:
- roneneldan/TinyStories
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- small
- tiny
- story
- tinystories
- roneneldan
- cpu
- free
- open-source
---

# 📖 StorySupra 10M

## Config
- Parameters: 12,587,264 (~12.6M)
- Hidden Size: 256
- Intermediate Size: 1024
- Hidden Layers: 8
- Attention Heads: 8
- Max Position Embeddings: 256
- Vocab Size: 8192
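
The parameter count can be reproduced from the values above. The sketch below assumes a standard Llama layout (the usage script loads the model with `LlamaForCausalLM`): no bias terms, full multi-head attention, RMSNorm weights, rotary position encoding, and an output head that is not tied to the input embeddings. The card does not state these details; they are the assumptions under which the arithmetic matches the reported total. The function name is illustrative.

```python
def llama_param_count(hidden, intermediate, layers, vocab, tied_embeddings=False):
    """Count parameters of a bias-free Llama-style decoder (assumed layout)."""
    attention = 4 * hidden * hidden      # q, k, v and o projections
    mlp = 3 * hidden * intermediate      # gate, up and down projections
    norms = 2 * hidden                   # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = vocab * hidden          # input token embeddings
    lm_head = 0 if tied_embeddings else vocab * hidden  # output projection
    final_norm = hidden                  # RMSNorm before the output head

    # Rotary position encoding adds no learned parameters, so the maximum
    # position count does not appear here. Under full multi-head attention
    # the number of heads does not change the count either.
    return layers * per_layer + embeddings + lm_head + final_norm


total = llama_param_count(hidden=256, intermediate=1024, layers=8, vocab=8192)
print(f"{total:,}")  # 12,587,264
```

Under the same assumptions, the count without the separate output head is 10,490,112, which may be where the "10M" in the model name comes from.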

## Samples
Once upon a time , a small bird was flying in the sky . It saw a big tree and wanted to rest under it . But the tree was too high for the bird to reach . The bird tried to fly up , but it could not . Then , a wise old owl flew by and saw the bird struggling . The owl said , " Don ' t worry little bird , I can help you ." The owl used its strong beak to climb the tree and get the bird down . The bird was
<br><br>
Once upon a time , there was a little boy named Timmy . He loved to play with his toys and run around outside . One day , he found a shiny penny on the ground . It was so pretty that he picked it up and showed it to his mom . " Look , Mommy ! I found a penny !" he said . His mom smiled and said , " That ' s great , Timmy . But be careful , it ' s very special ." Timmy didn ' t understand what " valuable " meant , but he knew it meant something important . So
<br><br>
Once upon a time , there was a lovely princess . She had long , blonde hair and a sparkly crown . One day , she wanted to go for a walk in the forest . She put on her dress and started walking . As she walked , she saw something strange . It was a big , scary bear ! The princess was scared , but she didn ' t want to get away . So she just kept walking until she reached the forest . When she got there , she saw a little rabbit . He was wearing a bright red bow and he looked very friendly .

## Training
- GPU: single RTX 5060 Ti 16GB
- Time: ~20 minutes
- Epochs: 3
- Training samples: 200k
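
For a rough sense of scale, the figures above imply the throughput computed below. This is back-of-the-envelope arithmetic, not a measured benchmark. It assumes that the ~20 minutes covers all three epochs and that each sample is capped at 256 tokens (the model's maximum position count); the card states neither.

```python
samples = 200_000     # training samples (from the card)
epochs = 3            # from the card
minutes = 20          # approximate wall-clock time (from the card)
max_tokens = 256      # assumed per-sample cap, taken from Max Position Embeddings

sample_passes = samples * epochs                      # samples seen in total
samples_per_second = sample_passes / (minutes * 60)   # average rate
token_upper_bound = sample_passes * max_tokens        # at most this many tokens

print(sample_passes)        # 600000
print(samples_per_second)   # 500.0
print(token_upper_bound)    # 153600000
```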

## Dataset
A 200k-sample subset of `roneneldan/TinyStories`.
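
A subset of this size can be selected with the `datasets` library's split-slicing syntax. The sketch below takes the first 200,000 rows of the `train` split, which is an assumption: the card does not say which samples were used or how they were chosen. The helper names `subset_split` and `load_subset` are illustrative and are not taken from `train.py`.

```python
def subset_split(n_samples: int, split: str = "train") -> str:
    """Build a split string that selects the first n_samples rows."""
    return f"{split}[:{n_samples}]"


def load_subset(n_samples: int = 200_000):
    """Load the first n_samples TinyStories rows (assumed selection)."""
    # Imported lazily so subset_split works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("roneneldan/TinyStories", split=subset_split(n_samples))


print(subset_split(200_000))  # train[:200000]
```

Calling `load_subset()` downloads the dataset on first use and returns a `Dataset` object whose length can be checked with `len(...)`.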

## Code
The full training and inference code is in this repo as `train.py` and `inference.py`. Have fun :-)

## Usage
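Run the model with the script below. It downloads the weights from the Hugging Face Hub and starts an interactive prompt loop: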
```python
"""
StorySupra-10M — Interactive Story Generator
Loads model weights directly from HuggingFace: SupraLabs/StorySupra-10M
"""

import torch
from transformers import LlamaForCausalLM, PreTrainedTokenizerFast

# ──────────────────────────────────────────────
# Configuration
# ──────────────────────────────────────────────
MODEL_ID = "SupraLabs/StorySupra-10M"

GENERATION_DEFAULTS = {
    "max_new_tokens": 100,
    "temperature": 0.55,
    "top_k": 25,
    "top_p": 0.85,
    "repetition_penalty": 1.1,
    "do_sample": True,
}

EXIT_COMMANDS = {"exit", "quit", "leave"}

# ──────────────────────────────────────────────
# Model loading
# ──────────────────────────────────────────────

def load_model(model_id: str):
    """Download and return the tokenizer and model from HuggingFace Hub."""
    print(f"Downloading model from HuggingFace: {model_id}")
    print("(This may take a moment on first run — weights will be cached locally.)\n")

    tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(model_id)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}\n")

    model.to(device)
    model.eval()

    return tokenizer, model, device


# ──────────────────────────────────────────────
# Text generation
# ──────────────────────────────────────────────

def generate_text(
    prompt: str,
    tokenizer,
    model,
    device: str,
    max_new_tokens: int = GENERATION_DEFAULTS["max_new_tokens"],
    temperature: float = GENERATION_DEFAULTS["temperature"],
    top_k: int = GENERATION_DEFAULTS["top_k"],
    top_p: float = GENERATION_DEFAULTS["top_p"],
    repetition_penalty: float = GENERATION_DEFAULTS["repetition_penalty"],
) -> str:
    """Generate a story continuation from the given prompt."""
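    # Skip token_type_ids: Llama models do not accept them, and generate()
    # rejects unused inputs if the tokenizer returns them by default.
    inputs = tokenizer(
        prompt, return_tensors="pt", return_token_type_ids=False
    ).to(device)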

    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            repetition_penalty=repetition_penalty,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )

    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)


# ──────────────────────────────────────────────
# Interactive loop
# ──────────────────────────────────────────────

def run():
    print("=" * 50)
    print("  StorySupra-10M — Interactive Story Generator")
    print("=" * 50)

    tokenizer, model, device = load_model(MODEL_ID)

    print("-" * 50)
    print("Model ready! Type a prompt to generate a story.")
    # Sort the set so the commands print in a stable order on every run.
    print(f"Type {' / '.join(sorted(EXIT_COMMANDS))} to quit.")
    print("-" * 50)

    while True:
        try:
            user_prompt = input("\nYour prompt: ").strip()
        except (EOFError, KeyboardInterrupt):
            print("\nExiting. Goodbye!")
            break

        if not user_prompt:
            print("Please enter a prompt.")
            continue

        if user_prompt.lower() in EXIT_COMMANDS:
            print("Goodbye!")
            break

        print("\nGenerating...\n")
        story = generate_text(user_prompt, tokenizer, model, device)

        print("Generated story:")
        print("-" * 20)
        print(story)
        print("-" * 20)


# ──────────────────────────────────────────────
# Entry point
# ──────────────────────────────────────────────

if __name__ == "__main__":
    run()
```
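
The script above does not check that the prompt plus `max_new_tokens` fits within the 256 positions listed under Config. Output quality is likely to degrade past the length the model was trained on, so it can help to budget tokens before generating. The helper below is a minimal sketch of one way to do that. `fit_to_context` is a hypothetical name and is not part of `inference.py`.

```python
MAX_POSITIONS = 256  # "Max Position Embeddings" from the Config section


def fit_to_context(prompt_len, requested_new_tokens, max_positions=MAX_POSITIONS):
    """Return (prompt_tokens_to_keep, new_tokens) whose sum fits the context.

    The requested generation length is honoured first, capped so that at
    least one position remains for the prompt. The prompt then gets whatever
    room is left.
    """
    new_tokens = min(requested_new_tokens, max_positions - 1)
    keep = min(prompt_len, max_positions - new_tokens)
    return keep, new_tokens


print(fit_to_context(10, 100))    # (10, 100): already fits
print(fit_to_context(200, 100))   # (156, 100): prompt must be shortened
print(fit_to_context(50, 400))    # (1, 255): generation length is capped
```

Inside `generate_text`, the prompt length would be `inputs["input_ids"].shape[1]`. One option is to keep only the last `keep` tokens of `input_ids` and `attention_mask`, then pass `new_tokens` as `max_new_tokens`.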