StoryBox

📝 Introduction

This is the repository for the paper StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models, accepted at AAAI 2026.

Framework

StoryBox is a framework that leverages collaborative multi-agent simulation for hybrid bottom-up long-form story generation. By combining bottom-up character-driven agent interactions with top-down narrative planning, it dynamically constructs deep, coherent, and engaging story worlds.


⚡ Quick Start (with uv)

# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Clone
git clone https://huggingface.co/raazkumar/storybox-reproduction
cd storybox-reproduction

# 3. Install dependencies (uv handles everything)
uv sync

# 4. Run quick test (1 day, mock LLM, no API key needed)
uv run python reverie/test_run.py

# 5. Run full simulation (14 days, requires API key)
export OPENAI_API_KEY="sk-..."
uv run python reverie/run.py

⚙️ Installation Options

# Base install (OpenAI only)
uv sync

# Apple Silicon: native MLX (fastest on M1/M2/M3/M4)
uv sync --extra mlx

# NVIDIA NIM inference
uv sync --extra nim

# Ollama local inference
uv sync --extra ollama

# All local backends (MLX + Ollama)
uv sync --extra local

# Everything (all providers + dev tools)
uv sync --extra all --extra dev
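The extras above map to optional-dependency groups in pyproject.toml. A sketch of what that section might look like; the package names here are assumptions, check the repository's actual pyproject.toml:

```toml
# Illustrative only; see the repository's pyproject.toml for the real groups.
[project.optional-dependencies]
mlx    = ["mlx-lm"]
nim    = ["openai"]          # NIM exposes an OpenAI-compatible API
ollama = ["ollama"]
local  = ["mlx-lm", "ollama"]
```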

🚀 Supported LLM Providers

Provider        Config                              Speed            Setup
OpenAI          gpt-4o-mini                         Fastest          API key
Ollama          gemma4                              Fast             brew install ollama
MLX (Apple) ⭐   llama3.1-8b-mlx                     Fastest on Mac   uv sync --extra mlx
NVIDIA NIM      nvidia/meta/llama-3.1-8b-instruct   Fast             API key

Switch providers by changing one line in reverie/config/config.py:

llm_model_name = 'llama3.1-8b-mlx'      # MLX native (Apple)
llm_model_name = 'gemma4'               # Ollama
llm_model_name = 'nvidia/meta/llama-3.1-8b-instruct'  # NIM
llm_model_name = 'gpt-4o-mini'          # OpenAI
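The provider router in reverie/common/llm.py dispatches on this single config value. A minimal sketch of how such routing might work; the prefix rules and backend names below are assumptions, not the repository's actual logic:

```python
def route_backend(llm_model_name: str) -> str:
    """Pick an inference backend from the configured model name.

    Hypothetical sketch: the prefix/suffix rules below are assumptions,
    not the actual logic in reverie/common/llm.py.
    """
    if llm_model_name.startswith("nvidia/"):
        return "nim"        # NVIDIA NIM hosted inference
    if llm_model_name.endswith("-mlx"):
        return "mlx"        # native MLX on Apple Silicon
    if llm_model_name.startswith("gpt-"):
        return "openai"     # OpenAI API
    return "ollama"         # fall back to local Ollama

print(route_backend("llama3.1-8b-mlx"))   # mlx
print(route_backend("gpt-4o-mini"))       # openai
```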

📁 Project Structure

storybox/
├── reverie/
│   ├── run.py                    # Main entry point
│   ├── test_run.py               # Quick 1-day test
│   ├── config/config.py          # All settings
│   ├── agent/storyteller.py      # Story generation
│   ├── persona/                  # Characters + cognition
│   ├── environment/world.py      # Sandbox world
│   ├── common/llm.py             # LLM provider router
│   ├── common/mlx_llm.py         # Native MLX (Apple)
│   └── prompts/prompt-1/         # 30+ prompt templates
├── data/story01-20/              # 20 story settings
├── pyproject.toml                # uv project config
└── uv.lock                       # Locked dependencies

🛠️ Common Commands

# Run with uv (recommended)
uv run python reverie/run.py
uv run python reverie/test_run.py

# Install additional dependencies
uv add gradio
uv add --dev pytest

# Lock dependencies
uv lock

# Update dependencies
uv sync --upgrade

# Run tests
uv run pytest

# Format code
uv run ruff format .
uv run ruff check --fix .

# Type check
uv run mypy reverie/

🇮🇳 Hindi / Multilingual Stories

# Generate English story
uv run python reverie/run.py

# Translate to Hindi (post-generation)
# See GRADIO_UI_GUIDE.md for full pipeline

Recommended approach: English simulation → Hindi storyteller. Characters plan and chat in English; the final story is written in Hindi.
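The hand-off amounts to a prompt-assembly step: run the simulation in English, then feed the English event log to a storyteller prompt that requests Hindi output. The function and prompt wording below are illustrative assumptions, not the repository's actual templates (those live under reverie/prompts/prompt-1/):

```python
def build_hindi_storyteller_prompt(english_events: list[str]) -> str:
    """Assemble a Hindi-output storyteller prompt from English events.

    Hypothetical sketch: the real prompt templates in
    reverie/prompts/prompt-1/ will differ from this wording.
    """
    log = "\n".join(f"- {event}" for event in english_events)
    return (
        "You are a storyteller. Using the simulation events below "
        "(recorded in English), write the final story in Hindi.\n\n"
        f"Events:\n{log}\n"
    )

prompt = build_hindi_storyteller_prompt(
    ["Asha opens the tea stall at dawn", "Ravi argues about the missing letter"]
)
```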


๐Ÿญ Synthetic Data Generation

# Generate 1000 stories for training data
uv run python scripts/generate_synthetic_dataset.py \
    --num-stories 1000 \
    --output stories.jsonl

# Convert to instruction format
uv run python scripts/to_instruction_format.py \
    --input stories.jsonl \
    --output train.json
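The exact schema used by scripts/to_instruction_format.py is not shown here; as a sketch, the conversion amounts to wrapping each generated story in a prompt/response pair. The field names premise, story, instruction, and output below are assumptions; adjust them to the actual JSONL schema:

```python
import json

def to_instruction_records(jsonl_text: str) -> list[dict]:
    """Convert story JSONL lines into instruction-tuning records.

    Field names are assumptions for illustration; adapt them to the
    schema actually emitted by generate_synthetic_dataset.py.
    """
    records = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        story = json.loads(line)
        records.append({
            "instruction": f"Write a long-form story based on this premise: {story['premise']}",
            "output": story["story"],
        })
    return records

sample = '{"premise": "a lighthouse keeper finds a map", "story": "Once..."}'
print(to_instruction_records(sample)[0]["output"])  # Once...
```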

📚 Citation

@inproceedings{chen2026storybox,
  title     = {StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models},
  author    = {Chen, Zehao and Pan, Rong and Li, Haoran},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume    = {40},
  number    = {36},
  pages     = {30359--30367},
  year      = {2026}
}

⚖️ License

This project is licensed under the MIT License.

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'raazkumar/storybox-reproduction'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.
