---
tags:
- ml-intern
---
# StoryBox
## 📝 Introduction
This is the repository for the paper [StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models](https://ojs.aaai.org/index.php/AAAI/article/view/40288), accepted at **AAAI 2026**.

**StoryBox** is a framework that leverages collaborative multi-agent simulation for hybrid bottom-up long-form story generation. By combining bottom-up character-driven agent interactions with top-down narrative planning, it dynamically constructs deep, coherent, and engaging story worlds.
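
For intuition, the two layers can be pictured as a character-driven simulation loop followed by a storytelling pass. The sketch below is purely conceptual; every name in it is an illustrative placeholder, not the actual API under `reverie/`.

```python
# Conceptual sketch of the hybrid bottom-up / top-down loop.
# All names here are illustrative placeholders, not the StoryBox API.
from dataclasses import dataclass

@dataclass
class Persona:
    name: str

    def act(self, event_log: list, day: int) -> None:
        # Bottom-up: a character agent perceives the world and contributes an event.
        event_log.append(f"Day {day}: {self.name} acts in the sandbox world.")

def generate_story(personas: list, num_days: int = 14) -> str:
    event_log: list = []
    for day in range(1, num_days + 1):
        for persona in personas:
            persona.act(event_log, day)
    # Top-down: a storyteller pass organizes the accumulated events into a narrative.
    return "\n".join(event_log)

print(generate_story([Persona("Ava"), Persona("Ben")], num_days=2))
```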
---
## ⚡ Quick Start (with uv)
```bash
# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Clone
git clone https://huggingface.co/raazkumar/storybox-reproduction
cd storybox-reproduction
# 3. Install dependencies (uv handles everything)
uv sync
# 4. Run quick test (1 day, mock LLM — no API key needed)
uv run python reverie/test_run.py
# 5. Run full simulation (14 days, requires API key)
export OPENAI_API_KEY="sk-..."
uv run python reverie/run.py
```
---
## ⚙️ Installation Options
```bash
# Base install (OpenAI only)
uv sync
# Apple Silicon — native MLX (fastest on M1/M2/M3/M4)
uv sync --extra mlx
# NVIDIA NIM inference
uv sync --extra nim
# Ollama local inference
uv sync --extra ollama
# All local backends (MLX + Ollama)
uv sync --extra local
# Everything (all providers + dev tools)
uv sync --extra all --extra dev
```
---
## 🚀 Supported LLM Providers
| Provider | Config | Speed | Setup |
|----------|--------|-------|-------|
| **OpenAI** | `gpt-4o-mini` | Fastest | API key |
| **Ollama** | `gemma4` | Fast | `brew install ollama` |
| **MLX (Apple)** ⭐ | `llama3.1-8b-mlx` | **Fastest on Mac** | `uv sync --extra mlx` |
| **NVIDIA NIM** | `nvidia/meta/llama-3.1-8b-instruct` | Fast | API key |
Switch providers by changing **one line** in `reverie/config/config.py`:
```python
# Set exactly one of the following:
llm_model_name = 'llama3.1-8b-mlx'                    # MLX native (Apple)
llm_model_name = 'gemma4'                             # Ollama
llm_model_name = 'nvidia/meta/llama-3.1-8b-instruct'  # NIM
llm_model_name = 'gpt-4o-mini'                        # OpenAI
```
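
Per the project structure below, `reverie/common/llm.py` routes requests to the selected backend. The sketch below shows how such a router could dispatch on the model name; the dispatch rules and function name are assumptions, not the file's actual contents.

```python
# Hypothetical sketch of a provider router keyed on llm_model_name.
# The rules and helper name are assumptions, not reverie/common/llm.py itself.

def route_provider(llm_model_name: str) -> str:
    """Pick a backend from the configured model name."""
    if llm_model_name.endswith("-mlx"):
        return "mlx"        # native Apple Silicon backend (common/mlx_llm.py)
    if llm_model_name.startswith("nvidia/"):
        return "nim"        # NVIDIA NIM hosted inference
    if llm_model_name.startswith("gpt-"):
        return "openai"     # OpenAI API (requires OPENAI_API_KEY)
    return "ollama"         # default to a local Ollama model

for name in ["llama3.1-8b-mlx", "gemma4",
             "nvidia/meta/llama-3.1-8b-instruct", "gpt-4o-mini"]:
    print(name, "->", route_provider(name))
```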
---
## 📁 Project Structure
```
storybox/
├── reverie/
│   ├── run.py               # Main entry point
│   ├── test_run.py          # Quick 1-day test
│   ├── config/config.py     # All settings
│   ├── agent/storyteller.py # Story generation
│   ├── persona/             # Characters + cognition
│   ├── environment/world.py # Sandbox world
│   ├── common/llm.py        # LLM provider router
│   ├── common/mlx_llm.py    # Native MLX (Apple)
│   └── prompts/prompt-1/    # 30+ prompt templates
├── data/story01-20/         # 20 story settings
├── pyproject.toml           # uv project config
└── uv.lock                  # Locked dependencies
```
---
## 🛠️ Common Commands
```bash
# Run with uv (recommended)
uv run python reverie/run.py
uv run python reverie/test_run.py
# Install additional dependencies
uv add gradio
uv add --dev pytest
# Lock dependencies
uv lock
# Update dependencies
uv sync --upgrade
# Run tests
uv run pytest
# Format code
uv run ruff format .
uv run ruff check --fix .
# Type check
uv run mypy reverie/
```
---
## 🇮🇳 Hindi / Multilingual Stories
```bash
# Generate English story
uv run python reverie/run.py
# Translate to Hindi (post-generation)
# See GRADIO_UI_GUIDE.md for full pipeline
```
**Recommended approach**: English simulation → Hindi storyteller. Characters plan and chat in English; the final story is written in Hindi.
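
A rough sketch of that post-generation step, assuming an OpenAI-compatible client; the prompt, model choice, and function name are illustrative and not the pipeline documented in GRADIO_UI_GUIDE.md.

```python
# Illustrative post-generation translation step (not the actual pipeline
# from GRADIO_UI_GUIDE.md). Assumes the openai package and OPENAI_API_KEY.
from openai import OpenAI

def retell_in_hindi(english_story: str, model: str = "gpt-4o-mini") -> str:
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are a Hindi storyteller. Rewrite the story in natural, literary Hindi."},
            {"role": "user", "content": english_story},
        ],
    )
    return response.choices[0].message.content
```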
---
## 🏭 Synthetic Data Generation
```bash
# Generate 1000 stories for training data
uv run python scripts/generate_synthetic_dataset.py \
    --num-stories 1000 \
    --output stories.jsonl
# Convert to instruction format
uv run python scripts/to_instruction_format.py \
    --input stories.jsonl \
    --output train.json
```
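
Conceptually, the conversion step turns each generated story record into an instruction/response pair. The sketch below assumes hypothetical field names (`premise`, `story`, `instruction`, `output`); check `scripts/to_instruction_format.py` for the real schema.

```python
# Hedged sketch of the jsonl -> instruction-format conversion.
# Field names are assumptions, not the script's actual schema.
import json

def to_instruction_records(jsonl_path: str) -> list:
    records = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            story = json.loads(line)
            records.append({
                "instruction": f"Write a long-form story based on this premise: {story['premise']}",
                "output": story["story"],
            })
    return records

with open("train.json", "w", encoding="utf-8") as f:
    json.dump(to_instruction_records("stories.jsonl"), f, ensure_ascii=False, indent=2)
```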
---
## 📚 Citation
```bibtex
@inproceedings{chen2026storybox,
  title     = {StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models},
  author    = {Chen, Zehao and Pan, Rong and Li, Haoran},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume    = {40},
  number    = {36},
  pages     = {30359--30367},
  year      = {2026}
}
```
---
## ⚖️ License
This project is licensed under the [MIT License](https://opensource.org/license/MIT).
## Generated by ML Intern
This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = 'raazkumar/storybox-reproduction'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.
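
Continuing from the loading snippet above, a minimal generation call for a causal checkpoint might look like this; the prompt and sampling parameters are illustrative.

```python
# Example generation call (illustrative prompt; tune max_new_tokens as needed).
inputs = tokenizer("Once upon a time in a quiet village,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```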