---
license: apache-2.0
base_model: Qwen/Qwen3-8B-Base
library_name: llama.cpp
pipeline_tag: text-generation
---

Hermes-Bonsai Karpathy Self-Improving Agent Loop

Stage 2 checkpoint for the Hermes/Bonsai Karpathy auto-research loop.

Last updated: 2026-04-05

This release is inspired by Andrej Karpathy's framing of self-improving training loops and auto-research. It contains the model artifact that worked, plus a concise model card explaining how it was produced and how to run it.

Overview

  • Base model: Qwen3-8B-Base
  • Training method: supervised fine-tuning via the Hermes/Karpathy loop
  • Stage: Stage 2 — the checkpoint that worked
  • Known limitation: Stage 3 exposed a learned-helplessness pattern on some tasks; that behavior is documented in the GitHub methodology repo
  • License: Apache-2.0 for this release; the underlying base model license also applies to the inherited Qwen3-8B-Base components

What went into this checkpoint

  • The loop-produced training curriculum and trace distillation pipeline
  • 140 verified raw passes used as positive reinforcement for curriculum rebalancing and trace selection
    • These are Bonsai's own unedited outputs that passed teacher evaluation
  • 10 domains covered across the build
  • Validation signal from a mixed-domain batch

Domains covered

  • memory_integration
  • refusal_redirect
  • self_correction
  • agent_routing
  • devops
  • logic_puzzle
  • code_debugging
  • math
  • architecture
  • research_synthesis

Strongest domains

Best performance concentrated in:

  • memory_integration
  • refusal_redirect
  • self_correction

Validation metrics

  • Mixed-domain batch: 13/50 raw passes
  • Raw pass rate: 26%
  • This checkpoint is the Stage 2 model that produced those verified passes

What's novel

This checkpoint was trained via a graduation protocol with teacher-guided validation, raw-pass reinforcement, and frontier failure analysis. The interesting contribution is the loop methodology; see the GitHub repo for the full curriculum and training workflow.

GitHub methodology

The training loop, curriculum design, graduation protocol, and detailed methodology live here:

https://github.com/aurous37-lang/Hermes-Bonsai-Self-Improving-Agent-Loop

Files in this Hugging Face repo

  • bonsai-8b-stage2-post-curriculum-q8.gguf — the shipped stage 2 checkpoint
  • README.md — this model card
  • LICENSE — Apache-2.0 license

How to use

Recommended working config from the stable local run:

  • --ctx-size 40960
  • --n-gpu-layers 37

llama.cpp

./llama-cli -m bonsai-8b-stage2-post-curriculum-q8.gguf \
  --ctx-size 40960 \
  --n-gpu-layers 37 \
  -p "Explain the CAP theorem for a backend engineer."

llama-server

./llama-server -m bonsai-8b-stage2-post-curriculum-q8.gguf \
  --ctx-size 40960 \
  --n-gpu-layers 37 \
  --host 0.0.0.0 --port 8080

Then point your client at the local OpenAI-compatible endpoint exposed by llama-server.
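As a minimal sketch of that client side, the snippet below builds and sends a `/v1/chat/completions` request to the local server using only the Python standard library. The host, port, and endpoint path match the `llama-server` flags above; the model name is an arbitrary placeholder (llama-server serves whatever model it was launched with), and `temperature` is an illustrative default, not a tuned value.

```python
# Query the local llama-server OpenAI-compatible endpoint (stdlib only).
import json
import urllib.request


def build_chat_request(prompt: str, host: str = "localhost", port: int = 8080):
    """Build the URL and JSON body for a /v1/chat/completions call."""
    url = f"http://{host}:{port}/v1/chat/completions"
    body = {
        # Placeholder name: llama-server ignores it but the API schema requires it.
        "model": "bonsai-8b-stage2",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,  # illustrative default, not a tuned setting
    }
    return url, json.dumps(body).encode("utf-8")


def chat(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    url, data = build_chat_request(prompt)
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


# Usage (with llama-server running):
#   chat("Explain the CAP theorem for a backend engineer.")
```

Any OpenAI-compatible client library pointed at `http://localhost:8080/v1` should work the same way; the raw request above just makes the wire format explicit.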

Notes

  • This is a release checkpoint, not the full training corpus.
  • The GitHub repo contains the code and documentation needed to reproduce the loop.
  • The Hugging Face repo contains the model artifact that ships from that loop.