---
title: Python Bug Fixer OpenEnv
emoji: 🐛
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

# Python Bug Fixer — OpenEnv

An OpenEnv-compliant environment where an AI agent must identify and fix bugs in Python code to produce the correct program output. It simulates real-world software debugging and code-review workflows.

---

## Environment Description

The agent receives a buggy Python code snippet along with a description of the expected behavior. The agent's action is the corrected Python code. The environment executes the code and rewards the agent based on how many expected output lines are produced correctly.

---

## Observation Space

**Type:** Text

The observation contains:

- Task description and difficulty
- Expected stdout output (ground truth)
- The buggy Python code to fix

---

## Action Space

**Type:** Text

The action is raw Python code (no markdown, no code fences). It must be valid Python that can be executed with `python3`.

---

## Tasks

| Task ID | Name | Difficulty | Bugs | Max Steps |
|---------|------|------------|------|-----------|
| `task_easy` | Fix Index Errors | Easy | 2 | 5 |
| `task_medium` | Fix Binary Search | Medium | 2 | 5 |
| `task_hard` | Fix DataProcessor Class | Hard | 3 | 7 |

### Reward Function

- Reward ∈ [0.0, 1.0]
- Each of the `N` expected output lines is worth `1 / N` reward
- Partial credit is awarded for partially correct fixes
- Code that crashes with a runtime error: 0.1 partial credit if some output was produced

---

## Setup & Run Locally

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Start the server
uvicorn app.main:app --host 0.0.0.0 --port 7860

# 3.
# Test endpoints
curl http://localhost:7860/health
curl http://localhost:7860/tasks
```

---

## Run Inference

```bash
export API_BASE_URL="https://api-inference.huggingface.co/v1"
export MODEL_NAME="meta-llama/Meta-Llama-3-8B-Instruct"
export HF_TOKEN="hf_YOUR_TOKEN_HERE"
export SPACE_URL="https://YOUR_USERNAME-python-bug-fixer.hf.space"

python inference.py
```

Expected output format:

```
[START] {"task_id": "task_easy", "session_id": "...", "model": "...", "timestamp": "..."}
[STEP] {"step": 1, "reward": 1.0, "done": true, ...}
[END] {"task_id": "task_easy", "total_reward": 1.0, "steps": 1, "success": true, ...}
```

---

## API Reference

### `POST /reset`

Start a new episode.

Request:

```json
{ "task_id": "task_easy" }
```

Response:

```json
{ "session_id": "...", "task_id": "...", "observation": "...", "info": {} }
```

### `POST /step`

Submit fixed code as the action.

Request:

```json
{ "session_id": "...", "action": "def get_last_element(lst): ..." }
```

Response:

```json
{ "observation": "...", "reward": 1.0, "done": true, "info": {} }
```

### `GET /state?session_id=...`

Get the current episode state without advancing it.

Response:

```json
{ "session_id": "...", "task_id": "...", "steps": 1, "done": true, "current_observation": "..." }
```

### `GET /tasks`

List all available tasks and their metadata.

### `GET /health`

Returns `{"status": "ok"}`.

---

## Docker

```bash
docker build -t python-bug-fixer .
docker run -p 7860:7860 python-bug-fixer
```

---

## Project Structure

```
my-openenv/
├── inference.py          # Baseline inference script (root — required)
├── openenv.yaml          # OpenEnv specification
├── Dockerfile            # Container definition
├── requirements.txt      # Python dependencies
├── README.md
└── app/
    ├── __init__.py
    ├── main.py            # FastAPI server (reset/step/state endpoints)
    ├── models.py          # Pydantic request/response models
    └── tasks/
        ├── __init__.py    # Task registry
        ├── base.py        # BaseTask + safe code runner
        ├── task_easy.py   # Easy task (2 index bugs)
        ├── task_medium.py  # Medium task (2 binary search bugs)
        └── task_hard.py   # Hard task (3 DataProcessor bugs)
```
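
---

## Example: Minimal Python Client

The `/reset` and `/step` endpoints can be exercised with a short stdlib-only client. This is a minimal sketch, not the bundled `inference.py`: the `post_json`, `strip_code_fences`, and `run_episode` helpers and the `http://localhost:7860` base URL are illustrative assumptions; the request and response payloads follow the API Reference above.

```python
import json
import urllib.request

SPACE_URL = "http://localhost:7860"  # assumption: local server from "Setup & Run Locally"


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the environment server and decode the JSON reply."""
    req = urllib.request.Request(
        SPACE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def strip_code_fences(text: str) -> str:
    """The action must be raw Python, so drop any markdown fence lines a model adds."""
    lines = [ln for ln in text.strip().splitlines() if not ln.startswith("```")]
    return "\n".join(lines)


def run_episode(task_id: str, fixed_code: str) -> float:
    """Reset the environment, submit the fixed code once, and return the reward."""
    obs = post_json("/reset", {"task_id": task_id})
    step = post_json("/step", {
        "session_id": obs["session_id"],
        "action": strip_code_fences(fixed_code),
    })
    return step["reward"]
```

With the server running locally, `run_episode("task_easy", model_reply)` returns a reward in `[0.0, 1.0]` for a single-step episode.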
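
---

## Example: Reward Scoring Sketch

The per-line rule in the Reward Function section (each of the `N` expected stdout lines worth `1 / N`) can be illustrated with a small scorer. This is a hypothetical sketch of that rule only, not the server's actual implementation, and it ignores the 0.1 crash partial credit:

```python
def score_output(produced: str, expected: str) -> float:
    """Hypothetical sketch: each of the N expected stdout lines is worth 1/N,
    compared position by position against the produced output."""
    expected_lines = expected.strip().splitlines()
    produced_lines = produced.strip().splitlines()
    if not expected_lines:
        return 0.0
    correct = sum(
        1
        for i, want in enumerate(expected_lines)
        if i < len(produced_lines) and produced_lines[i] == want
    )
    return correct / len(expected_lines)
```

For example, a fix that reproduces one of two expected lines scores `0.5`, matching the partial-credit behavior described above.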