Chirag0123 commited on
Commit
a5c1fa0
·
0 Parent(s):

v2.0 — agent reliability & evaluation layer

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. .gitignore +9 -0
  2. Dockerfile +31 -0
  3. README.md +144 -0
  4. inference.py +247 -0
  5. openenv.yaml +56 -0
  6. repo_templates/task1/variant_1/meta.json +15 -0
  7. repo_templates/task1/variant_1/src/auth.py +14 -0
  8. repo_templates/task1/variant_1/src/utils.py +16 -0
  9. repo_templates/task1/variant_1/tests/test_auth.py +23 -0
  10. repo_templates/task1/variant_2/meta.json +15 -0
  11. repo_templates/task1/variant_2/src/calculator.py +23 -0
  12. repo_templates/task1/variant_2/src/helpers.py +14 -0
  13. repo_templates/task1/variant_2/tests/test_calculator.py +32 -0
  14. repo_templates/task1/variant_3/meta.json +15 -0
  15. repo_templates/task1/variant_3/src/inventory.py +26 -0
  16. repo_templates/task1/variant_3/src/logger.py +9 -0
  17. repo_templates/task1/variant_3/tests/test_inventory.py +44 -0
  18. repo_templates/task1/variant_4/meta.json +15 -0
  19. repo_templates/task1/variant_4/src/scheduler.py +34 -0
  20. repo_templates/task1/variant_4/src/time_helpers.py +12 -0
  21. repo_templates/task1/variant_4/tests/test_scheduler.py +52 -0
  22. repo_templates/task1/variant_5/meta.json +15 -0
  23. repo_templates/task1/variant_5/src/constants.py +4 -0
  24. repo_templates/task1/variant_5/src/formatter.py +29 -0
  25. repo_templates/task1/variant_5/tests/test_formatter.py +35 -0
  26. repo_templates/task2/variant_1/meta.json +13 -0
  27. repo_templates/task2/variant_1/src/data_pipeline.py +12 -0
  28. repo_templates/task2/variant_1/src/models.py +10 -0
  29. repo_templates/task2/variant_1/src/validator.py +7 -0
  30. repo_templates/task2/variant_1/tests/test_pipeline.py +18 -0
  31. repo_templates/task2/variant_2/meta.json +13 -0
  32. repo_templates/task2/variant_2/src/config.py +5 -0
  33. repo_templates/task2/variant_2/src/email_sender.py +25 -0
  34. repo_templates/task2/variant_2/src/template_engine.py +26 -0
  35. repo_templates/task2/variant_2/tests/test_email.py +23 -0
  36. repo_templates/task2/variant_3/meta.json +13 -0
  37. repo_templates/task2/variant_3/src/inventory_checker.py +33 -0
  38. repo_templates/task2/variant_3/src/models.py +10 -0
  39. repo_templates/task2/variant_3/src/order_processor.py +20 -0
  40. repo_templates/task2/variant_3/tests/test_orders.py +27 -0
  41. repo_templates/task2/variant_4/meta.json +13 -0
  42. repo_templates/task2/variant_4/src/date_formatter.py +28 -0
  43. repo_templates/task2/variant_4/src/models.py +3 -0
  44. repo_templates/task2/variant_4/src/report_builder.py +28 -0
  45. repo_templates/task2/variant_4/tests/test_reports.py +28 -0
  46. repo_templates/task2/variant_5/meta.json +13 -0
  47. repo_templates/task2/variant_5/src/cache_manager.py +36 -0
  48. repo_templates/task2/variant_5/src/config.py +4 -0
  49. repo_templates/task2/variant_5/src/serializer.py +25 -0
  50. repo_templates/task2/variant_5/tests/test_cache.py +37 -0
.gitignore ADDED
@@ -0,0 +1,9 @@
 
 
 
 
 
 
 
 
 
 
1
+ __pycache__/
2
+ *.pyc
3
+ *.pyo
4
+ venv/
5
+ .env
6
+ *.egg-info/
7
+ dist/
8
+ build/
9
+ .pytest_cache/
Dockerfile ADDED
@@ -0,0 +1,31 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ FROM python:3.11-slim
2
+
3
+ # Create non-root user for security — MANDATORY for running agent code safely
4
+ RUN useradd -m -u 1000 envuser
5
+
6
+ WORKDIR /app
7
+
8
+ # Install system dependencies
9
+ RUN apt-get update && apt-get install -y \
10
+ git \
11
+ && rm -rf /var/lib/apt/lists/*
12
+
13
+ # Copy and install Python dependencies first (layer caching)
14
+ COPY requirements.txt .
15
+ RUN pip install --no-cache-dir -r requirements.txt
16
+
17
+ # Copy project
18
+ COPY . .
19
+
20
+ # Make repo_templates readable
21
+ RUN chmod -R 755 repo_templates/
22
+
23
+ # Create temp directory for working copies
24
+ RUN mkdir -p /tmp/openenv_work && chmod 777 /tmp/openenv_work
25
+
26
+ # Switch to non-root for security
27
+ USER envuser
28
+
29
+ EXPOSE 7860
30
+
31
+ CMD ["uvicorn", "server.app:app", "--host", "0.0.0.0", "--port", "7860", "--workers", "1"]
README.md ADDED
@@ -0,0 +1,144 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ title: Codebase Navigation Repair OpenEnv
3
+ emoji: 🔍
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: docker
7
+ pinned: false
8
+ app_port: 7860
9
+ license: mit
10
+ tags:
11
+ - openenv
12
+ - reinforcement-learning
13
+ - coding-agent
14
+ ---
15
+
16
+ # Codebase Navigation & Repair — OpenEnv Environment v2.0
17
+
18
+ **An RL environment + evaluation layer that makes AI coding agents reliable, testable, and debuggable.**
19
+
20
+ AI agents navigate unfamiliar Python codebases, identify bugs, and implement features — graded by running actual tests. Unlike existing benchmarks, this system provides **process-level evaluation**, not just final output scoring.
21
+
22
+ ## Why This Exists
23
+
24
+ Every coding agent (Devin, Cursor, Copilot, Codex) fails ~25%+ on complex tasks. Current benchmarks tell you the agent scored 0.4 but not **why** it failed. This environment answers:
25
+
26
+ - Did the agent explore strategically or waste steps?
27
+ - Did it verify its fixes before submitting?
28
+ - Can it resist misleading comments and prompt injection?
29
+ - How efficiently does it use its context window?
30
+
31
+ ## Architecture
32
+
33
+ ```
34
+ ┌──────────────────────────────────────────────────────────┐
35
+ │ FastAPI Server │
36
+ │ /reset /step /state /trajectory /evaluate /metrics │
37
+ └──────────┬───────────────────────────────────────────────┘
38
+
39
+ ┌──────────▼───────────────────────────────────────────────┐
40
+ │ CodebaseNavEnvironment (extended) │
41
+ │ │
42
+ │ ┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ │
43
+ │ │ Trajectory │ │ Evaluator │ │ Security │ │
44
+ │ │ Logger │ │ (process) │ │ Scanner │ │
45
+ │ └─────────────┘ └──────────────┘ └─────────────────┘ │
46
+ │ ┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ │
47
+ │ │ Fault │ │ Memory │ │ Grader │ │
48
+ │ │ Injector │ │ Tracker │ │ (pytest) │ │
49
+ │ └─────────────┘ └──────────────┘ └─────────────────┘ │
50
+ └──────────────────────────────────────────────────────────┘
51
+ ```
52
+
53
+ ## Tasks
54
+
55
+ | Task | Difficulty | Description |
56
+ |------|-----------|-------------|
57
+ | task1 | Easy | Single-file bug repair (5 variants) |
58
+ | task2 | Medium | Cross-module interface bug + regression test (5 variants) |
59
+ | task3 | Hard | Feature implementation from spec (5 variants) |
60
+
61
+ ## API Endpoints
62
+
63
+ ### Core (OpenEnv-compliant)
64
+ | Endpoint | Method | Description |
65
+ |----------|--------|-------------|
66
+ | `/reset?task=task1` | POST | Start new episode |
67
+ | `/step` | POST | Take one action |
68
+ | `/state` | GET | Get current state |
69
+ | `/health` | GET | Health check |
70
+
71
+ ### Evaluation Layer (v2.0)
72
+ | Endpoint | Method | Description |
73
+ |----------|--------|-------------|
74
+ | `/trajectory` | GET | Full action log with timing, diffs, security flags |
75
+ | `/evaluate` | GET | Multi-dimensional scores (6 axes) |
76
+ | `/metrics` | GET | Comprehensive stats: memory, security, timeline |
77
+ | `/fault-config` | POST | Enable fault injection: "none", "light", "heavy" |
78
+
79
+ ## Multi-Dimensional Evaluation
80
+
81
+ The `/evaluate` endpoint scores agents across **6 quality dimensions**:
82
+
83
+ | Dimension | Weight | What It Measures |
84
+ |-----------|--------|-----------------|
85
+ | Efficiency | 20% | Steps used vs optimal path |
86
+ | Navigation | 15% | Read relevant files first? Explored strategically? |
87
+ | Correctness | 30% | Final test pass rate + regression detection |
88
+ | Reasoning | 15% | read→write→test pattern adherence |
89
+ | Robustness | 10% | Error recovery + fault injection handling |
90
+ | Security | 10% | Unsafe code detection + prompt injection resistance |
91
+
92
+ ## Fault Injection
93
+
94
+ Test agent robustness by injecting controlled faults:
95
+
96
+ ```bash
97
+ # Enable heavy fault injection
98
+ curl -X POST http://localhost:7860/fault-config -d '{"level":"heavy"}'
99
+
100
+ # Next reset will inject:
101
+ # - Misleading "BUG:" comments on correct lines
102
+ # - Red herring files that look buggy but aren't
103
+ # - Noisy docstrings claiming code is correct
104
+ ```
105
+
106
+ ## Quick Start
107
+
108
+ ### Local
109
+ ```bash
110
+ pip install -r requirements.txt
111
+ uvicorn server.app:app --host 0.0.0.0 --port 7860
112
+ ```
113
+
114
+ ### Docker
115
+ ```bash
116
+ docker build -t codebase-nav-env .
117
+ docker run -p 7860:7860 codebase-nav-env
118
+ ```
119
+
120
+ ### Run Inference
121
+ ```bash
122
+ export HF_TOKEN=your_token
123
+ export ENV_BASE_URL=http://localhost:7860
124
+ python inference.py
125
+ ```
126
+
127
+ ## Example Output: `/evaluate`
128
+ ```json
129
+ {
130
+ "composite_score": 0.874,
131
+ "dimensions": {
132
+ "efficiency": {"score": 0.8, "evidence": ["Used 5 steps vs 4 optimal"]},
133
+ "navigation": {"score": 1.0, "evidence": ["Good: first read was relevant file"]},
134
+ "correctness": {"score": 0.714, "evidence": ["No test regressions"]},
135
+ "reasoning": {"score": 1.0, "evidence": ["Agent tested after writing"]},
136
+ "robustness": {"score": 1.0, "evidence": ["Clean execution"]},
137
+ "security": {"score": 1.0, "evidence": ["No security violations"]}
138
+ }
139
+ }
140
+ ```
141
+
142
+ ## License
143
+
144
+ MIT
inference.py ADDED
@@ -0,0 +1,247 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ inference.py — Mandatory OpenEnv baseline inference script.
4
+ Runs an LLM agent against all 3 tasks and emits required log format.
5
+
6
+ Environment variables required:
7
+ API_BASE_URL — LLM API endpoint
8
+ MODEL_NAME — model identifier
9
+ HF_TOKEN — Hugging Face API token
10
+ """
11
+ import os
12
+ import json
13
+ import textwrap
14
+ from typing import List, Optional
15
+
16
+ from openai import OpenAI
17
+ import httpx
18
+
19
+ # ── Configuration ─────────────────────────────────────────────────────────────
20
+ API_KEY = os.getenv("HF_TOKEN") or os.getenv("API_KEY")
21
+ API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
22
+ MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
23
+ ENV_BASE_URL = os.getenv("ENV_BASE_URL", "http://localhost:7860")
24
+
25
+ MAX_STEPS_PER_TASK = {"task1": 12, "task2": 18, "task3": 22}
26
+ TEMPERATURE = 0.2
27
+ MAX_TOKENS = 800
28
+ SUCCESS_THRESHOLD = 0.5
29
+
30
+ TASKS = ["task1", "task2", "task3"]
31
+
32
+
33
+ # ── Logging helpers ────────────────────────────────────────────────────────────
34
+ def log_start(task: str, env: str, model: str) -> None:
35
+ print(f"[START] task={task} env={env} model={model}", flush=True)
36
+
37
+
38
+ def log_step(step: int, action: str, reward: float, done: bool, error: Optional[str]) -> None:
39
+ error_val = error if error else "null"
40
+ print(
41
+ f"[STEP] step={step} action={action} reward={reward:.2f} "
42
+ f"done={str(done).lower()} error={error_val}",
43
+ flush=True,
44
+ )
45
+
46
+
47
+ def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
48
+ rewards_str = ",".join(f"{r:.2f}" for r in rewards)
49
+ print(
50
+ f"[END] success={str(success).lower()} steps={steps} "
51
+ f"score={score:.3f} rewards={rewards_str}",
52
+ flush=True,
53
+ )
54
+
55
+
56
+ # ── Environment client ─────────────────────────────────────────────────────────
57
+ class EnvClient:
58
+ def __init__(self, base_url: str):
59
+ self.base_url = base_url.rstrip("/")
60
+ self.client = httpx.Client(timeout=60.0)
61
+
62
+ def reset(self, task: str) -> dict:
63
+ r = self.client.post(f"{self.base_url}/reset", params={"task": task})
64
+ r.raise_for_status()
65
+ return r.json()
66
+
67
+ def step(self, action: dict) -> dict:
68
+ r = self.client.post(f"{self.base_url}/step", json=action)
69
+ r.raise_for_status()
70
+ return r.json()
71
+
72
+ def state(self) -> dict:
73
+ r = self.client.get(f"{self.base_url}/state")
74
+ r.raise_for_status()
75
+ return r.json()
76
+
77
+ def close(self):
78
+ self.client.close()
79
+
80
+
81
+ # ── LLM Agent ─────────────────────────────────────────────────────────────────
82
+ SYSTEM_PROMPT = textwrap.dedent("""
83
+ You are an expert software engineer working inside a Python code repository.
84
+ You can take the following actions (respond with ONLY a valid JSON object):
85
+
86
+ {"action_type": "read_file", "path": "src/some_file.py"}
87
+ {"action_type": "write_file", "path": "src/some_file.py", "content": "...full new content..."}
88
+ {"action_type": "run_tests", "path": "tests/test_something.py"}
89
+ {"action_type": "search_code", "query": "function_name_or_keyword"}
90
+ {"action_type": "submit"}
91
+
92
+ Strategy:
93
+ 1. ALWAYS read relevant source files before writing any fixes
94
+ 2. For task1/task2: read failing test file first to understand what is expected
95
+ 3. For task3: read FEATURE_SPEC.md first, then existing source files
96
+ 4. Run tests after writing a fix to verify improvement
97
+ 5. Submit only when confident tests will pass
98
+
99
+ Reply with ONLY the JSON action object. No explanation. No markdown. No extra text.
100
+ """).strip()
101
+
102
+
103
+ def build_user_prompt(obs: dict, step: int, history: List[str]) -> str:
104
+ tree_str = "\n".join(obs.get("repo_tree", []))
105
+ files_read_str = ", ".join(obs.get("files_read", [])) or "none yet"
106
+ failing_str = ", ".join(obs.get("failing_tests", [])) or "unknown"
107
+ last_result = obs.get("last_action_result") or "none"
108
+ last_error = obs.get("last_action_error") or "none"
109
+ steps_left = obs.get("steps_remaining", 0)
110
+ history_str = "\n".join(history[-5:]) if history else "none"
111
+
112
+ return textwrap.dedent(f"""
113
+ Step: {step}
114
+ Task: {obs.get('current_task')}
115
+ Description: {obs.get('task_description')}
116
+ Steps remaining: {steps_left}
117
+
118
+ Repository files:
119
+ {tree_str}
120
+
121
+ Files already read: {files_read_str}
122
+ Known failing tests: {failing_str}
123
+ Last action result: {last_result[:1000]}
124
+ Last action error: {last_error}
125
+
126
+ Recent history:
127
+ {history_str}
128
+
129
+ What is your next action? Reply with ONLY a JSON action object.
130
+ """).strip()
131
+
132
+
133
+ def get_agent_action(client: OpenAI, obs: dict, step: int, history: List[str]) -> dict:
134
+ user_prompt = build_user_prompt(obs, step, history)
135
+ try:
136
+ completion = client.chat.completions.create(
137
+ model=MODEL_NAME,
138
+ messages=[
139
+ {"role": "system", "content": SYSTEM_PROMPT},
140
+ {"role": "user", "content": user_prompt},
141
+ ],
142
+ temperature=TEMPERATURE,
143
+ max_tokens=MAX_TOKENS,
144
+ )
145
+ text = (completion.choices[0].message.content or "").strip()
146
+
147
+ # Strip markdown code fences if present
148
+ if text.startswith("```"):
149
+ text = text.split("```")[1]
150
+ if text.startswith("json"):
151
+ text = text[4:]
152
+
153
+ action = json.loads(text)
154
+ return action
155
+ except json.JSONDecodeError:
156
+ print(f"[DEBUG] Failed to parse action JSON: {text[:200]}", flush=True)
157
+ return {"action_type": "submit"} # Fallback
158
+ except Exception as e:
159
+ print(f"[DEBUG] LLM call failed: {e}", flush=True)
160
+ return {"action_type": "submit"}
161
+
162
+
163
+ def run_task(env_client: EnvClient, llm_client: OpenAI, task: str) -> tuple:
164
+ """Run one complete episode for a task. Returns (score, steps, rewards)."""
165
+ max_steps = MAX_STEPS_PER_TASK.get(task, 15)
166
+ benchmark = "codebase-nav-env"
167
+
168
+ rewards = []
169
+ history = []
170
+ steps_taken = 0
171
+ score = 0.0
172
+ success = False
173
+
174
+ log_start(task=task, env=benchmark, model=MODEL_NAME)
175
+
176
+ try:
177
+ reset_result = env_client.reset(task=task)
178
+ obs = reset_result["observation"]
179
+
180
+ for step_num in range(1, max_steps + 1):
181
+ if obs.get("steps_remaining", 0) <= 0:
182
+ break
183
+
184
+ action = get_agent_action(llm_client, obs, step_num, history)
185
+ action_str = json.dumps(action)
186
+
187
+ try:
188
+ step_result = env_client.step(action)
189
+ except Exception as e:
190
+ log_step(step_num, action_str, 0.0, True, str(e))
191
+ break
192
+
193
+ reward = step_result.get("reward", 0.0)
194
+ done = step_result.get("done", False)
195
+ error = step_result["observation"].get("last_action_error")
196
+
197
+ rewards.append(reward)
198
+ steps_taken = step_num
199
+ obs = step_result["observation"]
200
+
201
+ history.append(f"Step {step_num}: {action.get('action_type')} -> reward {reward:+.2f}")
202
+
203
+ log_step(step=step_num, action=action_str[:200], reward=reward, done=done, error=error)
204
+
205
+ if done:
206
+ # Get final score from state
207
+ state = env_client.state()
208
+ score = state.get("current_score", 0.0)
209
+ break
210
+
211
+ # If not done yet (step budget exhausted), force submit
212
+ if not obs.get("last_action_result", "").startswith("=== FINAL GRADER"):
213
+ try:
214
+ step_result = env_client.step({"action_type": "submit"})
215
+ state = env_client.state()
216
+ score = state.get("current_score", 0.0)
217
+ except Exception:
218
+ pass
219
+
220
+ success = score >= SUCCESS_THRESHOLD
221
+
222
+ except Exception as e:
223
+ print(f"[DEBUG] Episode error: {e}", flush=True)
224
+ finally:
225
+ log_end(success=success, steps=steps_taken, score=score, rewards=rewards)
226
+
227
+ return score, steps_taken, rewards
228
+
229
+
230
+ def main():
231
+ env_client = EnvClient(ENV_BASE_URL)
232
+ llm_client = OpenAI(base_url=API_BASE_URL, api_key=API_KEY)
233
+
234
+ all_scores = []
235
+ for task in TASKS:
236
+ score, steps, rewards = run_task(env_client, llm_client, task)
237
+ all_scores.append(score)
238
+ print(f"[INFO] {task} complete: score={score:.3f} steps={steps}", flush=True)
239
+
240
+ avg_score = sum(all_scores) / len(all_scores)
241
+ print(f"[INFO] Average score across all tasks: {avg_score:.3f}", flush=True)
242
+
243
+ env_client.close()
244
+
245
+
246
+ if __name__ == "__main__":
247
+ main()
openenv.yaml ADDED
@@ -0,0 +1,56 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: codebase-nav-env
2
+ version: "2.0.0"
3
+ description: >
4
+ An RL environment where an LLM agent navigates an unfamiliar Python codebase,
5
+ finds bugs, and implements features by reading files and running tests.
6
+ Graded by actual pytest execution — fully deterministic.
7
+
8
+ author: Chirag0123
9
+ license: MIT
10
+
11
+ tasks:
12
+ - id: task1
13
+ name: "Single-file bug repair"
14
+ description: "Find and fix bugs in a Python module so all tests pass."
15
+ difficulty: easy
16
+ max_steps: 20
17
+ reward_range: [0.0, 1.0]
18
+
19
+ - id: task2
20
+ name: "Cross-module interface bug"
21
+ description: "Fix a type mismatch between two modules and add a regression test."
22
+ difficulty: medium
23
+ max_steps: 25
24
+ reward_range: [0.0, 1.0]
25
+
26
+ - id: task3
27
+ name: "Feature implementation from spec"
28
+ description: "Read FEATURE_SPEC.md and implement the feature across multiple files."
29
+ difficulty: hard
30
+ max_steps: 30
31
+ reward_range: [0.0, 1.0]
32
+
33
+ action_space:
34
+ type: text
35
+ schema:
36
+ action_type: string
37
+ path: string (optional)
38
+ content: string (optional)
39
+ query: string (optional)
40
+
41
+ observation_space:
42
+ type: structured
43
+ fields:
44
+ - repo_tree: list of file paths
45
+ - task_description: string
46
+ - failing_tests: list of test names
47
+ - files_read: list of paths read so far
48
+ - last_action_result: string
49
+ - steps_remaining: integer
50
+ - current_task: string
51
+
52
+ endpoints:
53
+ reset: POST /reset
54
+ step: POST /step
55
+ state: GET /state
56
+ health: GET /health
repo_templates/task1/variant_1/meta.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task1_v1",
3
+ "task": "task1",
4
+ "bug_files": ["src/auth.py"],
5
+ "bug_description": "validate_token uses != instead of == and get_user_permissions has off-by-one",
6
+ "failing_tests": ["test_valid_token", "test_user_permissions"],
7
+ "correct_lines": {
8
+ "src/auth.py": {
9
+ "return token != secret": "return token == secret",
10
+ "return permissions[user_id + 1]": "return permissions[user_id]"
11
+ }
12
+ },
13
+ "total_files": 3,
14
+ "optimal_steps": 4
15
+ }
repo_templates/task1/variant_1/src/auth.py ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ def validate_token(token: str, secret: str) -> bool:
2
+ """Validate a user token against the secret."""
3
+ if token is None:
4
+ return False
5
+ # BUG: should be == not !=
6
+ return token != secret
7
+
8
+
9
+ def get_user_permissions(user_id: int, permissions: list) -> list:
10
+ """Return permissions for a user ID."""
11
+ if user_id < 0:
12
+ return []
13
+ # BUG: off-by-one — should be permissions[user_id] not permissions[user_id + 1]
14
+ return permissions[user_id + 1] if user_id + 1 < len(permissions) else []
repo_templates/task1/variant_1/src/utils.py ADDED
@@ -0,0 +1,16 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Utility functions for the auth module."""


def sanitize_input(text: str) -> str:
    """Remove leading/trailing whitespace and normalize."""
    # Non-string input collapses to the empty string.
    return text.strip().lower() if isinstance(text, str) else ""


def format_response(status: str, data: dict = None) -> dict:
    """Format a standard API response."""
    payload = data if data else {}
    return {"status": status, "data": payload}
repo_templates/task1/variant_1/tests/test_auth.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.auth import validate_token, get_user_permissions
3
+
4
+
5
+ def test_valid_token():
6
+ assert validate_token("abc123", "abc123") == True # FAILS because of != bug
7
+
8
+
9
+ def test_invalid_token():
10
+ assert validate_token("wrong", "abc123") == False
11
+
12
+
13
+ def test_none_token():
14
+ assert validate_token(None, "abc123") == False
15
+
16
+
17
+ def test_user_permissions():
18
+ perms = ["read", "write", "admin"]
19
+ assert get_user_permissions(0, perms) == "read" # FAILS because of off-by-one bug
20
+
21
+
22
+ def test_negative_user_id():
23
+ assert get_user_permissions(-1, ["read"]) == []
repo_templates/task1/variant_2/meta.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task1_v2",
3
+ "task": "task1",
4
+ "bug_files": ["src/calculator.py"],
5
+ "bug_description": "divide() missing zero-division check; average() crashes on empty list",
6
+ "failing_tests": ["test_divide_by_zero", "test_average_empty"],
7
+ "correct_lines": {
8
+ "src/calculator.py": {
9
+ "return numerator / denominator": "if denominator == 0:\n return 0.0\n return numerator / denominator",
10
+ "total = sum(numbers)\n return total / len(numbers)": "if not numbers:\n return 0.0\n total = sum(numbers)\n return total / len(numbers)"
11
+ }
12
+ },
13
+ "total_files": 3,
14
+ "optimal_steps": 4
15
+ }
repo_templates/task1/variant_2/src/calculator.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Calculator module with basic math operations."""
2
+
3
+
4
+ def divide(numerator: float, denominator: float) -> float:
5
+ """Divide numerator by denominator safely."""
6
+ # BUG: missing zero-division check — should check denominator == 0
7
+ return numerator / denominator
8
+
9
+
10
+ def average(numbers: list) -> float:
11
+ """Calculate the average of a list of numbers."""
12
+ # BUG: doesn't handle empty list — should return 0.0 for empty
13
+ total = sum(numbers)
14
+ return total / len(numbers)
15
+
16
+
17
+ def clamp(value: float, min_val: float, max_val: float) -> float:
18
+ """Clamp a value between min and max."""
19
+ if value < min_val:
20
+ return min_val
21
+ if value > max_val:
22
+ return max_val
23
+ return value
repo_templates/task1/variant_2/src/helpers.py ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Helper utilities for the calculator module."""


def parse_number(value: str) -> float:
    """Parse a string to a float, returning 0.0 on failure."""
    try:
        result = float(value)
    except (ValueError, TypeError):
        result = 0.0
    return result


def format_result(value: float, decimals: int = 2) -> str:
    """Format a numeric result to a string with given decimal places."""
    pattern = "{:." + str(decimals) + "f}"
    return pattern.format(value)
repo_templates/task1/variant_2/tests/test_calculator.py ADDED
@@ -0,0 +1,32 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.calculator import divide, average, clamp
3
+
4
+
5
+ def test_divide_normal():
6
+ assert divide(10, 2) == 5.0
7
+
8
+
9
+ def test_divide_by_zero():
10
+ # FAILS — ZeroDivisionError because no zero check
11
+ assert divide(10, 0) == 0.0
12
+
13
+
14
+ def test_average_normal():
15
+ assert average([1, 2, 3]) == 2.0
16
+
17
+
18
+ def test_average_empty():
19
+ # FAILS — ZeroDivisionError because empty list not handled
20
+ assert average([]) == 0.0
21
+
22
+
23
+ def test_clamp_within():
24
+ assert clamp(5, 0, 10) == 5
25
+
26
+
27
+ def test_clamp_below():
28
+ assert clamp(-5, 0, 10) == 0
29
+
30
+
31
+ def test_clamp_above():
32
+ assert clamp(15, 0, 10) == 10
repo_templates/task1/variant_3/meta.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task1_v3",
3
+ "task": "task1",
4
+ "bug_files": ["src/inventory.py"],
5
+ "bug_description": "check_stock uses >= 0 instead of > 0; get_low_stock_items uses <= instead of <",
6
+ "failing_tests": ["test_out_of_stock", "test_low_stock_items"],
7
+ "correct_lines": {
8
+ "src/inventory.py": {
9
+ "return inventory[item_id] >= 0": "return inventory[item_id] > 0",
10
+ "if qty <= threshold": "if qty < threshold"
11
+ }
12
+ },
13
+ "total_files": 3,
14
+ "optimal_steps": 4
15
+ }
repo_templates/task1/variant_3/src/inventory.py ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Inventory management module."""
2
+
3
+
4
+ def check_stock(item_id: str, inventory: dict) -> bool:
5
+ """Check if an item is in stock (quantity > 0)."""
6
+ if item_id not in inventory:
7
+ return False
8
+ # BUG: should be > 0, not >= 0 (zero stock means out of stock)
9
+ return inventory[item_id] >= 0
10
+
11
+
12
+ def restock(item_id: str, quantity: int, inventory: dict) -> dict:
13
+ """Add stock for an item."""
14
+ if quantity < 0:
15
+ raise ValueError("Cannot restock negative quantity")
16
+ if item_id in inventory:
17
+ inventory[item_id] += quantity
18
+ else:
19
+ inventory[item_id] = quantity
20
+ return inventory
21
+
22
+
23
+ def get_low_stock_items(inventory: dict, threshold: int = 5) -> list:
24
+ """Return items with stock below threshold."""
25
+ # BUG: should be < threshold, not <= threshold
26
+ return [item for item, qty in inventory.items() if qty <= threshold]
repo_templates/task1/variant_3/src/logger.py ADDED
@@ -0,0 +1,9 @@
 
 
 
 
 
 
 
 
 
 
1
"""Logging utilities for inventory operations."""


def log_operation(operation: str, item_id: str, details: str = "") -> str:
    """Create a log entry for an inventory operation."""
    suffix = f" — {details}" if details else ""
    return f"[INVENTORY] {operation}: {item_id}{suffix}"
repo_templates/task1/variant_3/tests/test_inventory.py ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.inventory import check_stock, restock, get_low_stock_items
3
+
4
+
5
+ def test_in_stock():
6
+ inv = {"apple": 10, "banana": 5}
7
+ assert check_stock("apple", inv) == True
8
+
9
+
10
+ def test_out_of_stock():
11
+ inv = {"apple": 0}
12
+ # FAILS — returns True because >= 0 is wrong, should be > 0
13
+ assert check_stock("apple", inv) == False
14
+
15
+
16
+ def test_item_not_found():
17
+ assert check_stock("ghost", {}) == False
18
+
19
+
20
+ def test_restock_existing():
21
+ inv = {"apple": 5}
22
+ result = restock("apple", 3, inv)
23
+ assert result["apple"] == 8
24
+
25
+
26
+ def test_restock_new():
27
+ inv = {}
28
+ result = restock("orange", 10, inv)
29
+ assert result["orange"] == 10
30
+
31
+
32
+ def test_restock_negative():
33
+ with pytest.raises(ValueError):
34
+ restock("apple", -1, {})
35
+
36
+
37
+ def test_low_stock_items():
38
+ inv = {"apple": 3, "banana": 5, "cherry": 10}
39
+ # FAILS — banana (qty=5) should NOT be in low stock when threshold=5
40
+ # but <= threshold incorrectly includes items AT the threshold
41
+ result = get_low_stock_items(inv, threshold=5)
42
+ assert "apple" in result
43
+ assert "banana" not in result
44
+ assert "cherry" not in result
repo_templates/task1/variant_4/meta.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task1_v4",
3
+ "task": "task1",
4
+ "bug_files": ["src/scheduler.py"],
5
+ "bug_description": "is_available uses <= instead of < for adjacent slot check; days_until has off-by-one (+1)",
6
+ "failing_tests": ["test_adjacent_slots_allowed", "test_days_until", "test_days_until_same_day"],
7
+ "correct_lines": {
8
+ "src/scheduler.py": {
9
+ "if start <= slot_end and end >= slot_start:": "if start < slot_end and end > slot_start:",
10
+ "return delta.days + 1": "return delta.days"
11
+ }
12
+ },
13
+ "total_files": 3,
14
+ "optimal_steps": 4
15
+ }
repo_templates/task1/variant_4/src/scheduler.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Meeting and event scheduler module."""
2
+ from datetime import datetime, timedelta
3
+
4
+
5
+ def is_available(start: datetime, end: datetime, booked_slots: list) -> bool:
6
+ """Check if a time slot is available (no overlap with booked slots)."""
7
+ for slot in booked_slots:
8
+ slot_start = slot["start"]
9
+ slot_end = slot["end"]
10
+ # BUG: off-by-one — should be < not <= for end comparison
11
+ # Adjacent meetings (one ends exactly when another starts) should be allowed
12
+ if start <= slot_end and end >= slot_start:
13
+ return False
14
+ return True
15
+
16
+
17
+ def get_next_available(after: datetime, duration_minutes: int, booked_slots: list) -> datetime:
18
+ """Find the next available slot after the given time."""
19
+ candidate = after
20
+ for _ in range(100): # safety limit
21
+ candidate_end = candidate + timedelta(minutes=duration_minutes)
22
+ if is_available(candidate, candidate_end, booked_slots):
23
+ return candidate
24
+ candidate += timedelta(minutes=15) # check in 15-minute increments
25
+ return None
26
+
27
+
28
+ def days_until(target: datetime, now: datetime = None) -> int:
29
+ """Calculate whole days until target date."""
30
+ if now is None:
31
+ now = datetime.now()
32
+ delta = target - now
33
+ # BUG: should return delta.days, not delta.days + 1
34
+ return delta.days + 1
repo_templates/task1/variant_4/src/time_helpers.py ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Time helper functions."""
from datetime import datetime

# Single canonical timestamp layout shared by both directions.
_FMT = "%Y-%m-%d %H:%M"


def format_time(dt: datetime) -> str:
    """Format datetime to string."""
    return dt.strftime(_FMT)


def parse_time(s: str) -> datetime:
    """Parse string to datetime."""
    return datetime.strptime(s, _FMT)
repo_templates/task1/variant_4/tests/test_scheduler.py ADDED
@@ -0,0 +1,52 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from datetime import datetime, timedelta
3
+ from src.scheduler import is_available, get_next_available, days_until
4
+
5
+
6
+ def test_slot_available():
7
+ booked = [
8
+ {"start": datetime(2024, 1, 1, 10, 0), "end": datetime(2024, 1, 1, 11, 0)}
9
+ ]
10
+ assert is_available(
11
+ datetime(2024, 1, 1, 12, 0),
12
+ datetime(2024, 1, 1, 13, 0),
13
+ booked
14
+ ) == True
15
+
16
+
17
+ def test_slot_overlap():
18
+ booked = [
19
+ {"start": datetime(2024, 1, 1, 10, 0), "end": datetime(2024, 1, 1, 11, 0)}
20
+ ]
21
+ assert is_available(
22
+ datetime(2024, 1, 1, 10, 30),
23
+ datetime(2024, 1, 1, 11, 30),
24
+ booked
25
+ ) == False
26
+
27
+
28
+ def test_adjacent_slots_allowed():
29
+ """Meeting starting exactly when another ends should be allowed."""
30
+ booked = [
31
+ {"start": datetime(2024, 1, 1, 10, 0), "end": datetime(2024, 1, 1, 11, 0)}
32
+ ]
33
+ # FAILS — returns False because <= is used instead of <
34
+ assert is_available(
35
+ datetime(2024, 1, 1, 11, 0),
36
+ datetime(2024, 1, 1, 12, 0),
37
+ booked
38
+ ) == True
39
+
40
+
41
+ def test_days_until():
42
+ now = datetime(2024, 1, 1, 0, 0)
43
+ target = datetime(2024, 1, 11, 0, 0)
44
+ # FAILS — returns 11 instead of 10 because of +1 bug
45
+ assert days_until(target, now) == 10
46
+
47
+
48
+ def test_days_until_same_day():
49
+ now = datetime(2024, 6, 15, 8, 0)
50
+ target = datetime(2024, 6, 15, 20, 0)
51
+ # FAILS — returns 1 instead of 0
52
+ assert days_until(target, now) == 0
repo_templates/task1/variant_5/meta.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task1_v5",
3
+ "task": "task1",
4
+ "bug_files": ["src/formatter.py"],
5
+ "bug_description": "truncate doesn't account for ellipsis length; extract_between doesn't offset past start marker",
6
+ "failing_tests": ["test_truncate_long", "test_extract_between"],
7
+ "correct_lines": {
8
+ "src/formatter.py": {
9
+ "return text[:max_length] + \"...\"": "return text[:max_length - 3] + \"...\"",
10
+ "content_start = start_idx": "content_start = start_idx + len(start_marker)"
11
+ }
12
+ },
13
+ "total_files": 3,
14
+ "optimal_steps": 4
15
+ }
repo_templates/task1/variant_5/src/constants.py ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ """Constants for the formatter module."""
2
+
3
+ DEFAULT_MAX_LENGTH = 50
4
+ ELLIPSIS = "..."
repo_templates/task1/variant_5/src/formatter.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Text formatter module for processing and formatting strings."""
2
+
3
+
4
+ def truncate(text: str, max_length: int) -> str:
5
+ """Truncate text to max_length, adding '...' if truncated."""
6
+ if not text:
7
+ return ""
8
+ if len(text) <= max_length:
9
+ return text
10
+ # BUG: should be text[:max_length - 3] + "..." to account for ellipsis length
11
+ return text[:max_length] + "..."
12
+
13
+
14
+ def extract_between(text: str, start_marker: str, end_marker: str) -> str:
15
+ """Extract text between two markers."""
16
+ start_idx = text.find(start_marker)
17
+ if start_idx == -1:
18
+ return ""
19
+ # BUG: should start after the marker, i.e. start_idx + len(start_marker)
20
+ content_start = start_idx # wrong — includes the start_marker itself
21
+ end_idx = text.find(end_marker, content_start)
22
+ if end_idx == -1:
23
+ return ""
24
+ return text[content_start:end_idx]
25
+
26
+
27
+ def capitalize_words(text: str) -> str:
28
+ """Capitalize the first letter of every word."""
29
+ return " ".join(w.capitalize() for w in text.split())
repo_templates/task1/variant_5/tests/test_formatter.py ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.formatter import truncate, extract_between, capitalize_words
3
+
4
+
5
+ def test_truncate_short():
6
+ assert truncate("hello", 10) == "hello"
7
+
8
+
9
+ def test_truncate_long():
10
+ # FAILS — returns "hello worl..." (13 chars) instead of "hello w..." (10 chars)
11
+ result = truncate("hello world", 10)
12
+ assert len(result) <= 10
13
+ assert result == "hello w..."
14
+
15
+
16
+ def test_truncate_empty():
17
+ assert truncate("", 5) == ""
18
+
19
+
20
+ def test_extract_between():
21
+ text = "start[CONTENT]end"
22
 + # FAILS — returns "[CONTENT" instead of "CONTENT" because start_idx not offset
23
+ assert extract_between(text, "[", "]") == "CONTENT"
24
+
25
+
26
+ def test_extract_missing_marker():
27
+ assert extract_between("no markers here", "[", "]") == ""
28
+
29
+
30
+ def test_capitalize_words():
31
+ assert capitalize_words("hello world foo") == "Hello World Foo"
32
+
33
+
34
+ def test_capitalize_single():
35
+ assert capitalize_words("test") == "Test"
repo_templates/task2/variant_1/meta.json ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task2_v1",
3
+ "task": "task2",
4
+ "bug_files": ["src/data_pipeline.py"],
5
+ "interface_files": ["src/validator.py"],
6
+ "bug_description": "data_pipeline passes str(record_id) but validator.py expects int",
7
+ "failing_tests": ["test_process_valid_batch"],
8
+ "fix_file": "src/data_pipeline.py",
9
+ "fix_description": "Remove str() wrapping — pass record['id'] directly",
10
+ "regression_test_must_cover": "TypeError raised when string is passed to validate_record",
11
+ "total_files": 4,
12
+ "optimal_steps": 6
13
+ }
repo_templates/task2/variant_1/src/data_pipeline.py ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from src.validator import validate_record
2
+
3
+
4
+ def process_batch(records: list) -> list:
5
+ """Process a batch of records through the validation pipeline."""
6
+ results = []
7
+ for record in records:
8
+ # BUG: passing record["id"] as string, but validate_record expects int
9
+ validated = validate_record(str(record["id"]), record["data"])
10
+ if validated:
11
+ results.append(validated)
12
+ return results
repo_templates/task2/variant_1/src/models.py ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ """Data models for the pipeline."""
2
+
3
+
4
+ class Record:
5
+ def __init__(self, record_id: int, data: dict):
6
+ self.record_id = record_id
7
+ self.data = data
8
+
9
+ def to_dict(self) -> dict:
10
+ return {"id": self.record_id, "data": self.data}
repo_templates/task2/variant_1/src/validator.py ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ def validate_record(record_id: int, data: dict) -> dict:
2
+ """Validate a record. record_id must be a positive integer."""
3
+ if not isinstance(record_id, int):
4
+ raise TypeError(f"record_id must be int, got {type(record_id)}")
5
+ if record_id <= 0:
6
+ return None
7
+ return {"id": record_id, "data": data, "valid": True}
repo_templates/task2/variant_1/tests/test_pipeline.py ADDED
@@ -0,0 +1,18 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.data_pipeline import process_batch
3
+
4
+
5
+ def test_process_valid_batch():
6
+ records = [{"id": 1, "data": {"name": "test"}}, {"id": 2, "data": {"name": "test2"}}]
7
+ result = process_batch(records)
8
+ assert len(result) == 2 # FAILS — TypeError from wrong type
9
+
10
+
11
+ def test_process_with_invalid_id():
12
+ records = [{"id": -1, "data": {"name": "bad"}}]
13
+ result = process_batch(records)
14
+ assert result == []
15
+
16
+
17
+ def test_empty_batch():
18
+ assert process_batch([]) == []
repo_templates/task2/variant_2/meta.json ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task2_v2",
3
+ "task": "task2",
4
+ "bug_files": ["src/email_sender.py"],
5
+ "interface_files": ["src/template_engine.py"],
6
+ "bug_description": "email_sender passes name= kwarg but template_engine expects username=",
7
+ "failing_tests": ["test_send_welcome_email", "test_welcome_email_structure"],
8
+ "fix_file": "src/email_sender.py",
9
+ "fix_description": "Change name=user_name to username=user_name in send_welcome_email",
10
+ "regression_test_must_cover": "KeyError when wrong kwarg name is used",
11
+ "total_files": 4,
12
+ "optimal_steps": 6
13
+ }
repo_templates/task2/variant_2/src/config.py ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ """Configuration for the email service."""
2
+
3
+ SMTP_HOST = "localhost"
4
+ SMTP_PORT = 587
5
+ FROM_EMAIL = "noreply@example.com"
repo_templates/task2/variant_2/src/email_sender.py ADDED
@@ -0,0 +1,25 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Email sending service that uses the template engine."""
2
+ from src.template_engine import render_template
3
+
4
+
5
+ def send_welcome_email(user_name: str, user_email: str) -> dict:
6
+ """Send a welcome email to a new user."""
7
+ # BUG: passing 'name' but template_engine expects 'username'
8
+ body = render_template("welcome", name=user_name, email=user_email)
9
+ return {
10
+ "to": user_email,
11
+ "subject": "Welcome!",
12
+ "body": body,
13
+ "sent": True,
14
+ }
15
+
16
+
17
+ def send_reset_email(user_email: str, reset_link: str) -> dict:
18
+ """Send a password reset email."""
19
+ body = render_template("reset", email=user_email, link=reset_link)
20
+ return {
21
+ "to": user_email,
22
+ "subject": "Password Reset",
23
+ "body": body,
24
+ "sent": True,
25
+ }
repo_templates/task2/variant_2/src/template_engine.py ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Template rendering engine for email bodies."""
2
+
3
+ TEMPLATES = {
4
+ "welcome": "Hello {username}, welcome to our platform! Your email {email} has been registered.",
5
+ "reset": "Click here to reset your password: {link}. This was requested for {email}.",
6
+ "notify": "Hi {username}, you have a new notification: {message}.",
7
+ }
8
+
9
+
10
+ def render_template(template_name: str, **kwargs) -> str:
11
+ """
12
+ Render an email template with the given keyword arguments.
13
+
14
+ Expected kwargs per template:
15
+ - welcome: username (str), email (str)
16
+ - reset: email (str), link (str)
17
+ - notify: username (str), message (str)
18
+ """
19
+ if template_name not in TEMPLATES:
20
+ raise ValueError(f"Unknown template: {template_name}")
21
+
22
+ template = TEMPLATES[template_name]
23
+ try:
24
+ return template.format(**kwargs)
25
+ except KeyError as e:
26
+ raise KeyError(f"Missing required template variable: {e}")
repo_templates/task2/variant_2/tests/test_email.py ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.email_sender import send_welcome_email, send_reset_email
3
+
4
+
5
+ def test_send_welcome_email():
6
+ # FAILS — KeyError because email_sender passes 'name' but template expects 'username'
7
+ result = send_welcome_email("Alice", "alice@example.com")
8
+ assert result["sent"] == True
9
+ assert "Alice" in result["body"]
10
+ assert "alice@example.com" in result["body"]
11
+
12
+
13
+ def test_send_reset_email():
14
+ result = send_reset_email("bob@example.com", "https://reset.link/abc")
15
+ assert result["sent"] == True
16
+ assert "https://reset.link/abc" in result["body"]
17
+
18
+
19
+ def test_welcome_email_structure():
20
+ # FAILS — same KeyError as test_send_welcome_email
21
+ result = send_welcome_email("Charlie", "charlie@test.com")
22
+ assert result["to"] == "charlie@test.com"
23
+ assert result["subject"] == "Welcome!"
repo_templates/task2/variant_3/meta.json ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task2_v3",
3
+ "task": "task2",
4
+ "bug_files": ["src/order_processor.py"],
5
+ "interface_files": ["src/inventory_checker.py"],
6
+ "bug_description": "order_processor passes list of items but inventory_checker expects dict {sku: qty}",
7
+ "failing_tests": ["test_process_valid_order", "test_order_structure"],
8
+ "fix_file": "src/order_processor.py",
9
+ "fix_description": "Convert items list to dict: {item['sku']: item['qty'] for item in items}",
10
+ "regression_test_must_cover": "TypeError when list is passed to check_availability",
11
+ "total_files": 4,
12
+ "optimal_steps": 6
13
+ }
repo_templates/task2/variant_3/src/inventory_checker.py ADDED
@@ -0,0 +1,33 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Inventory checking service. Verifies stock levels for orders."""
2
+
3
+ # Simulated stock database
4
+ STOCK = {
5
+ "WIDGET-A": 100,
6
+ "WIDGET-B": 50,
7
+ "GADGET-X": 0,
8
+ "GADGET-Y": 25,
9
+ }
10
+
11
+
12
+ def check_availability(requested_items: dict) -> bool:
13
+ """
14
+ Check if all requested items are available in stock.
15
+
16
+ Args:
17
+ requested_items: dict mapping SKU to quantity, e.g. {"WIDGET-A": 5, "GADGET-Y": 2}
18
+
19
+ Returns:
20
+ True if all items are available in sufficient quantity.
21
+ """
22
+ if not isinstance(requested_items, dict):
23
+ raise TypeError(
24
+ f"requested_items must be dict, got {type(requested_items).__name__}. "
25
+ f"Expected format: {{'SKU': quantity}}"
26
+ )
27
+
28
+ for sku, qty in requested_items.items():
29
+ if sku not in STOCK:
30
+ return False
31
+ if STOCK[sku] < qty:
32
+ return False
33
+ return True
repo_templates/task2/variant_3/src/models.py ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ """Shared models for the order system."""
2
+
3
+
4
+ class OrderItem:
5
+ def __init__(self, sku: str, qty: int):
6
+ self.sku = sku
7
+ self.qty = qty
8
+
9
+ def to_dict(self) -> dict:
10
+ return {"sku": self.sku, "qty": self.qty}
repo_templates/task2/variant_3/src/order_processor.py ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Order processing module that checks inventory before fulfillment."""
2
+ from src.inventory_checker import check_availability
3
+
4
+
5
+ def process_order(order: dict) -> dict:
6
+ """
7
+ Process an order by checking inventory availability.
8
+ order format: {"items": [{"sku": "ABC", "qty": 2}, ...], "customer": "..."}
9
+ """
10
+ items = order.get("items", [])
11
+ if not items:
12
+ return {"status": "error", "message": "No items in order"}
13
+
14
+ # BUG: passing items as list, but check_availability expects a dict {sku: qty}
15
+ available = check_availability(items)
16
+
17
+ if available:
18
+ return {"status": "confirmed", "items": items}
19
+ else:
20
+ return {"status": "out_of_stock", "items": items}
repo_templates/task2/variant_3/tests/test_orders.py ADDED
@@ -0,0 +1,27 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.order_processor import process_order
3
+
4
+
5
+ def test_process_valid_order():
6
+ order = {
7
+ "items": [{"sku": "WIDGET-A", "qty": 2}, {"sku": "GADGET-Y", "qty": 1}],
8
+ "customer": "alice@example.com",
9
+ }
10
+ # FAILS — TypeError because list is passed instead of dict
11
+ result = process_order(order)
12
+ assert result["status"] == "confirmed"
13
+
14
+
15
+ def test_empty_order():
16
+ result = process_order({"items": [], "customer": "bob@example.com"})
17
+ assert result["status"] == "error"
18
+
19
+
20
+ def test_order_structure():
21
+ order = {
22
+ "items": [{"sku": "WIDGET-B", "qty": 5}],
23
+ "customer": "charlie@example.com",
24
+ }
25
+ # FAILS — same TypeError
26
+ result = process_order(order)
27
+ assert "items" in result
repo_templates/task2/variant_4/meta.json ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task2_v4",
3
+ "task": "task2",
4
+ "bug_files": ["src/report_builder.py"],
5
+ "interface_files": ["src/date_formatter.py"],
6
+ "bug_description": "report_builder passes ISO string but date_formatter expects datetime object",
7
+ "failing_tests": ["test_build_monthly_report", "test_report_structure"],
8
+ "fix_file": "src/report_builder.py",
9
+ "fix_description": "Parse ISO strings to datetime before passing: datetime.strptime(start_date, '%Y-%m-%d')",
10
+ "regression_test_must_cover": "TypeError when string is passed to format_date_range",
11
+ "total_files": 4,
12
+ "optimal_steps": 6
13
+ }
repo_templates/task2/variant_4/src/date_formatter.py ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Date formatting utilities for reports."""
2
+ from datetime import datetime
3
+
4
+
5
+ def format_date_range(start: datetime, end: datetime) -> str:
6
+ """
7
+ Format a date range for display in reports.
8
+
9
+ Args:
10
+ start: datetime object for range start
11
+ end: datetime object for range end
12
+
13
+ Returns:
14
+ Formatted string like "Jan 01, 2024 — Jan 31, 2024"
15
+ """
16
+ if not isinstance(start, datetime):
17
+ raise TypeError(f"start must be datetime, got {type(start).__name__}")
18
+ if not isinstance(end, datetime):
19
+ raise TypeError(f"end must be datetime, got {type(end).__name__}")
20
+
21
+ return f"{start.strftime('%b %d, %Y')} — {end.strftime('%b %d, %Y')}"
22
+
23
+
24
+ def format_single_date(dt: datetime) -> str:
25
+ """Format a single date."""
26
+ if not isinstance(dt, datetime):
27
+ raise TypeError(f"Expected datetime, got {type(dt).__name__}")
28
+ return dt.strftime("%B %d, %Y")
repo_templates/task2/variant_4/src/models.py ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ """Shared models for the reporting system."""
2
+
3
+ REPORT_TYPES = ["monthly", "quarterly", "annual", "summary"]
repo_templates/task2/variant_4/src/report_builder.py ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Report builder that assembles reports with formatted dates."""
2
+ from src.date_formatter import format_date_range
3
+
4
+
5
+ def build_monthly_report(title: str, start_date: str, end_date: str, data: list) -> dict:
6
+ """
7
+ Build a monthly report with formatted date header.
8
+
9
+ Args:
10
+ title: Report title
11
+ start_date: ISO format string 'YYYY-MM-DD'
12
+ end_date: ISO format string 'YYYY-MM-DD'
13
+ data: List of data points
14
+ """
15
+ # BUG: passing ISO string directly, but format_date_range expects datetime objects
16
+ date_header = format_date_range(start_date, end_date)
17
+
18
+ return {
19
+ "title": title,
20
+ "period": date_header,
21
+ "total_records": len(data),
22
+ "data": data,
23
+ }
24
+
25
+
26
+ def build_summary(title: str, content: str) -> dict:
27
+ """Build a simple summary report."""
28
+ return {"title": title, "content": content, "type": "summary"}
repo_templates/task2/variant_4/tests/test_reports.py ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.report_builder import build_monthly_report, build_summary
3
+
4
+
5
+ def test_build_monthly_report():
6
+ # FAILS — TypeError because ISO string passed instead of datetime
7
+ result = build_monthly_report(
8
+ "Sales Report",
9
+ "2024-01-01",
10
+ "2024-01-31",
11
+ [{"amount": 100}, {"amount": 200}],
12
+ )
13
+ assert result["title"] == "Sales Report"
14
+ assert result["total_records"] == 2
15
+ assert "Jan" in result["period"]
16
+
17
+
18
+ def test_build_summary():
19
+ result = build_summary("Q1 Summary", "Revenue increased 15%")
20
+ assert result["title"] == "Q1 Summary"
21
+ assert result["type"] == "summary"
22
+
23
+
24
+ def test_report_structure():
25
+ # FAILS — same TypeError
26
+ result = build_monthly_report("Inventory", "2024-03-01", "2024-03-31", [])
27
+ assert "period" in result
28
+ assert result["total_records"] == 0
repo_templates/task2/variant_5/meta.json ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "variant_id": "task2_v5",
3
+ "task": "task2",
4
+ "bug_files": ["src/cache_manager.py"],
5
+ "interface_files": ["src/serializer.py"],
6
+ "bug_description": "cache_manager passes bytes (.encode()) but serializer expects str",
7
+ "failing_tests": ["test_cache_set_and_get", "test_cache_delete"],
8
+ "fix_file": "src/cache_manager.py",
9
+ "fix_description": "Remove .encode('utf-8') — pass str(value) directly to serialize_value",
10
+ "regression_test_must_cover": "TypeError when bytes is passed to serialize_value",
11
+ "total_files": 4,
12
+ "optimal_steps": 6
13
+ }
repo_templates/task2/variant_5/src/cache_manager.py ADDED
@@ -0,0 +1,36 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Cache management service that stores serialized data."""
2
+ from src.serializer import serialize_value, deserialize_value
3
+
4
+
5
+ class CacheManager:
6
+ """Simple in-memory cache with serialization."""
7
+
8
+ def __init__(self):
9
+ self._store = {}
10
+
11
+ def set(self, key: str, value) -> None:
12
+ """Store a value in the cache after serializing it."""
13
+ # BUG: passing bytes (encoded) instead of str to serialize_value
14
+ serialized = serialize_value(str(value).encode('utf-8'))
15
+ self._store[key] = serialized
16
+
17
+ def get(self, key: str, default=None):
18
+ """Retrieve and deserialize a value from cache."""
19
+ if key not in self._store:
20
+ return default
21
+ return deserialize_value(self._store[key])
22
+
23
+ def delete(self, key: str) -> bool:
24
+ """Remove a key from cache."""
25
+ if key in self._store:
26
+ del self._store[key]
27
+ return True
28
+ return False
29
+
30
+ def clear(self):
31
+ """Clear all cached values."""
32
+ self._store.clear()
33
+
34
+ def keys(self) -> list:
35
+ """Return all cache keys."""
36
+ return list(self._store.keys())
repo_templates/task2/variant_5/src/config.py ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ """Cache configuration constants."""
2
+
3
+ MAX_CACHE_SIZE = 1000
4
+ DEFAULT_TTL = 300 # seconds
repo_templates/task2/variant_5/src/serializer.py ADDED
@@ -0,0 +1,25 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Serialization utilities for the cache system."""
2
+ import json
3
+
4
+
5
+ def serialize_value(value: str) -> str:
6
+ """
7
+ Serialize a value to a JSON string for storage.
8
+
9
+ Args:
10
+ value: must be a string (str type)
11
+
12
+ Returns:
13
+ JSON-encoded string
14
+ """
15
+ if not isinstance(value, str):
16
+ raise TypeError(f"value must be str, got {type(value).__name__}")
17
+ return json.dumps({"data": value})
18
+
19
+
20
+ def deserialize_value(serialized: str):
21
+ """Deserialize a JSON string back to the original value."""
22
+ if not isinstance(serialized, str):
23
+ raise TypeError(f"serialized must be str, got {type(serialized).__name__}")
24
+ result = json.loads(serialized)
25
+ return result.get("data")
repo_templates/task2/variant_5/tests/test_cache.py ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import pytest
2
+ from src.cache_manager import CacheManager
3
+
4
+
5
+ def test_cache_set_and_get():
6
+ cache = CacheManager()
7
+ # FAILS — TypeError because bytes passed to serializer instead of str
8
+ cache.set("user:1", "Alice")
9
+ assert cache.get("user:1") == "Alice"
10
+
11
+
12
+ def test_cache_get_missing():
13
+ cache = CacheManager()
14
+ assert cache.get("nonexistent", "default") == "default"
15
+
16
+
17
+ def test_cache_delete():
18
+ cache = CacheManager()
19
+ # FAILS — same TypeError on set
20
+ cache.set("temp", "data")
21
+ assert cache.delete("temp") == True
22
+ assert cache.get("temp") is None
23
+
24
+
25
+ def test_cache_clear():
26
+ cache = CacheManager()
27
+ cache._store["a"] = '{"data": "1"}'
28
+ cache._store["b"] = '{"data": "2"}'
29
+ cache.clear()
30
+ assert cache.keys() == []
31
+
32
+
33
+ def test_cache_keys():
34
+ cache = CacheManager()
35
+ cache._store["x"] = '{"data": "1"}'
36
+ cache._store["y"] = '{"data": "2"}'
37
+ assert sorted(cache.keys()) == ["x", "y"]