astrbot_help / CLAUDE.md
qa1145's picture
Upload 28 files
d347708 verified

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Read Agent is an AI-powered code analysis assistant that uses OpenAI-compatible APIs with the ReAct (Reasoning + Acting) pattern for iterative code exploration. It provides both CLI and Web interfaces.

Common Commands

Running the Application

# CLI interface (interactive terminal)
python main.py

# CLI with specific code directory
python main.py --code-dir /path/to/code

# CLI with multiple API keys (comma-separated)
python main.py --api-key "key1,key2,key3"

# Web server (default port 7860)
python app.py

# Web server with debug mode
DEBUG=true python app.py

Docker

# Using docker-compose
docker-compose up -d

# Build and run manually
docker build -t read-agent .
docker run -p 7860:7860 read-agent

Testing

pytest                    # Run tests
pytest --cov            # Run with coverage

Dependencies

pip install -r requirements.txt

Architecture

Core Pattern: ReAct Loop

The ReadAgent (src/agent.py) implements a ReAct (Reasoning + Acting) pattern:

  1. LLM generates a "thought" about what to do next
  2. LLM specifies an "action" using available tools (read_file, search_code, etc.)
  3. ToolExecutor executes the action and returns observations
  4. Loop continues until the LLM decides it has enough information
  5. Final answer is generated based on accumulated observations

This pattern enables iterative exploration without requiring all context upfront.

Multi API Key Rotation (ApiKeyManager)

src/api_key_manager.py - ApiKeyManager class

  • Manages multiple API keys for load balancing and reliability
  • Round-robin rotation across keys
  • Thread-safe operations with locks
  • Tracks statistics: request count, success rate, errors
  • Global singleton pattern via get_global_manager() or init_manager()

Usage:

from src.api_key_manager import ApiKeyManager

# Single key
manager = ApiKeyManager("sk-xxx")

# Multiple keys (comma-separated)
manager = ApiKeyManager("sk-key1,sk-key2,sk-key3")

# Get next key (round-robin)
key = manager.get_key()

# Record results
manager.record_success(key)
manager.record_error(key, "Error message")

# Get statistics
stats = manager.get_stats()

Integration with ReadAgent:

  • ReadAgent accepts api_key_manager parameter
  • If provided, uses ApiKeyManager to get keys via rotation
  • Records success/failure statistics automatically
  • Falls back to legacy single-key mode if no manager provided

Memory Management

To prevent context expansion across multiple steps, the agent uses a Memory dataclass:

@dataclass
class Memory:
    file_path: str          # File being analyzed
    overview: str           # One-sentence summary
    key_definitions: List[str]  # Key function/class names
    core_logic: str         # Core logic description
    dependencies: List[str] # Dependencies
    needed_info: str        # Information to verify

After reading a file, the agent creates a Memory object instead of keeping full file content. Subsequent tool calls can reference previously analyzed files without re-reading them.

Key Components

src/agent.py - ReadAgent class

  • Main orchestration of ReAct loop
  • Manages Memory objects to optimize context
  • Supports streaming output via ask(stream=True)
  • Batch action support for parallel independent operations
  • Integrates with ApiKeyManager for multi-key rotation

src/searcher.py - CodeSearcher class

  • Provides all file/code interaction tools
  • Integrates with CodeIndex for fast keyword/symbol search
  • Tools: read_file, find_files, search_code, find_by_ext, list_dir, get_file_info, get_dir_tree

src/index.py - CodeIndex class

  • Inverted index for fast code search
  • Lazy building: builds on first search if not exists
  • Supports both keyword search and symbol extraction
  • Tokenization handles camelCase, PascalCase, snake_case

src/repo_manager.py - RepoManager class

  • Downloads GitHub repos as ZIP files
  • Skip detection: won't re-download existing repos unless forced
  • Parallel sync support (threading)
  • Configured via environment variables (REPO_1_URL, REPO_2_URL, etc.)

src/session_storage.py - SessionStorage class

  • SQLite-based persistent storage for sessions
  • Thread-safe with locks
  • Stores: session metadata, conversation history, memories
  • Cleanup of old sessions

prompts.py - Prompt configuration

  • ReAct format instructions
  • Information need tree construction strategy
  • Priority-based search (docs → config → code)
  • Recursive validation protocol

Entry Points

  1. main.py - CLI interface with interactive commands (quit, clear, status, help)
  2. app.py - Flask web application with REST APIs

Session Isolation

Each user session (web) has:

  • Independent ReadAgent instance
  • Separate Memory objects
  • Isolated conversation history
  • SQLite persistence (can be restored)
  • Shared ApiKeyManager instance (for efficient key rotation)

Streaming Support

The agent supports streaming responses (STREAM_OUTPUT=true):

  • Thoughts and actions stream in real-time
  • Final answer detection via special tokens
  • Provides immediate feedback during long-running analysis

Environment Variables

API Configuration

  • OPENAI_API_KEY - Required (can be multiple keys separated by commas)
  • OPENAI_BASE_URL - Default: https://api.openai.com/v1
  • OPENAI_MODEL - Default: gpt-4

Repository Configuration

  • CODE_DIR - Default: ./repos
  • REPO_SYNC_ON_STARTUP - Default: true
  • REPO_1_URL, REPO_2_URL, etc. - GitHub repo URLs
  • REPO_1_NAME, REPO_1_BRANCH, etc. - Per-repo settings

Agent Configuration

  • MAX_STEPS - Maximum reasoning steps (default: 10)
  • TREE_DEPTH - Directory tree preload depth (default: 3)
  • STREAM_OUTPUT - Enable streaming (default: true)
  • WEB_PORT - Web server port (default: 7860)
  • DEBUG - Debug mode (default: false)

API Endpoints (app.py)

Question API

  • POST /api/ask - Main question endpoint (supports streaming via query param or JSON field)

Session Management

  • POST /api/session/new - Create new session
  • POST /api/session/clear - Clear session(s)
  • GET /status - Service status

Repository Management

  • GET /api/repos - List repositories
  • POST /api/repos/sync - Sync repositories
  • GET /api/repos/config - Get repository configuration
  • POST /api/repos/clear - Clear all repositories

API Key Management

  • GET /api/api-keys/stats - Get API key usage statistics
  • POST /api/api-keys/reset-stats - Reset API key statistics

Health Check

  • GET /health - Health check
  • GET /prompt - Return system prompt

Technical Notes

  • Pure Python - Uses only standard library (urllib) and minimal dependencies (Flask, python-dotenv)
  • No async/await - Uses threading for parallel operations
  • SQLite for session persistence (file-based, no external DB required)
  • Symbol extraction for Python and JavaScript in CodeIndex (AST-based)
  • ReAct format - LLM outputs structured JSON with "thought" and "action" fields
  • Thread-safe API Key Management - Uses locks for concurrent access to ApiKeyManager