Spaces:

qa1145
/

astrbot_help

Sleeping

App Files Files Community

astrbot_help / CLAUDE.md

qa1145

Upload 28 files

d347708 verified about 2 months ago

preview code

raw

history blame contribute delete

7.5 kB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Read Agent is an AI-powered code analysis assistant that uses OpenAI-compatible APIs with the ReAct (Reasoning + Acting) pattern for iterative code exploration. It provides both CLI and Web interfaces.

Common Commands

Running the Application

# CLI interface (interactive terminal)
python main.py

# CLI with specific code directory
python main.py --code-dir /path/to/code

# CLI with multiple API keys (comma-separated)
python main.py --api-key "key1,key2,key3"

# Web server (default port 7860)
python app.py

# Web server with debug mode
DEBUG=true python app.py

Docker

# Using docker-compose
docker-compose up -d

# Build and run manually
docker build -t read-agent .
docker run -p 7860:7860 read-agent

Testing

pytest                    # Run tests
pytest --cov            # Run with coverage

Dependencies

pip install -r requirements.txt

Architecture

Core Pattern: ReAct Loop

The ReadAgent (src/agent.py) implements a ReAct (Reasoning + Acting) pattern:

LLM generates a "thought" about what to do next
LLM specifies an "action" using available tools (read_file, search_code, etc.)
ToolExecutor executes the action and returns observations
Loop continues until the LLM decides it has enough information
Final answer is generated based on accumulated observations

This pattern enables iterative exploration without requiring all context upfront.

Multi API Key Rotation (ApiKeyManager)

src/api_key_manager.py - ApiKeyManager class

Manages multiple API keys for load balancing and reliability
Round-robin rotation across keys
Thread-safe operations with locks
Tracks statistics: request count, success rate, errors
Global singleton pattern via get_global_manager() or init_manager()

Usage:

from src.api_key_manager import ApiKeyManager

# Single key
manager = ApiKeyManager("sk-xxx")

# Multiple keys (comma-separated)
manager = ApiKeyManager("sk-key1,sk-key2,sk-key3")

# Get next key (round-robin)
key = manager.get_key()

# Record results
manager.record_success(key)
manager.record_error(key, "Error message")

# Get statistics
stats = manager.get_stats()

Integration with ReadAgent:

ReadAgent accepts api_key_manager parameter
If provided, uses ApiKeyManager to get keys via rotation
Records success/failure statistics automatically
Falls back to legacy single-key mode if no manager provided

Memory Management

To prevent context expansion across multiple steps, the agent uses a Memory dataclass:

@dataclass
class Memory:
    file_path: str          # File being analyzed
    overview: str           # One-sentence summary
    key_definitions: List[str]  # Key function/class names
    core_logic: str         # Core logic description
    dependencies: List[str] # Dependencies
    needed_info: str        # Information to verify

After reading a file, the agent creates a Memory object instead of keeping full file content. Subsequent tool calls can reference previously analyzed files without re-reading them.

Key Components

src/agent.py - ReadAgent class

Main orchestration of ReAct loop
Manages Memory objects to optimize context
Supports streaming output via ask(stream=True)
Batch action support for parallel independent operations
Integrates with ApiKeyManager for multi-key rotation

src/searcher.py - CodeSearcher class

Provides all file/code interaction tools
Integrates with CodeIndex for fast keyword/symbol search
Tools: read_file, find_files, search_code, find_by_ext, list_dir, get_file_info, get_dir_tree

src/index.py - CodeIndex class

Inverted index for fast code search
Lazy building: builds on first search if not exists
Supports both keyword search and symbol extraction
Tokenization handles camelCase, PascalCase, snake_case

src/repo_manager.py - RepoManager class

Downloads GitHub repos as ZIP files
Skip detection: won't re-download existing repos unless forced
Parallel sync support (threading)
Configured via environment variables (REPO_1_URL, REPO_2_URL, etc.)

src/session_storage.py - SessionStorage class

SQLite-based persistent storage for sessions
Thread-safe with locks
Stores: session metadata, conversation history, memories
Cleanup of old sessions

prompts.py - Prompt configuration

ReAct format instructions
Information need tree construction strategy
Priority-based search (docs → config → code)
Recursive validation protocol

Entry Points

main.py - CLI interface with interactive commands (quit, clear, status, help)
app.py - Flask web application with REST APIs

Session Isolation

Each user session (web) has:

Independent ReadAgent instance
Separate Memory objects
Isolated conversation history
SQLite persistence (can be restored)
Shared ApiKeyManager instance (for efficient key rotation)

Streaming Support

The agent supports streaming responses (STREAM_OUTPUT=true):

Thoughts and actions stream in real-time
Final answer detection via special tokens
Provides immediate feedback during long-running analysis

Environment Variables

API Configuration

OPENAI_API_KEY - Required (can be multiple keys separated by commas)
OPENAI_BASE_URL - Default: https://api.openai.com/v1
OPENAI_MODEL - Default: gpt-4

Repository Configuration

CODE_DIR - Default: ./repos
REPO_SYNC_ON_STARTUP - Default: true
REPO_1_URL, REPO_2_URL, etc. - GitHub repo URLs
REPO_1_NAME, REPO_1_BRANCH, etc. - Per-repo settings

Agent Configuration

MAX_STEPS - Maximum reasoning steps (default: 10)
TREE_DEPTH - Directory tree preload depth (default: 3)
STREAM_OUTPUT - Enable streaming (default: true)
WEB_PORT - Web server port (default: 7860)
DEBUG - Debug mode (default: false)

API Endpoints (app.py)

Question API

POST /api/ask - Main question endpoint (supports streaming via query param or JSON field)

Session Management

POST /api/session/new - Create new session
POST /api/session/clear - Clear session(s)
GET /status - Service status

Repository Management

GET /api/repos - List repositories
POST /api/repos/sync - Sync repositories
GET /api/repos/config - Get repository configuration
POST /api/repos/clear - Clear all repositories

API Key Management

GET /api/api-keys/stats - Get API key usage statistics
POST /api/api-keys/reset-stats - Reset API key statistics

Health Check

GET /health - Health check
GET /prompt - Return system prompt

Technical Notes

Pure Python - Uses only standard library (urllib) and minimal dependencies (Flask, python-dotenv)
No async/await - Uses threading for parallel operations
SQLite for session persistence (file-based, no external DB required)
Symbol extraction for Python and JavaScript in CodeIndex (AST-based)
ReAct format - LLM outputs structured JSON with "thought" and "action" fields
Thread-safe API Key Management - Uses locks for concurrent access to ApiKeyManager