Spaces:

qa1145
/

astrbot_help

Sleeping

App Files Files Community

astrbot_help / CLAUDE.md

qa1145

Upload 28 files

d347708 verified about 2 months ago

preview code

raw

history blame contribute delete

7.5 kB

	# CLAUDE.md

	This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

	## Project Overview

	Read Agent is an AI-powered code analysis assistant that uses OpenAI-compatible APIs with the ReAct (Reasoning + Acting) pattern for iterative code exploration. It provides both CLI and Web interfaces.

	## Common Commands

	### Running the Application

	```bash
	# CLI interface (interactive terminal)
	python main.py

	# CLI with specific code directory
	python main.py --code-dir /path/to/code

	# CLI with multiple API keys (comma-separated)
	python main.py --api-key "key1,key2,key3"

	# Web server (default port 7860)
	python app.py

	# Web server with debug mode
	DEBUG=true python app.py
	```

	### Docker

	```bash
	# Using docker-compose
	docker-compose up -d

	# Build and run manually
	docker build -t read-agent .
	docker run -p 7860:7860 read-agent
	```

	### Testing

	```bash
	pytest # Run tests
	pytest --cov # Run with coverage
	```

	### Dependencies

	```bash
	pip install -r requirements.txt
	```

	## Architecture

	### Core Pattern: ReAct Loop

	The ReadAgent (src/agent.py) implements a ReAct (Reasoning + Acting) pattern:
	1. LLM generates a "thought" about what to do next
	2. LLM specifies an "action" using available tools (read_file, search_code, etc.)
	3. ToolExecutor executes the action and returns observations
	4. Loop continues until the LLM decides it has enough information
	5. Final answer is generated based on accumulated observations

	This pattern enables iterative exploration without requiring all context upfront.

	### Multi API Key Rotation (ApiKeyManager)

	src/api_key_manager.py - ApiKeyManager class
	- Manages multiple API keys for load balancing and reliability
	- Round-robin rotation across keys
	- Thread-safe operations with locks
	- Tracks statistics: request count, success rate, errors
	- Global singleton pattern via `get_global_manager()` or `init_manager()`

	Usage:
	```python
	from src.api_key_manager import ApiKeyManager

	# Single key
	manager = ApiKeyManager("sk-xxx")

	# Multiple keys (comma-separated)
	manager = ApiKeyManager("sk-key1,sk-key2,sk-key3")

	# Get next key (round-robin)
	key = manager.get_key()

	# Record results
	manager.record_success(key)
	manager.record_error(key, "Error message")

	# Get statistics
	stats = manager.get_stats()
	```

	Integration with ReadAgent:
	- ReadAgent accepts `api_key_manager` parameter
	- If provided, uses ApiKeyManager to get keys via rotation
	- Records success/failure statistics automatically
	- Falls back to legacy single-key mode if no manager provided

	### Memory Management

	To prevent context expansion across multiple steps, the agent uses a Memory dataclass:

	```python
	@dataclass
	class Memory:
	file_path: str # File being analyzed
	overview: str # One-sentence summary
	key_definitions: List[str] # Key function/class names
	core_logic: str # Core logic description
	dependencies: List[str] # Dependencies
	needed_info: str # Information to verify
	```

	After reading a file, the agent creates a Memory object instead of keeping full file content. Subsequent tool calls can reference previously analyzed files without re-reading them.

	### Key Components

	src/agent.py - ReadAgent class
	- Main orchestration of ReAct loop
	- Manages Memory objects to optimize context
	- Supports streaming output via `ask(stream=True)`
	- Batch action support for parallel independent operations
	- Integrates with ApiKeyManager for multi-key rotation

	src/searcher.py - CodeSearcher class
	- Provides all file/code interaction tools
	- Integrates with CodeIndex for fast keyword/symbol search
	- Tools: read_file, find_files, search_code, find_by_ext, list_dir, get_file_info, get_dir_tree

	src/index.py - CodeIndex class
	- Inverted index for fast code search
	- Lazy building: builds on first search if not exists
	- Supports both keyword search and symbol extraction
	- Tokenization handles camelCase, PascalCase, snake_case

	src/repo_manager.py - RepoManager class
	- Downloads GitHub repos as ZIP files
	- Skip detection: won't re-download existing repos unless forced
	- Parallel sync support (threading)
	- Configured via environment variables (REPO_1_URL, REPO_2_URL, etc.)

	src/session_storage.py - SessionStorage class
	- SQLite-based persistent storage for sessions
	- Thread-safe with locks
	- Stores: session metadata, conversation history, memories
	- Cleanup of old sessions

	prompts.py - Prompt configuration
	- ReAct format instructions
	- Information need tree construction strategy
	- Priority-based search (docs → config → code)
	- Recursive validation protocol

	### Entry Points

	1. main.py - CLI interface with interactive commands (quit, clear, status, help)
	2. app.py - Flask web application with REST APIs

	### Session Isolation

	Each user session (web) has:
	- Independent ReadAgent instance
	- Separate Memory objects
	- Isolated conversation history
	- SQLite persistence (can be restored)
	- Shared ApiKeyManager instance (for efficient key rotation)

	### Streaming Support

	The agent supports streaming responses (`STREAM_OUTPUT=true`):
	- Thoughts and actions stream in real-time
	- Final answer detection via special tokens
	- Provides immediate feedback during long-running analysis

	## Environment Variables

	### API Configuration
	- `OPENAI_API_KEY` - Required (can be multiple keys separated by commas)
	- `OPENAI_BASE_URL` - Default: https://api.openai.com/v1
	- `OPENAI_MODEL` - Default: gpt-4

	### Repository Configuration
	- `CODE_DIR` - Default: ./repos
	- `REPO_SYNC_ON_STARTUP` - Default: true
	- `REPO_1_URL`, `REPO_2_URL`, etc. - GitHub repo URLs
	- `REPO_1_NAME`, `REPO_1_BRANCH`, etc. - Per-repo settings

	### Agent Configuration
	- `MAX_STEPS` - Maximum reasoning steps (default: 10)
	- `TREE_DEPTH` - Directory tree preload depth (default: 3)
	- `STREAM_OUTPUT` - Enable streaming (default: true)
	- `WEB_PORT` - Web server port (default: 7860)
	- `DEBUG` - Debug mode (default: false)

	## API Endpoints (app.py)

	### Question API
	- `POST /api/ask` - Main question endpoint (supports streaming via query param or JSON field)

	### Session Management
	- `POST /api/session/new` - Create new session
	- `POST /api/session/clear` - Clear session(s)
	- `GET /status` - Service status

	### Repository Management
	- `GET /api/repos` - List repositories
	- `POST /api/repos/sync` - Sync repositories
	- `GET /api/repos/config` - Get repository configuration
	- `POST /api/repos/clear` - Clear all repositories

	### API Key Management
	- `GET /api/api-keys/stats` - Get API key usage statistics
	- `POST /api/api-keys/reset-stats` - Reset API key statistics

	### Health Check
	- `GET /health` - Health check
	- `GET /prompt` - Return system prompt

	## Technical Notes

	- Pure Python - Uses only standard library (urllib) and minimal dependencies (Flask, python-dotenv)
	- No async/await - Uses threading for parallel operations
	- SQLite for session persistence (file-based, no external DB required)
	- Symbol extraction for Python and JavaScript in CodeIndex (AST-based)
	- ReAct format - LLM outputs structured JSON with "thought" and "action" fields
	- Thread-safe API Key Management - Uses locks for concurrent access to ApiKeyManager