overview

purpose

This document is the top-level guide for the ScrapeRL documentation set. It explains what the platform does, how the main runtime surfaces connect, and where to find detailed references.

platform-summary

| dimension | summary |
| --- | --- |
| core-goal | AI-first scraping workflows with RL-style episodes and dynamic agent planning |
| backend | FastAPI control plane with episode, scrape, agent, plugin, memory, and provider APIs |
| frontend | React dashboard for task submission, stream monitoring, and result inspection |
| runtime-pattern | session-based execution with real-time `step`/`tool_call` stream events |
| output-targets | `json`, `csv`, `markdown`, and `text` |
| integrations | OpenAI, Anthropic, Google, Groq, NVIDIA, plugin tools, memory layers |
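
The four output targets in the summary can be illustrated with a small renderer. This is a minimal sketch, not the platform's formatter: the function name `render` and the assumption that results arrive as a list of flat dictionaries are hypothetical.

```python
import csv
import io
import json


def render(records, target):
    """Render a list of flat dicts into one of the four named output targets.

    Hypothetical helper for illustration; the real formatter may differ.
    """
    if target not in ("json", "csv", "markdown", "text"):
        raise ValueError(f"unsupported output target: {target}")
    if target == "json":
        return json.dumps(records, indent=2)
    if not records:
        return ""
    # Column order is taken from the first record.
    columns = list(records[0].keys())
    if target == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buf.getvalue()
    if target == "markdown":
        header = "| " + " | ".join(columns) + " |"
        divider = "| " + " | ".join("---" for _ in columns) + " |"
        rows = [
            "| " + " | ".join(str(r[c]) for c in columns) + " |" for r in records
        ]
        return "\n".join([header, divider, *rows])
    # target == "text": one "key: value" line per record.
    return "\n".join(
        ", ".join(f"{c}: {r[c]}" for c in columns) for r in records
    )
```

Calling `render(records, "csv")` on the same records that produced a `json` result yields an equivalent table, which makes it easy to compare targets side by side.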

primary-runtime-flows

flowchart TD
    A[user-request] --> B[api-scrape-stream]
    B --> C[agent-decision]
    C --> D[tool-plan-and-execution]
    D --> E[llm-extraction-and-formatting]
    E --> F[complete-event]
    B --> G[session-status-and-artifacts]
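
The flow above can be sketched from the client side as a loop that folds stream events into a summary. This is a minimal sketch under stated assumptions: events are taken to be newline-delimited JSON objects, and the field names `type`, `tool`, and `data` are hypothetical. The actual event contract is defined in `api-reference.md`.

```python
import json


def consume_stream(lines):
    """Fold an iterable of JSON-encoded event lines into a session summary.

    Assumes (hypothetically) that each event carries a "type" field with
    one of "step", "tool_call", or "complete".
    """
    summary = {"steps": 0, "tool_calls": [], "result": None, "done": False}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip keep-alive or blank lines
        event = json.loads(line)
        kind = event.get("type")
        if kind == "step":
            summary["steps"] += 1
        elif kind == "tool_call":
            summary["tool_calls"].append(event.get("tool"))
        elif kind == "complete":
            summary["result"] = event.get("data")
            summary["done"] = True
            break  # nothing is expected after the complete event
    return summary
```

Because the function accepts any iterable of strings, the same loop works against a recorded event log, which is convenient for testing a client without a running backend.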

documentation-navigation

| doc | focus-area |
| --- | --- |
| `readme.md` | documentation index |
| `api-reference.md` | complete endpoint catalog and stream/event contract |
| `architecture.md` | system topology, subsystem planes, reliability model |
| `openenv.md` | environment/action/observation/reward contract |
| `features.md` | advanced runtime features and toggles |
| `memory.md` | memory layers, storage, and operations |
| `plugins.md` | plugin registry and runtime tool-selection model |
| `tool-calls.md` | tool call payload schema and lifecycle |
| `api.md` | multi-model routing and provider behavior |
| `settings.md` | runtime setting controls and policy knobs |
| `observability.md` | telemetry/tracing/cost visibility |
| `rewards.md` | reward design and scoring structure |
| `search-engine.md` | search provider and retrieval routing details |
| `mcp.md` | MCP integration architecture |
| `agents.md` | agent roles and coordination model |

key-api-surfaces

| surface | endpoints |
| --- | --- |
| system-health | `/api/health`, `/api/ready`, `/api/ping` |
| episode-runtime | `/api/episode/reset`, `/api/episode/step`, `/api/episode/state/{episode_id}` |
| scrape-runtime | `/api/scrape/stream`, `/api/scrape/{session_id}/status`, `/api/scrape/{session_id}/result` |
| agent-tool-memory | `/api/agents/`, `/api/tools/`, `/api/plugins/`, `/api/memory/` |
| realtime-channel | `/ws/episode/{episode_id}` |

See `api-reference.md` for the full method and path listings.
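
Several of the paths above contain `{session_id}` or `{episode_id}` placeholders. The sketch below shows one way a client might fill them in safely. The base URL `http://localhost:8000`, the route names, and the helper `build_url` are assumptions for illustration only; the path templates themselves come from the table.

```python
from urllib.parse import quote

# Hypothetical local base URL; adjust to match the actual deployment.
BASE_URL = "http://localhost:8000"

# Path templates copied from the key-api-surfaces table.
ROUTES = {
    "health": "/api/health",
    "episode_state": "/api/episode/state/{episode_id}",
    "scrape_status": "/api/scrape/{session_id}/status",
    "scrape_result": "/api/scrape/{session_id}/result",
}


def build_url(name, base_url=BASE_URL, **params):
    """Return the full URL for a named route with path parameters escaped.

    Raises KeyError if the route is unknown or a placeholder is missing.
    """
    template = ROUTES[name]
    # Percent-encode each value so it cannot break out of its path segment.
    safe = {key: quote(str(value), safe="") for key, value in params.items()}
    return base_url.rstrip("/") + template.format(**safe)
```

For example, `build_url("scrape_status", session_id="abc123")` produces the status URL for that session, and a missing `session_id` fails loudly with a `KeyError` instead of sending a malformed request.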

configuration-surfaces

| file | intent |
| --- | --- |
| `.env.example` | complete variable template for app + inference runtime |
| `.env` | local runtime values |
| `docker-compose.yml` | backend/frontend orchestration and env wiring |
| `inference.py` | OpenEnv-compliant inference entrypoint and stdout contract |

recommended-reading-order

1. `overview.md`
2. `api-reference.md`
3. `architecture.md`
4. `openenv.md`
5. `tool-calls.md`
6. `plugins.md`
7. domain docs (`memory.md`, `api.md`, `features.md`, `settings.md`)

document-metadata

| key | value |
| --- | --- |
| document | `overview.md` |
| status | active |
| owner | platform-docs |