plugins
plugin-registry-overview
The plugin registry is the canonical catalog of callable capabilities used by the scraper runtime and agent tool planner.
Current registry snapshot:
| metric | value |
| --- | --- |
| plugin-groups | 12 |
| total-tools | 82 |
| source-file | backend/app/plugins/registry.py |
plugin-group-matrix
| plugin-id | category | tool-count | primary-purpose |
| --- | --- | --- | --- |
| browser | browser | 8 | navigation and interaction actions |
| html-parser | parser | 13 | html and dom parsing/extraction |
| data-processing | data | 13 | json/csv/dataframe style transforms |
| regex | extraction | 5 | pattern matching and text extraction |
| network | network | 5 | http/url operations |
| media | media | 4 | media and document extraction |
| analysis | analysis | 7 | schema/relevance/stats/text analysis |
| extraction | extraction | 8 | contact/date/price/entity extraction |
| validation | validation | 7 | url/json/schema/signal validation |
| storage | storage | 5 | memory and cache operations |
| sandbox | ai | 3 | sandboxed code execution |
| ai | ai | 4 | ai completion/embedding/classification |
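
The snapshot totals can be cross-checked against the matrix above. The sketch below copies the tool counts from the matrix into a plain dictionary and recomputes the two headline metrics. The `TOOL_COUNTS` name and the `snapshot` helper are illustrative and are not taken from registry.py.

```python
# Tool counts copied from the plugin-group matrix above.
TOOL_COUNTS = {
    "browser": 8,
    "html-parser": 13,
    "data-processing": 13,
    "regex": 5,
    "network": 5,
    "media": 4,
    "analysis": 7,
    "extraction": 8,
    "validation": 7,
    "storage": 5,
    "sandbox": 3,
    "ai": 4,
}


def snapshot(tool_counts):
    """Recompute the headline metrics of the registry snapshot."""
    return {
        "plugin-groups": len(tool_counts),
        "total-tools": sum(tool_counts.values()),
    }


print(snapshot(TOOL_COUNTS))  # {'plugin-groups': 12, 'total-tools': 82}
```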
runtime-usage-model
flowchart TD
A[scrape request] --> B[resolve enabled plugins]
B --> C[agent tool planner]
C --> D[plugin registry catalog]
D --> E[selected tool calls]
E --> F[tool executor]
F --> G[tool results and context updates]
G --> H[llm extraction code generation]
H --> I[sandbox execution]
I --> J[formatted output and complete event]
request-and-selection-rules
| input-surface | behavior |
| --- | --- |
| enable_plugins | requested plugin ids from the request payload |
| plugin-resolver | filters to installed plugin ids and returns enabled + missing lists |
| selected_agents | controls agent roles/modules, independent from plugin install state |
| runtime planner | chooses tools dynamically from registry metadata, not fixed site templates |
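
The resolver behavior in the table can be sketched as follows. This is a minimal illustration, assuming the resolver receives the requested ids and the installed ids as plain iterables of strings. `PluginResolution`, `resolve_plugins`, and the `pdf-tools` id are hypothetical names chosen for the example, not the project's actual API.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class PluginResolution:
    """Hypothetical result type: ids that can be used and ids that cannot."""
    enabled: List[str] = field(default_factory=list)
    missing: List[str] = field(default_factory=list)


def resolve_plugins(requested: Iterable[str], installed: Iterable[str]) -> PluginResolution:
    """Split requested plugin ids into enabled (installed) and missing lists."""
    installed_ids = set(installed)
    seen = set()
    result = PluginResolution()
    for plugin_id in requested:
        if plugin_id in seen:
            continue  # ignore ids repeated in the payload
        seen.add(plugin_id)
        target = result.enabled if plugin_id in installed_ids else result.missing
        target.append(plugin_id)
    return result


resolution = resolve_plugins(
    requested=["browser", "regex", "pdf-tools", "browser"],  # "pdf-tools" is made up
    installed=["browser", "html-parser", "regex"],
)
print(resolution.enabled)  # ['browser', 'regex']
print(resolution.missing)  # ['pdf-tools']
```

Returning the missing ids alongside the enabled ones lets the caller report unknown plugins without failing the whole request. Whether the real resolver deduplicates repeated ids is an assumption of this sketch.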
plugin-extension-checklist
- add new ToolDefinition entries in backend/app/plugins/registry.py
- ensure tool names use namespace format (namespace.action)
- provide parameter and return schemas in the registry entry
- implement runtime behavior in agent executor if the namespace is executable in-agent
- expose and verify behavior via scrape stream step events
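
The first three checklist items can be sketched as a registry entry with a name check. The real ToolDefinition in backend/app/plugins/registry.py may have different fields. The field names, the JSON-Schema-style dictionaries, and the `NAME_PATTERN` regex below are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict

# Assumed reading of "namespace.action": two lowercase identifiers joined by a dot.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$")


@dataclass(frozen=True)
class ToolDefinition:
    """Hypothetical shape of a registry entry; not the project's actual class."""
    name: str
    description: str
    parameters: Dict[str, Any] = field(default_factory=dict)  # parameter schema
    returns: Dict[str, Any] = field(default_factory=dict)     # return schema

    def __post_init__(self) -> None:
        # Reject names that do not follow the namespace.action format.
        if not NAME_PATTERN.match(self.name):
            raise ValueError(f"tool name {self.name!r} is not in namespace.action format")


# Example entry; the schemas are invented for this sketch.
example = ToolDefinition(
    name="html.extract_meta",
    description="capture title and meta tags",
    parameters={
        "type": "object",
        "properties": {"html": {"type": "string"}},
        "required": ["html"],
    },
    returns={"type": "object"},
)
print(example.name.split(".")[0])  # html
```

Validating the name at construction time means a malformed entry fails when the registry module is imported, before the planner ever sees it.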
plugin-extension-flow
sequenceDiagram
participant Dev as developer
participant Reg as plugin-registry
participant Planner as agent-tool-planner
participant Exec as tool-executor
participant Stream as scrape-stream
Dev->>Reg: add ToolDefinition
Reg-->>Planner: tool metadata available
Planner->>Exec: select and call tool
Exec-->>Stream: tool_call result in step event
Stream-->>Dev: visible runtime behavior
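
The sequence above can be mimicked in a few lines to show where each hand-off happens. Everything here is a stand-in: the in-memory `registry` and `stream`, the `add_tool` and `execute` helpers, the step event keys, and the `select_columns` body are assumptions for illustration and do not reflect the actual agent executor or stream payload.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-ins for the plugin registry and the scrape stream.
registry: Dict[str, Callable[..., Any]] = {}
stream: List[Dict[str, Any]] = []


def add_tool(name: str, handler: Callable[..., Any]) -> None:
    """Dev ->> Reg: make a tool available for selection."""
    registry[name] = handler


def execute(name: str, **arguments: Any) -> Any:
    """Planner ->> Exec: run the tool, then Exec -->> Stream: emit a step event."""
    result = registry[name](**arguments)
    stream.append({
        "type": "step",       # assumed event shape
        "tool_call": name,
        "arguments": arguments,
        "result": result,
    })
    return result


def select_columns(rows, columns):
    """Stand-in for data.select_columns: project rows to the requested columns."""
    return [{column: row.get(column) for column in columns} for row in rows]


add_tool("data.select_columns", select_columns)
execute(
    "data.select_columns",
    rows=[{"title": "A", "price": 3, "sku": "x1"}],
    columns=["title", "price"],
)
print(stream[-1]["result"])  # [{'title': 'A', 'price': 3}]
```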
recently-added-tools
| namespace | tool-name | intent |
| --- | --- | --- |
| html | html.extract_meta | capture title and meta tags |
| html | html.extract_jsonld | parse structured json-ld blocks |
| html | html.detect_repeating_blocks | identify repeated dom structures |
| data | data.dedupe_rows | remove duplicate records |
| data | data.rank_rows | rank rows by selected score field |
| data | data.select_columns | project rows to requested columns |
| analysis | analysis.infer_schema | infer field types/nullability |
| analysis | analysis.score_relevance | score rows against instructions |
| extract | extract.top_n | keep top-n records |
| validate | validate.data_completeness | completeness score by field |
| validate | validate.row_signal | estimate row quality signal |
related-api-reference
| item | value |
| --- | --- |
| api-reference | api-reference.md |