
tool-calls

stream-event-overview

Tool calls are surfaced as `step` event payloads on the scrape streaming endpoint (`/api/scrape/stream`).

| event-type | purpose | contains-tool-call-data |
| --- | --- | --- |
| `init` | stream/session initialization | no |
| `url_start` | url processing started | no |
| `step` | progress/action update | yes (for `action=tool_call` and `action=agent_decision`) |
| `url_complete` | url processing complete | no |
| `complete` | final response payload | no (aggregated output only) |
| `error` | runtime error surface | optional |
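
The table above can be reduced to a small predicate on the consumer side. This is a minimal sketch, assuming each stream event has already been decoded into a dict with `type` and `data` keys (the shape used by the example step events in this document). The name `carries_tool_call_data` is hypothetical, and `error` events are treated as carrying no tool data here because the table marks them only as optional.

```python
from typing import Any, Mapping

# Actions whose step events carry tool-call data, per the table above.
TOOL_ACTIONS = frozenset({"tool_call", "agent_decision"})


def carries_tool_call_data(event: Mapping[str, Any]) -> bool:
    """Return True when a decoded stream event is a step with tool-call data."""
    if event.get("type") != "step":
        return False
    data = event.get("data") or {}
    return data.get("action") in TOOL_ACTIONS


events = [
    {"type": "init", "data": {}},
    {"type": "step", "data": {"action": "tool_call"}},
    {"type": "step", "data": {"action": "plugins"}},
    {"type": "complete", "data": {}},
]
flags = [carries_tool_call_data(e) for e in events]
```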

scrape-step-schema

Each `step` event payload is based on the `ScrapeStep` model.

| field | type | description |
| --- | --- | --- |
| `step_number` | integer | sequence index in the session |
| `action` | string | logical action type (`tool_call`, `agent_decision`, `plugins`, etc.) |
| `url` | string or null | active url for this step when available |
| `status` | string | runtime state (`running`, `complete`, `completed`, `failed`, etc.) |
| `message` | string | short human-readable step summary |
| `reward` | number | reward delta for this step |
| `extracted_data` | object or null | structured details, including tool payloads |
| `duration_ms` | number or null | optional elapsed time for the step |
| `timestamp` | string | utc iso timestamp |
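
For typed consumers, the field list above can be mirrored in a small client-side container. This is a sketch only: `ScrapeStepView` and `from_payload` are hypothetical names, not the server's `ScrapeStep` model, and the nullable fields are given `None` defaults as an assumption about how absent values arrive.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ScrapeStepView:
    """Client-side mirror of the step fields listed above (hypothetical name)."""

    step_number: int
    action: str
    status: str
    message: str
    reward: float
    timestamp: str
    url: Optional[str] = None
    extracted_data: Optional[dict[str, Any]] = None
    duration_ms: Optional[float] = None

    @classmethod
    def from_payload(cls, data: dict[str, Any]) -> "ScrapeStepView":
        # Keep only known fields so extra keys in the payload do not raise.
        known = {name: data[name] for name in cls.__dataclass_fields__ if name in data}
        return cls(**known)


step = ScrapeStepView.from_payload(
    {
        "step_number": 18,
        "action": "tool_call",
        "status": "completed",
        "message": "Tool html.select: ok",
        "reward": 0.05,
        "extracted_data": {"tool": "html.select", "success": True},
        "timestamp": "2026-04-08T11:49:20.005000+00:00",
    }
)
```

A payload that omits a required field raises `TypeError`, which surfaces schema drift early instead of hiding it.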

tool-call-payload-patterns

pattern-a-registry-helper-calls

Used by _create_tool_call_step(...).

| key-path | value-shape |
| --- | --- |
| `extracted_data.tool_name` | `namespace.action` |
| `extracted_data.tool_description` | short description |
| `extracted_data.parameters` | argument object |
| `extracted_data.result` | optional result object |

pattern-b-runtime-agent-planner-and-executor

Used by dynamic runtime tool calling in the agentic scrape flow.

| action | key-path | value-shape |
| --- | --- | --- |
| `agent_decision` | `extracted_data.tool_calls[]` | `tool`, `params`, `reasoning` |
| `tool_call` | `extracted_data.tool` | selected tool name |
| `tool_call` | `extracted_data.success` | boolean execution state |
| `tool_call` | `extracted_data.result_preview` | compact serialized result |
| `tool_call` | `extracted_data.error` | error message if failed |
| `tool_call` | `extracted_data.duration_ms` | execution duration |
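
Because the two patterns store the tool name under different keys (`tool_name` in pattern A, `tool` in pattern B), consumers may want a single accessor. The helpers below are a hedged sketch with hypothetical names (`tool_name_of`, `planned_tools`); they only read the key paths listed in the two tables above.

```python
from typing import Any, Optional


def tool_name_of(extracted_data: Optional[dict[str, Any]]) -> Optional[str]:
    """Return the tool name from either payload pattern, or None."""
    if not extracted_data:
        return None
    # Pattern A uses tool_name; pattern B tool_call steps use tool.
    return extracted_data.get("tool_name") or extracted_data.get("tool")


def planned_tools(extracted_data: Optional[dict[str, Any]]) -> list[str]:
    """Return tool names from an agent_decision payload's tool_calls[] list."""
    calls = (extracted_data or {}).get("tool_calls") or []
    return [call["tool"] for call in calls if call.get("tool")]


pattern_a = {"tool_name": "html.select", "parameters": {"selector": "article"}}
pattern_b = {"tool": "extract.top_n", "success": True}
decision = {
    "tool_calls": [
        {"tool": "html.select", "params": {"selector": "article"}, "reasoning": "demo"},
        {"tool": "extract.top_n", "params": {"n": 10}, "reasoning": "demo"},
    ]
}
names = [tool_name_of(pattern_a), tool_name_of(pattern_b), tool_name_of(None)]
plan = planned_tools(decision)
```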

runtime-tool-call-lifecycle

```mermaid
sequenceDiagram
    participant Client as scrape-client
    participant Route as scrape-route
    participant Planner as agent-tool-caller
    participant Executor as tool-executor

    Client->>Route: POST /api/scrape/stream
    Route->>Planner: decide_tools(context, model)
    Planner-->>Route: [tool-call-plan]
    Route-->>Client: step(action=agent_decision)
    loop each selected tool
        Route->>Executor: execute_tool_call(tool, context)
        Executor-->>Route: ToolCallResult
        Route-->>Client: step(action=tool_call)
    end
    Route-->>Client: complete(output, extracted_data, metadata)
```
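
The ordering in the diagram can be expressed as a generator: one `agent_decision` step, one `tool_call` step per planned tool, then `complete`. This is an illustrative sketch, not the route's actual code. The planner and executor are passed in as callables, `fake_decide_tools` and `fake_execute_tool_call` are stand-ins, and the executor result is assumed to be a plain dict using the pattern B keys rather than the real `ToolCallResult` type.

```python
from typing import Any, Callable, Iterator

Event = dict[str, Any]


def step(action: str, extracted_data: dict[str, Any]) -> Event:
    """Build a minimal step event; real events carry the full step schema."""
    return {"type": "step", "data": {"action": action, "extracted_data": extracted_data}}


def stream_tool_phase(
    decide_tools: Callable[[dict, str], list[dict]],
    execute_tool_call: Callable[[dict, dict], dict],
    context: dict,
    model: str,
) -> Iterator[Event]:
    """Yield events in the order shown in the sequence diagram."""
    plan = decide_tools(context, model)
    yield step("agent_decision", {"tool_calls": plan})
    for call in plan:
        result = execute_tool_call(call, context)
        yield step("tool_call", result)
    yield {"type": "complete", "data": {}}


# Stand-in planner and executor, for illustration only.
def fake_decide_tools(context: dict, model: str) -> list[dict]:
    return [{"tool": "html.select", "params": {"selector": "article"}, "reasoning": "demo"}]


def fake_execute_tool_call(tool: dict, context: dict) -> dict:
    return {"tool": tool["tool"], "success": True, "result_preview": "{}",
            "error": None, "duration_ms": 0}


order = [
    (event["type"], event["data"].get("action"))
    for event in stream_tool_phase(fake_decide_tools, fake_execute_tool_call, {}, "demo-model")
]
```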

field-order-and-rendering-guidance

Frontend and log consumers should parse structured fields, not message text.

| consumer-surface | recommendation |
| --- | --- |
| timeline ui | group by `action`, then read `extracted_data` keys |
| tool call panel | prefer `tool_name`/`tool` over `message` |
| analytics | aggregate by `tool_name`/`tool` and `success` |
| debugging | use `result_preview` and `error` first, full context second |
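
The analytics recommendation can be sketched as a counter keyed by tool name and `success`, built from structured fields only. The function name `success_counts` is hypothetical, and events are assumed to be decoded dicts with `type` and `data` keys. Steps without a `success` key (for example pattern A payloads) are counted under `None` instead of being treated as failures.

```python
from collections import Counter
from typing import Any, Iterable, Optional


def success_counts(events: Iterable[dict[str, Any]]) -> Counter:
    """Count tool_call steps by (tool name, success); message text is ignored."""
    counts: Counter = Counter()
    for event in events:
        data = event.get("data") or {}
        if event.get("type") != "step" or data.get("action") != "tool_call":
            continue
        extracted = data.get("extracted_data") or {}
        name = extracted.get("tool_name") or extracted.get("tool") or "unknown"
        success: Optional[bool] = extracted.get("success")
        counts[(name, success)] += 1
    return counts


def tool_step(tool: str, success: bool) -> dict[str, Any]:
    """Build a minimal tool_call step event for the demo below."""
    return {"type": "step", "data": {"action": "tool_call",
                                     "extracted_data": {"tool": tool, "success": success}}}


counts = success_counts([
    tool_step("html.select", True),
    tool_step("html.select", True),
    tool_step("extract.top_n", False),
    {"type": "complete", "data": {}},
])
```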

example-step-events

```json
{
  "type": "step",
  "data": {
    "step_number": 17,
    "action": "agent_decision",
    "status": "completed",
    "message": "Agent selected 4 runtime tools",
    "reward": 0.1,
    "extracted_data": {
      "tool_calls": [
        {"tool": "html.select", "params": {"selector": "article", "limit": 20}, "reasoning": "Find repeated blocks"},
        {"tool": "extract.top_n", "params": {"n": 10}, "reasoning": "Apply output size cap"}
      ]
    },
    "timestamp": "2026-04-08T11:49:20.000000+00:00"
  }
}
```

```json
{
  "type": "step",
  "data": {
    "step_number": 18,
    "action": "tool_call",
    "status": "completed",
    "message": "Tool html.select: ok",
    "reward": 0.05,
    "extracted_data": {
      "tool": "html.select",
      "success": true,
      "result_preview": "{'elements_found': 12, 'selector_used': 'article'}",
      "error": null,
      "duration_ms": 3
    },
    "timestamp": "2026-04-08T11:49:20.005000+00:00"
  }
}
```
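
In the `tool_call` example, `result_preview` looks like a Python literal rather than JSON (single-quoted keys). A best-effort decoder, assuming that format, could look like the sketch below. The document only describes the field as a compact serialized result, so the preview may be truncated or use another format; the hypothetical helper `parse_preview` falls back to the raw string in that case.

```python
import ast
from typing import Any, Optional


def parse_preview(preview: Optional[str]) -> Any:
    """Decode a result_preview string, returning the raw string on failure."""
    if preview is None:
        return None
    try:
        # literal_eval only accepts Python literals; it never executes code.
        return ast.literal_eval(preview)
    except (ValueError, SyntaxError, TypeError):
        return preview


parsed = parse_preview("{'elements_found': 12, 'selector_used': 'article'}")
truncated = parse_preview("{'elements_found': 12, 'sel")
```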

troubleshooting-table

| symptom | likely-cause | check |
| --- | --- | --- |
| `agent_decision` absent | planner disabled or failed before plan emit | verify `live_llm_enabled` path and planner warnings |
| selected tools not executed | planner output filtered/empty | inspect selected tool names against registry |
| many failed tool calls | unsupported namespace or bad params | verify executor namespace handlers and args |
| output quality unchanged | tool observations not influencing extraction | verify `AGENT TOOL OBSERVATIONS` injected in extraction prompt |
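
The first three symptoms can be detected from a captured event list. The sketch below is one possible check, with a hypothetical function name `diagnose` and an arbitrary 50% failure threshold chosen for illustration. The fourth symptom depends on the extraction prompt and cannot be seen from stream events alone.

```python
from typing import Any


def diagnose(events: list[dict[str, Any]], fail_ratio: float = 0.5) -> list[str]:
    """Return symptom labels from the table above that match the captured events."""
    planned: list[str] = []
    executed: list[str] = []
    failed = 0
    saw_decision = False
    for event in events:
        if event.get("type") != "step":
            continue
        data = event.get("data") or {}
        extracted = data.get("extracted_data") or {}
        if data.get("action") == "agent_decision":
            saw_decision = True
            calls = extracted.get("tool_calls") or []
            planned += [call["tool"] for call in calls if call.get("tool")]
        elif data.get("action") == "tool_call":
            name = extracted.get("tool_name") or extracted.get("tool")
            if name:
                executed.append(name)
            if extracted.get("success") is False:
                failed += 1
    findings = []
    if not saw_decision:
        findings.append("agent_decision absent")
    not_run = sorted(set(planned) - set(executed))
    if not_run:
        findings.append("selected tools not executed: " + ", ".join(not_run))
    # The threshold is an assumption; tune it to the workload.
    if executed and failed / len(executed) > fail_ratio:
        findings.append("many failed tool calls")
    return findings


captured = [
    {"type": "step", "data": {"action": "agent_decision", "extracted_data": {
        "tool_calls": [{"tool": "html.select"}, {"tool": "extract.top_n"}]}}},
    {"type": "step", "data": {"action": "tool_call", "extracted_data": {
        "tool": "html.select", "success": False, "error": "bad selector"}}},
]
findings = diagnose(captured)
```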

related-api-reference

| item | value |
| --- | --- |
| api-reference | `api-reference.md` |