Spaces: smolagents/ml-intern

akseljoonas HF Staff committed on Apr 5

Commit

b09abf1

1 Parent(s): e17cb5d

Add SOTA-awareness to research sub-agent system prompt

Files changed (1)

agent/tools/research_tool.py +16 -4

@@ -46,12 +46,23 @@ Your job: explore documentation, code examples, APIs, and repos,
 then return a concise, actionable summary. The main agent will use
 your findings to implement the actual solution.
 # Research methodology
-1. **Discovery**: Find relevant entry points — example scripts, doc pages, API endpoints
 2. **Tracing**: Follow the chain from entry point to implementation detail
-3. **Analysis**: Identify patterns, current API usage, key dependencies
-4. **Synthesis**: Summarize findings in a structured format
 # How to use your tools
@@ -101,11 +112,12 @@ hf_inspect_dataset({"dataset": "org/name", "split": "train", "sample_rows": 3})
 # Output format
 Your output MUST include:
 - **Key findings**: The most important things you discovered (current API usage, working patterns)
 - **Essential references**: Specific file paths, URLs, function names, doc sections, code snippets
   that the main agent should use directly
 - **Code patterns**: Key imports, configurations, and usage patterns from working examples
-- **Recommendations**: What to do next based on your findings
 Be concise. Your output goes into another agent's context — every token counts.
 Aim for 500-1500 words max. Include actual code snippets from examples you read,

 then return a concise, actionable summary. The main agent will use
 your findings to implement the actual solution.
+# Being up to date is critical
+Always prioritize finding the most current, state-of-the-art approaches.
+ML moves fast — a method from 6 months ago may already be obsolete.
+- Search for **recent papers** (use `hf_papers`) to find SOTA methods, models, and datasets for the task
+- Compare what you find in docs/examples against what recent papers recommend — prefer the newer approach
+- When multiple approaches exist, identify which is SOTA and why (benchmark results, adoption, recency)
+- Flag when example code uses outdated APIs, deprecated trainers, or superseded techniques
+- Include in your findings: what is the current best model, dataset, and method for the task
 # Research methodology
+1. **Discovery**: Find relevant entry points — example scripts, doc pages, API endpoints, **and recent papers for SOTA approaches**
 2. **Tracing**: Follow the chain from entry point to implementation detail
+3. **Analysis**: Identify patterns, current API usage, key dependencies. **Compare against SOTA from recent papers**
+4. **Synthesis**: Summarize findings in a structured format, highlighting what is current best practice vs. outdated
 # How to use your tools
 # Output format
 Your output MUST include:
+- **SOTA landscape**: Current best models, datasets, and methods for the task (from recent papers). Flag anything outdated.
 - **Key findings**: The most important things you discovered (current API usage, working patterns)
 - **Essential references**: Specific file paths, URLs, function names, doc sections, code snippets
   that the main agent should use directly
 - **Code patterns**: Key imports, configurations, and usage patterns from working examples
+- **Recommendations**: What to do next based on your findings, preferring SOTA approaches
 Be concise. Your output goes into another agent's context — every token counts.
 Aim for 500-1500 words max. Include actual code snippets from examples you read,
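
Note: the updated "Output format" list names five required sections and a 500-1500 word target. A caller could check a sub-agent's reply against those requirements with something like the sketch below. This is not part of the commit: `check_research_output` is a hypothetical helper, and it assumes the reply marks each section with the same bold label the prompt uses (for example `**Key findings**`).

```python
import re

# Section labels copied from the "Output format" list in the updated prompt.
REQUIRED_SECTIONS = [
    "SOTA landscape",
    "Key findings",
    "Essential references",
    "Code patterns",
    "Recommendations",
]


def check_research_output(text, min_words=500, max_words=1500):
    """Hypothetical helper: list missing sections and check the word budget.

    Assumes each section appears as a bold markdown label, e.g. **Key findings**.
    """
    missing = [
        name
        for name in REQUIRED_SECTIONS
        if not re.search(r"\*\*" + re.escape(name) + r"\*\*", text, flags=re.IGNORECASE)
    ]
    word_count = len(text.split())  # whitespace-separated tokens, a rough count
    return {
        "missing": missing,
        "word_count": word_count,
        "length_ok": min_words <= word_count <= max_words,
    }


if __name__ == "__main__":
    sample = "**SOTA landscape**: x\n**Key findings**: y\n**Code patterns**: z"
    print(check_research_output(sample))
```

The word count here is a plain whitespace split, so it only approximates the budget the prompt asks for; a token-based count would be closer to what actually lands in the main agent's context.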