Update README.md
README.md
CHANGED
@@ -33,3 +33,16 @@ It helps builders generate:
 - next-step follow-up tests

 The Space is intentionally lightweight and portfolio-friendly: fast to inspect, easy to extend, and aligned with public artifacts on Kaggle, Codeberg, and other AI platforms.
+
+## Associated Papers
+
+- Primary paper: [Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents](https://doi.org/10.5281/zenodo.20034550)
+- Paper landing page: [lightweight-agent-eval-paper](https://mukundakatta.github.io/lightweight-agent-eval-paper/)
+- Artifact repo: [MukundaKatta/lightweight-agent-eval-paper](https://github.com/MukundaKatta/lightweight-agent-eval-paper)
+- Companion evaluation harness paper: [AI Eval Forge: Mixed-Check Regression Testing for LLM and Agent Workflows](https://doi.org/10.5281/zenodo.20044318)
+
+## Related Public Artifacts
+
+- Hugging Face dataset: [mukunda1729/agent-eval-scenarios](https://huggingface.co/datasets/mukunda1729/agent-eval-scenarios)
+- Hugging Face dataset: [mukunda1729/premium-agent-repo-landscape](https://huggingface.co/datasets/mukunda1729/premium-agent-repo-landscape)
+- Hugging Face collection: [Agent Labs Portfolio](https://huggingface.co/collections/mukunda1729/agent-labs-portfolio)