Dwootton committed · Commit 00c71c4 · verified · 1 Parent(s): 6284167

Add comprehensive README

Files changed (1)
  1. README.md +16 -0
README.md ADDED
@@ -0,0 +1,16 @@
+ ---
+ license: mit
+ tags:
+ - tool-use
+ - evaluation
+ - play2prompt
+ - stabletoolbench
+ ---
+
+ # Play2Prompt (P2P) StableToolBench Evaluation Pipeline
+
+ Replicates the experimental conditions of the [Play2Prompt](https://aclanthology.org/2025.findings-acl.1347/) paper on [StableToolBench](https://arxiv.org/abs/2403.07714) using Llama-3.1-8B-Instruct.
+
+ **Designed for extensibility**: The four conditions are controlled by two pluggable components: tool descriptions and in-context examples. To test your own description types, drop replacement files into `p2p_data/descriptions/` and `p2p_data/examples/`.
+
+ See the `pipeline/` directory for all source code.
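
Note on the extensibility claim: the README does not spell out how the two components combine into four conditions. A minimal sketch of one plausible reading follows, assuming the conditions are the 2x2 grid of description variant and example variant. Only the two directory names come from the README; the variant labels (`baseline`, `custom`), the `.json` suffix, and both helper functions are hypothetical and are not taken from the `pipeline/` code.

```python
from itertools import product
from pathlib import Path

# Directory names are taken from the README; everything else is hypothetical.
DATA_ROOT = Path("p2p_data")
COMPONENT_DIRS = {
    "descriptions": DATA_ROOT / "descriptions",
    "examples": DATA_ROOT / "examples",
}

# Hypothetical variant labels: the stock files versus your replacements.
VARIANTS = ("baseline", "custom")


def enumerate_conditions(variants=VARIANTS):
    """Return every (description variant, example variant) combination."""
    return [
        {"descriptions": d, "examples": e}
        for d, e in product(variants, repeat=2)
    ]


def resolve_files(condition, suffix=".json"):
    """Map a condition to the files it would read (assumed naming scheme)."""
    return {
        component: COMPONENT_DIRS[component] / f"{variant}{suffix}"
        for component, variant in condition.items()
    }


if __name__ == "__main__":
    for condition in enumerate_conditions():
        paths = resolve_files(condition)
        print(condition, {name: path.as_posix() for name, path in paths.items()})
```

Under this reading, adding a new description type only means adding a file to `p2p_data/descriptions/` and a label to `VARIANTS`. The actual file layout expected by the pipeline should be checked against the source in `pipeline/`.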