Commit · 340ce5a
1 Parent(s): 93fd30a
Made some documentation updates
- README.md +2 -28
- openenv.yaml +4 -4
README.md
CHANGED
@@ -179,31 +179,6 @@ The agent manages a queue of 15 mixed emails. It must choose which email to hand
 | `GET` | `/tasks` | List all tasks with action schema |
 | `GET` | `/grader` | Current grader score (0.0–1.0) |

-## Setup
-
-**Prerequisites:** Python 3.11+, [uv](https://github.com/astral-sh/uv)
-
-**Install dependencies**
-- `uv sync`
-
-**Environment variables**
-
-- `API_BASE_URL` – LLM API endpoint (default: `https://router.huggingface.co/v1`)
-- `MODEL_NAME` – Model identifier (default: `Qwen/Qwen2.5-7B-Instruct`)
-- `OPENAI_API_KEY` – API key for the LLM provider
-- `HF_TOKEN` – Hugging Face token
-- `ENV_BASE_URL` – Running environment URL (default: `http://localhost:7860`)
-
-**Run the server**
-- `uvicorn server.app:app --host 0.0.0.0 --port 7860`
-
-**Run baseline inference**
-- `python inference.py`
-
-**Run with Docker**
-- `docker build -t sieve .`
-- `docker run -p 7860:7860 -e OPENAI_API_KEY=... sieve`
-
 ## Baseline Scores

 Baseline agent: `gpt-4o-mini` via OpenAI API
@@ -211,6 +186,5 @@ Baseline agent: `gpt-4o-mini` via OpenAI API
 | Task | Score | Steps | Total Reward |
 |------|-------|-------|--------------|
 | Email Classification | 0.930 | 10 | 1.755 |
-| Response Drafting | 0.
-| Support Session | 0.
-| **Average** | **0.919** | – | – |
+| Response Drafting | 0.920 | 6 | 1.650 |
+| Support Session | 0.882 | 15 | 1.506 |
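Since this commit drops the Setup section from the README, the environment variables and their defaults are no longer documented there. The following is a minimal sketch of how a client might read those settings and query the `/tasks` and `/grader` endpoints. It assumes the endpoints return JSON, which the diff does not confirm, and the helper names `load_settings`, `endpoint_url`, `get_json`, and `main` are hypothetical, not taken from the repository.

```python
import json
import os
import urllib.request

# Defaults copied from the Setup section removed in this commit.
DEFAULTS = {
    "API_BASE_URL": "https://router.huggingface.co/v1",
    "MODEL_NAME": "Qwen/Qwen2.5-7B-Instruct",
    "ENV_BASE_URL": "http://localhost:7860",
}


def load_settings(environ=None):
    """Return settings, falling back to the documented defaults."""
    environ = os.environ if environ is None else environ
    settings = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    # The secrets have no documented default; they stay None when unset.
    for secret in ("OPENAI_API_KEY", "HF_TOKEN"):
        settings[secret] = environ.get(secret)
    return settings


def endpoint_url(base_url, path):
    """Join the environment base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(base_url, path, timeout=10.0):
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def main():
    """Print the task list and the current grader score from a running server."""
    settings = load_settings()
    print(get_json(settings["ENV_BASE_URL"], "/tasks"))
    print(get_json(settings["ENV_BASE_URL"], "/grader"))
```

Calling `main()` only makes sense once the server is running, for example after the `uvicorn server.app:app --host 0.0.0.0 --port 7860` command shown in the removed section.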
openenv.yaml
CHANGED
@@ -78,7 +78,7 @@ action_space:
 baseline:
   agent: gpt-4o-mini
   scores:
-    email_classification: 0.
-    response_drafting: 0.
-    support_session: 0.
-    average: 0.
+    email_classification: 0.930
+    response_drafting: 0.920
+    support_session: 0.882
+    average: 0.911
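The updated `average` can be checked against the three per-task scores. This sketch assumes the average is an unweighted arithmetic mean rounded to three decimals, which the commit does not state; `mean_score` is a hypothetical helper.

```python
# Per-task baseline scores as they appear in the updated openenv.yaml.
scores = {
    "email_classification": 0.930,
    "response_drafting": 0.920,
    "support_session": 0.882,
}


def mean_score(task_scores, ndigits=3):
    """Unweighted mean of the task scores, rounded to ndigits decimals."""
    return round(sum(task_scores.values()) / len(task_scores), ndigits)


average = mean_score(scores)
print(average)  # 0.911
```

Under that assumption the result matches the new `average: 0.911`, whereas the `0.919` in the removed README row does not follow from the three scores now listed.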