Prem Jampuram committed on
Commit c416bc5 · 1 Parent(s): 6b61dd3

Updated README file by removing the old rewarding scheme

Files changed (1)
  1. README.md +0 -32
README.md CHANGED
@@ -163,37 +163,5 @@ The agent manages a queue of mixed emails and must prioritize, classify, and tak
  }
  ```
 
- # Reward Categorization
-
- The reward is dense — every step produces a signal:
-
- | Component | Value | Trigger |
- |-----------|-------|---------|
- | `category_correct` | +0.15 | Correct email category |
- | `urgency_correct` | +0.05 | Correct urgency level |
- | `keyword_score` | 0–0.25 | Response keyword coverage |
- | `adequate_length` | +0.05 | Response ≥ 50 characters |
- | `vip_priority` | +0.08 | VIP email handled in first 4 steps |
- | `high_priority` | +0.05 | High-urgency email handled early |
- | `correct_action` | +0.06 | Correct respond/escalate/archive decision |
- | `response_present` | +0.02 | Non-empty response for respond action |
- | `step_penalty` | −0.005 | Applied every step (encourages efficiency) |
- | `wrong_action` | −0.03 to −0.05 | Wrong action type for task |
- | `spam_not_archived` | −0.04 | Spam email not archived |
-
-
- # Backend API
-
- We will be using FastAPI as out backend framework, and we are adding end points addressed as per the mentioned requirements.
-
- ## End Points
-
- | Method | Path | Description |
- |--------|------|-------------|
- | `POST` | `/reset?task_id=<id>` | Reset environment for a task, returns initial Observation |
- | `POST` | `/step` | Submit an Action, returns `{observation, reward, done, info}` |
- | `GET` | `/state` | Current environment state |
- | `GET` | `/tasks` | List all tasks with action schema |
- | `GET` | `/grader` | Current grader score (0.0–1.0) |
 
 
 
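
For reviewers who want to see what the removed "Reward Categorization" table described, here is a minimal Python sketch. It assumes the per-step reward is a plain sum of the triggered components, which the removed README text does not actually state. `REWARD_VALUES`, `step_reward`, and its parameters are hypothetical names for illustration, not code from this repository.

```python
# Hypothetical sketch of the removed reward table; not the repository's code.
# Assumption: the per-step reward is the plain sum of the triggered components.

# Fixed-value components copied from the removed table.
REWARD_VALUES = {
    "category_correct": 0.15,
    "urgency_correct": 0.05,
    "adequate_length": 0.05,
    "vip_priority": 0.08,
    "high_priority": 0.05,
    "correct_action": 0.06,
    "response_present": 0.02,
    "step_penalty": -0.005,
    "spam_not_archived": -0.04,
}


def step_reward(triggered, keyword_score=0.0, wrong_action_penalty=0.0):
    """Return the reward for one step under the additive assumption.

    triggered: names of fixed components that fired this step.
    keyword_score: variable component, 0 to 0.25 per the table.
    wrong_action_penalty: 0.0 if not triggered, else -0.05 to -0.03.
    """
    if not 0.0 <= keyword_score <= 0.25:
        raise ValueError("keyword_score must be within [0, 0.25]")
    if wrong_action_penalty != 0.0 and not -0.05 <= wrong_action_penalty <= -0.03:
        raise ValueError("wrong_action_penalty must be 0 or within [-0.05, -0.03]")

    # The step penalty applies on every step, so add it exactly once.
    total = REWARD_VALUES["step_penalty"]
    for name in set(triggered) - {"step_penalty"}:
        total += REWARD_VALUES[name]  # KeyError on unknown component names
    return round(total + keyword_score + wrong_action_penalty, 6)
```

Under these assumptions, a step with a correct category and urgency but no other signal would score `step_reward(["category_correct", "urgency_correct"])`, that is, the two bonuses minus the step penalty.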