mnawfal29 committed on
Commit 532cd03 · verified · 1 Parent(s): d6243f2

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +64 -4
  2. inference.py +19 -14
README.md CHANGED
@@ -132,10 +132,70 @@ Each step returns:
132
 
133
  ## Scoring
134
 
135
- Final score (0-1) is weighted:
136
- - **Fraud detection accuracy** (50%): Correct flags with right fraud type
137
- - **Detection timeliness** (30%): How early fraud was caught
138
- - **Investigation efficiency** (20%): Budget usage and false positive avoidance
 
139
 
140
  ## Deployment
141
 
 
132
 
133
  ## Scoring
134
 
135
+ ### Step Reward
136
+
137
+ Every action returns an immediate reward in **[0, 1]**, centered at 0.5 (neutral).
138
+
139
+ | Action | Condition | Reward |
140
+ |--------|-----------|--------|
141
+ | `monitor` | No active fraud | 0.50 |
142
+ | `monitor` | Active unflagged fraud | 0.40 → 0.20 (penalty grows day over day) |
143
+ | `investigate_publisher` | Publisher is fraudulent | 0.55 → 0.65 (bonus for investigating early) |
144
+ | `investigate_publisher` | Publisher is clean | 0.35 (wastes budget) |
145
+ | `flag_fraud` | Correct publisher + correct fraud type | 0.95 → 1.00 (bonus for early flag) |
146
+ | `flag_fraud` | Correct publisher, wrong fraud type | 0.70 |
147
+ | `flag_fraud` | False positive | 0.05 |
148
+ | `submit_report` | Any | 0.50 |
149
+ | Invalid / malformed action | - | 0.05 |
150
+
151
+ While fraud is active and unflagged, `monitor` returns `0.50 - (0.10 + 0.20 × day/14)`, floored at 0.05. The penalty term is about 0.11 on day 1 (reward about 0.39) and reaches 0.30 on day 14 (reward 0.20), reflecting increasing urgency as fraud compounds.
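
As a reading aid, the monitor reward above can be sketched in a few lines of Python. This is a minimal sketch, not the environment's actual code: the function name `monitor_reward` and the `episode_days` parameter are hypothetical, and only the constants come from the README text.

```python
def monitor_reward(day: int, episode_days: int = 14) -> float:
    """Reward for `monitor` while unflagged fraud is active (illustrative only)."""
    # Penalty grows linearly with the day, as in the README formula.
    penalty = 0.10 + 0.20 * day / episode_days
    # The README states the reward is floored at 0.05.
    return max(0.05, 0.50 - penalty)


if __name__ == "__main__":
    for day in (1, 7, 14):
        print(day, round(monitor_reward(day), 3))
```

Within a 14-day episode the floor is never hit, since the lowest value the formula produces is 0.20 on day 14.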
152
+
153
+ ### Final Score
154
+
155
+ Computed at episode end, combining three weighted components into a score in **[0, 1]**:
156
+
157
+ ```
158
+ final_score = 0.50 × accuracy + 0.30 × timeliness + 0.20 × efficiency
159
+ ```
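
For illustration, the weighted sum can be written as a small Python helper. The function name `final_score` and the sample component values are assumptions made for this sketch; only the weights come from the README.

```python
def final_score(accuracy: float, timeliness: float, efficiency: float) -> float:
    """Weighted combination of the three components (each assumed to be in [0, 1])."""
    return 0.50 * accuracy + 0.30 * timeliness + 0.20 * efficiency


if __name__ == "__main__":
    # Hypothetical episode: perfect accuracy, mid timeliness, low efficiency.
    print(round(final_score(1.0, 0.5, 0.25), 2))
```

Because the weights sum to 1.0 and each component is clamped to [0, 1], the result stays in [0, 1] without further clamping.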
160
+
161
+ #### 1. Fraud Detection Accuracy (50%)
162
+
163
+ Measures whether fraudulent publishers were correctly identified with the right fraud type.
164
+
165
+ - **+1.0 / N** per fraudster flagged with the correct fraud type
166
+ - **+0.5 / N** per fraudster flagged with the wrong fraud type
167
+ - **−0.5 / N** per false positive (clean publisher flagged as fraudulent)
168
+
169
+ Clamped to [0, 1].
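
The accuracy rules above could be sketched as follows. The README does not define `N`; this sketch assumes it is the number of fraudulent publishers in the episode. The function name, its parameters, and the zero-fraudster fallback are all hypothetical.

```python
def accuracy_score(n_fraudsters: int, correct_type: int,
                   wrong_type: int, false_positives: int) -> float:
    """Accuracy component, assuming N is the number of fraudsters in the episode."""
    if n_fraudsters == 0:
        # Assumption: the README does not cover an episode with no fraudsters.
        return 0.0
    raw = (1.0 * correct_type + 0.5 * wrong_type
           - 0.5 * false_positives) / n_fraudsters
    # Clamp to [0, 1] as stated in the README.
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    # 4 fraudsters: 2 flagged correctly, 1 with the wrong type, 1 false positive.
    print(accuracy_score(4, 2, 1, 1))
```

Under these rules a single false positive cancels the partial credit from one wrong-type flag.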
170
+
171
+ #### 2. Detection Timeliness (30%)
172
+
173
+ Measures how quickly each fraudster was caught after fraud began.
174
+
175
+ ```
176
+ timeliness = 1.0 − (day_flagged − fraud_start_day) / (14 − fraud_start_day)
177
+ ```
178
+
179
+ - Flagging immediately when fraud starts → 1.0
180
+ - Flagging on the final day → 0.0
181
+ - Unflagged fraudster → 0.0
182
+ - Averaged across all fraudsters.
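
A minimal sketch of the timeliness average, under stated assumptions: each fraudster is represented as a hypothetical `(fraud_start_day, day_flagged)` pair with `None` for unflagged, and fraud starting on the final day scores 0.0 to avoid dividing by zero (the README does not say how that case is handled).

```python
from typing import Optional, Sequence, Tuple


def timeliness_score(cases: Sequence[Tuple[int, Optional[int]]],
                     episode_days: int = 14) -> float:
    """Average per-fraudster timeliness; cases are (fraud_start_day, day_flagged)."""
    per_fraudster = []
    for start, flagged in cases:
        if flagged is None or start >= episode_days:
            # Unflagged fraudsters score 0.0 (README); the start >= final-day
            # branch is an assumption added to avoid division by zero.
            per_fraudster.append(0.0)
        else:
            value = 1.0 - (flagged - start) / (episode_days - start)
            per_fraudster.append(min(1.0, max(0.0, value)))
    return sum(per_fraudster) / len(per_fraudster) if per_fraudster else 0.0


if __name__ == "__main__":
    # One fraudster flagged halfway between start (day 4) and the final day.
    print(timeliness_score([(4, 9)]))
```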
183
+
184
+ #### 3. Investigation Efficiency (20%)
185
+
186
+ Measures whether investigations were targeted at real fraudsters without wasting budget.
187
+
188
+ ```
189
+ efficiency = 0.5 × (useful_investigations / total_investigations)
190
+            + 0.3 × (1 − budget_used / budget_total)
191
+            − 0.2 × num_false_positives
192
+ ```
193
+
194
+ - **Information value**: fraction of investigations spent on fraudulent publishers
195
+ - **Budget efficiency**: fraction of budget left unused
196
+ - **False positive penalty**: −0.2 per clean publisher incorrectly flagged
197
+
198
+ Clamped to [0, 1].
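
The efficiency formula can be sketched like this. The function name and argument names are hypothetical, and the fallbacks for zero investigations or zero budget are assumptions, since the README does not specify those cases.

```python
def efficiency_score(useful_investigations: int, total_investigations: int,
                     budget_used: float, budget_total: float,
                     num_false_positives: int) -> float:
    """Efficiency component following the README formula (illustrative only)."""
    # Assumption: with no investigations, the information-value term is 0.
    info = (useful_investigations / total_investigations
            if total_investigations else 0.0)
    # Assumption: with no budget defined, the budget term is 0.
    budget = 1.0 - budget_used / budget_total if budget_total else 0.0
    raw = 0.5 * info + 0.3 * budget - 0.2 * num_false_positives
    # Clamp to [0, 1] as stated in the README.
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    # 3 of 4 investigations hit fraudsters, half the budget spent, 1 false positive.
    print(round(efficiency_score(3, 4, 50, 100, 1), 3))
```

With the weights as written, the value before clamping cannot exceed 0.5 + 0.3 = 0.8, so the upper clamp is never active.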
199
 
200
  ## Deployment
201
 
inference.py CHANGED
@@ -64,13 +64,13 @@ MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-7B-Instruct")
64
 
65
  _VALID_TASKS = {"easy", "medium", "hard"}
66
  _task_env = os.getenv("ADAUDIT_TASK", "").strip().lower()
67
- TASK_NAME = _task_env if _task_env in _VALID_TASKS else "hard"
68
  BENCHMARK = os.getenv("ADAUDIT_BENCHMARK", "adaudit")
69
  TEMPERATURE = 0.0
70
  MAX_TOKENS = 2048
71
  HISTORY_WINDOW = 5
72
  BASELINE_DAYS = 3
73
- SUCCESS_SCORE_THRESHOLD = 0.5
74
 
75
  # Rule-based investigation tools per fraud type
76
  TOOLS_FOR = {
@@ -126,7 +126,7 @@ def log_step(step: int, action: str, reward: float, done: bool, error: Optional[
126
 
127
  def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
128
  rewards_str = ",".join(f"{r:.2f}" for r in rewards)
129
- print(f"[END] success={str(success).lower()} steps={steps} score={score:.3f} rewards={rewards_str}", flush=True)
130
 
131
 
132
  # ---------------------------------------------------------------------------
@@ -326,15 +326,7 @@ def get_rule_action(
326
  # Main
327
  # ---------------------------------------------------------------------------
328
 
329
- def main() -> None:
330
- # Try to init LLM client; fall back to rule-based if it fails
331
- llm_client: Optional[OpenAI] = None
332
- try:
333
- llm_client = OpenAI(base_url=API_BASE_URL, api_key=API_KEY)
334
- llm_client.models.list()
335
- except Exception:
336
- llm_client = None
337
-
338
  use_rules = llm_client is None
339
 
340
  env = AdAuditEnv()
@@ -353,10 +345,10 @@ def main() -> None:
353
  investigated: Dict[str, List[str]] = {}
354
  flagged: set = set()
355
 
356
- log_start(task=TASK_NAME, env=BENCHMARK, model=MODEL_NAME if not use_rules else "rule-based")
357
 
358
  try:
359
- obs = env.reset(episode_id=TASK_NAME)
360
  obs_dict = obs.model_dump()
361
 
362
  while not obs_dict.get("done", False) and steps_taken < EPISODE_DAYS:
@@ -421,5 +413,18 @@ def main() -> None:
421
  log_end(success=success, steps=steps_taken, score=score, rewards=rewards)
422
 
423
 
424
  if __name__ == "__main__":
425
  main()
 
64
 
65
  _VALID_TASKS = {"easy", "medium", "hard"}
66
  _task_env = os.getenv("ADAUDIT_TASK", "").strip().lower()
67
+ TASK_NAME = _task_env if _task_env in _VALID_TASKS else "medium"
68
  BENCHMARK = os.getenv("ADAUDIT_BENCHMARK", "adaudit")
69
  TEMPERATURE = 0.0
70
  MAX_TOKENS = 2048
71
  HISTORY_WINDOW = 5
72
  BASELINE_DAYS = 3
73
+ SUCCESS_SCORE_THRESHOLD = 0.4
74
 
75
  # Rule-based investigation tools per fraud type
76
  TOOLS_FOR = {
 
126
 
127
  def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
128
  rewards_str = ",".join(f"{r:.2f}" for r in rewards)
129
+ print(f"[END] success={str(success).lower()} steps={steps} score={score:.2f} rewards={rewards_str}", flush=True)
130
 
131
 
132
  # ---------------------------------------------------------------------------
 
326
  # Main
327
  # ---------------------------------------------------------------------------
328
 
329
+ def run_episode(task_name: str, llm_client: Optional[OpenAI]) -> None:
330
  use_rules = llm_client is None
331
 
332
  env = AdAuditEnv()
 
345
  investigated: Dict[str, List[str]] = {}
346
  flagged: set = set()
347
 
348
+ log_start(task=task_name, env=BENCHMARK, model=MODEL_NAME if not use_rules else "rule-based")
349
 
350
  try:
351
+ obs = env.reset(episode_id=task_name)
352
  obs_dict = obs.model_dump()
353
 
354
  while not obs_dict.get("done", False) and steps_taken < EPISODE_DAYS:
 
413
  log_end(success=success, steps=steps_taken, score=score, rewards=rewards)
414
 
415
 
416
+ def main() -> None:
417
+ # Try to init LLM client; fall back to rule-based if it fails
418
+ llm_client: Optional[OpenAI] = None
419
+ try:
420
+ llm_client = OpenAI(base_url=API_BASE_URL, api_key=API_KEY)
421
+ llm_client.models.list()
422
+ except Exception:
423
+ llm_client = None
424
+
425
+ for task in sorted(_VALID_TASKS):
426
+ run_episode(task, llm_client)
427
+
428
+
429
  if __name__ == "__main__":
430
  main()