cff-version: 1.2.0
title: "Chakravyuh: A Multi-Agent RL Environment for Indian UPI Fraud Detection"
message: "If you use this environment, benchmark, or trained adapter, please cite it as below."
type: software
authors:
- family-names: Pardeshi
  given-names: Ujjwal
  email: ujjwal.pardeshi@riamona.com
- family-names: Kadam
  given-names: Omkar
date-released: 2026-04-26
url: "https://github.com/UjjwalPardeshi/Chakravyuh"
repository-code: "https://github.com/UjjwalPardeshi/Chakravyuh"
license: MIT
keywords:
- reinforcement-learning
- multi-agent
- fraud-detection
- openenv
- upi
- india
- llm
- grpo
- lora
- scalable-oversight
abstract: >-
  Chakravyuh is a five-agent OpenEnv-compliant reinforcement learning
  environment for training Large Language Models to detect Indian UPI
  fraud. The Analyzer agent (Qwen2.5-7B + LoRA) observes scripted
  Scammer-Victim dialogues and must output a calibrated suspicion score
  with a justified explanation, while a Bank Monitor and Regulator
  provide cross-modal oversight. A composable eight-rubric reward
  (detection, missed-scam penalty, false-positive penalty, calibration,
  explanation quality, signal accuracy, format adherence, length control)
  is designed to be hard to game; v2 of the trained adapter reduces
  false-positive rate by approximately 5x relative to a reward-hacked v1
  baseline on a 175-scenario Indian-grounded benchmark.
preferred-citation:
  type: software
  title: "Chakravyuh: A Multi-Agent RL Environment for Indian UPI Fraud Detection"
  authors:
  - family-names: Pardeshi
    given-names: Ujjwal
  - family-names: Kadam
    given-names: Omkar
  year: 2026
  url: "https://github.com/UjjwalPardeshi/Chakravyuh"