cff-version: 1.2.0
title: "Chakravyuh: A Multi-Agent RL Environment for Indian UPI Fraud Detection"
message: "If you use this environment, benchmark, or trained adapter, please cite it as below."
type: software
authors:
  - family-names: Pardeshi
    given-names: Ujjwal
    email: ujjwal.pardeshi@riamona.com
  - family-names: Kadam
    given-names: Omkar
date-released: 2026-04-26
url: "https://github.com/UjjwalPardeshi/Chakravyuh"
repository-code: "https://github.com/UjjwalPardeshi/Chakravyuh"
license: MIT
keywords:
  - reinforcement-learning
  - multi-agent
  - fraud-detection
  - openenv
  - upi
  - india
  - llm
  - grpo
  - lora
  - scalable-oversight
abstract: >-
  Chakravyuh is a five-agent OpenEnv-compliant reinforcement learning
  environment for training Large Language Models to detect Indian UPI fraud.
  The Analyzer agent (Qwen2.5-7B + LoRA) observes scripted Scammer-Victim
  dialogues and must output a calibrated suspicion score with a justified
  explanation, while a Bank Monitor and Regulator provide cross-modal
  oversight. A composable eight-rubric reward (detection, missed-scam penalty,
  false-positive penalty, calibration, explanation quality, signal accuracy,
  format adherence, length control) is designed to be hard to game; v2 of the
  trained adapter reduces false-positive rate by approximately 5x relative to
  a reward-hacked v1 baseline on a 175-scenario Indian-grounded benchmark.
preferred-citation:
  type: software
  title: "Chakravyuh: A Multi-Agent RL Environment for Indian UPI Fraud Detection"
  authors:
    - family-names: Pardeshi
      given-names: Ujjwal
    - family-names: Kadam
      given-names: Omkar
  year: 2026
  url: "https://github.com/UjjwalPardeshi/Chakravyuh"