Commit History
fix easy task test for updated issue types 0e13037
replace ambiguous salary issue with date format fix f1b7439
fix root endpoint to list all 5 tasks c3f32c9
remove ambiguous LR fix — identify-only, any valid LR works a1f98bf
fix moderation issue row collisions and verify all data 8560706
add moderation task to Gradio demo replay 887c1aa
add content moderation task with real OpenAI Moderation data b99e42b
add toxic/biased response issue to alignment task c699b6f
replace ambiguous fixes with deterministic ones across all tasks b08652c
demo only proposes logically inferrable fixes 5de8f8e
fix grading: reward valid fixes, not just exact matches 5e1f8bb
update README with alignment task details and issue breakdown 1bd072d
make alignment issues subtler to challenge frontier models 96d698c
fix alignment demo trajectory to use correct clean values for fixes 8910a26
use real NVIDIA HelpSteer data for alignment task 4051320
improve alignment task: replace label swaps with real contamination a9620ef
use real Stanford Alpaca data for alignment task 7479de3
add alignment data QA task: 12 issues in LLM instruction-tuning data 5cb467d
Fix port to 8000 for validator compatibility 56f55e9
Varshith B Claude Opus 4.6 (1M context) committed on
Add root-level wrapper files and uv.lock for openenv deployment 0dbc19e
Varshith B Claude Opus 4.6 (1M context) committed on
Merge pull request #1 from varshith15/enhancementsv1 ca01572 unverified
Varshith Bathini committed on
remove base_path: /web to fix HF Space iframe 404 85257bc
add root endpoint for browser/judge friendliness 51adf89
remove binary PNGs for HF push compatibility d7c51ad
use port 7860 for HF Spaces compatibility 671acb9
minor change for meeting requirement format 92187c5
clean code structure 22369d8
expand datasets to include harder real-world scenarios 5d90461
expand datasets 081eb22
add fix stage+demo c3002ad
fixes v1: add per step reward cd11aba
init 4c1a85d
Varshith B committed on