Spaces: avanigupta/dataqa-env

Commit History

remove ambiguous moderation rows, replace with clear-cut examples

fcce834

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

fix easy task test for updated issue types

0e13037

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

replace ambiguous salary issue with date format fix

f1b7439

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

fix root endpoint to list all 5 tasks

c3f32c9

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

remove ambiguous LR fix — identify-only, any valid LR works

a1f98bf

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

fix moderation issue row collisions and verify all data

8560706

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

add moderation task to Gradio demo replay

887c1aa

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

add content moderation task with real OpenAI Moderation data

b99e42b

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

add toxic/biased response issue to alignment task

c699b6f

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

replace ambiguous fixes with deterministic ones across all tasks

b08652c

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

demo only proposes logically inferrable fixes

5de8f8e

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

fix grading: reward valid fixes, not just exact matches

5e1f8bb

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

update README with alignment task details and issue breakdown

1bd072d

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

make alignment issues subtler to challenge frontier models

96d698c

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

fix alignment demo trajectory to use correct clean values for fixes

8910a26

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

use real NVIDIA HelpSteer data for alignment task

4051320

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

improve alignment task: replace label swaps with real contamination

a9620ef

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

use real Stanford Alpaca data for alignment task

7479de3

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

add alignment data QA task: 12 issues in LLM instruction-tuning data

5cb467d

avanigupta, Claude Opus 4.6 (1M context) committed 8 days ago

Fix port to 8000 for validator compatibility

56f55e9

Varshith B, Claude Opus 4.6 (1M context) committed 8 days ago

Add root-level wrapper files and uv.lock for openenv deployment

0dbc19e

Varshith B, Claude Opus 4.6 (1M context) committed 9 days ago

Merge pull request #1 from varshith15/enhancementsv1

ca01572
unverified

Varshith Bathini committed 9 days ago

remove base_path: /web to fix HF Space iframe 404

85257bc

avanigupta, Claude Opus 4.6 (1M context) committed 9 days ago

add root endpoint for browser/judge friendliness

51adf89

avanigupta, Claude Opus 4.6 (1M context) committed 9 days ago

remove binary PNGs for HF push compatibility

d7c51ad

avanigupta, Claude Opus 4.6 (1M context) committed 9 days ago

use port 7860 for HF Spaces compatibility

671acb9

avanigupta, Claude Opus 4.6 (1M context) committed 9 days ago

minor change for meeting requirement format

92187c5

avanigupta committed 9 days ago

clean code structure

22369d8

avanigupta committed 9 days ago

expand datasets to include harder real-world scenarios

5d90461

avanigupta committed 9 days ago

expand datasets

081eb22

avanigupta committed 9 days ago

add fix stage+demo

c3002ad

avanigupta committed 9 days ago

fixes v1: add per step reward

cd11aba

avanigupta committed 9 days ago

init

4c1a85d

Varshith B committed 9 days ago
