Spaces: ycwhencpp/final-iteration

final-iteration / training

Commit History

fix: align notebook with 15-day horizon, drop unused replies field

f7b5241

vaibhav12332112312 committed 13 days ago

update

5459ec8

vaibhav12332112312 committed 13 days ago

train: batched parallel rollouts on Qwen2.5-3B + parser hardening

a6b8df0

vaibhav12332112312 committed 13 days ago

Default repo clone branch to main for training notebooks and HF script.

ad48770

anuragredbus committed 13 days ago

Set TASK_HORIZON to 15 days and align graders, UI, and training prompts.

99717c2

anuragredbus committed 13 days ago

update

f9880dd

vaibhav12332112312 committed 13 days ago

train: shrink to weekly horizon + bounded steps

abe4587

vaibhav12332112312 committed 13 days ago

train: default HF Job flavor l4x1 -> l40sx1 (48GB VRAM)

76b19bd

vaibhav12332112312 committed 13 days ago

train: per-step credit + drop replies + larger batches

9ee7a09

vaibhav12332112312 committed 13 days ago

fix(notebook): py3.11 f-string backslash error in format_obs

56f70b1

vaibhav12332112312 committed 13 days ago

Track PNGs with LFS

3e5148a

vaibhav12332112312 committed 13 days ago

update

4419350

vaibhav12332112312 committed 13 days ago

fix(notebook): pin typing_extensions>=4.13.0 to fix pydantic Sentinel ImportError

b1bd9cc

vaibhav12332112312 committed 13 days ago

fix: restore parse_model_output exception parity with original bare except

aeedd8d

anuragredbus committed 13 days ago

chore: align train_grpo.ipynb with smoke/syntax patterns for Colab

0587f05

anuragredbus committed 13 days ago

add training/syntax_only.ipynb — kernel + Python syntax only (no project logic)

0e50d91

anuragredbus committed 13 days ago

add train_grpo_smoke notebook; quote pip versions in train_grpo

b55c1ff

anuragredbus committed 13 days ago

fix: notebook loads Qwen without bitsandbytes on Mac; optional training deps

eb1d764

anuragredbus committed 13 days ago

fix: robust notebook setup (no magic shell) + local CWD auto-detect

8d09986

anuragredbus committed 13 days ago

update

a1be3fe

vaibhav12332112312 committed 13 days ago

fix: rewrite training notebook for real LoRA fine-tuning on Colab

4a29e22

anuragredbus committed 13 days ago

la la la --123

e2c547b

anuragredbus committed 13 days ago

firstiteration

fc3950d

vaibhav12332112312 committed 13 days ago