Space: ycwhencpp/final-iteration

final-iteration / training

Commit History

train: shrink to weekly horizon + bounded steps

abe4587

vaibhav12332112312 committed 13 days ago

train: default HF Job flavor l4x1 -> l40sx1 (48GB VRAM)

76b19bd

vaibhav12332112312 committed 13 days ago

train: per-step credit + drop replies + larger batches

9ee7a09

vaibhav12332112312 committed 13 days ago

fix(notebook): py3.11 f-string backslash error in format_obs

56f70b1

vaibhav12332112312 committed 13 days ago

Track PNGs with LFS

3e5148a

vaibhav12332112312 committed 13 days ago

update

4419350

vaibhav12332112312 committed 13 days ago

fix(notebook): pin typing_extensions>=4.13.0 to fix pydantic Sentinel ImportError

b1bd9cc

vaibhav12332112312 committed 13 days ago

fix: restore parse_model_output exception parity with original bare except

aeedd8d

anuragredbus committed 13 days ago

chore: align train_grpo.ipynb with smoke/syntax patterns for Colab

0587f05

anuragredbus committed 13 days ago

add training/syntax_only.ipynb — kernel + Python syntax only (no project logic)

0e50d91

anuragredbus committed 13 days ago

add train_grpo_smoke notebook; quote pip versions in train_grpo

b55c1ff

anuragredbus committed 13 days ago

fix: notebook loads Qwen without bitsandbytes on Mac; optional training deps

eb1d764

anuragredbus committed 13 days ago

fix: robust notebook setup (no magic shell) + local CWD auto-detect

8d09986

anuragredbus committed 13 days ago

update

a1be3fe

vaibhav12332112312 committed 13 days ago

fix: rewrite training notebook for real LoRA fine-tuning on Colab

4a29e22

anuragredbus committed 13 days ago

la la la --123

e2c547b

anuragredbus committed 13 days ago

firstiteration

fc3950d

vaibhav12332112312 committed 13 days ago
