Spaces: Prasham1710/ci-triage-env

ci-triage-env/src/ci_triage_env/training

Commit History

final submission

5a79ca8

Prasham.Jain committed 14 days ago

fix(sft): move max_seq_length + dataset_text_field to SFTTrainer

0ced196

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

fix(training): follow unsloth's Qwen3 guide exactly

c49a155

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): switch from LoRA to QLoRA per mentor recommendation

8580936

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

fix(training): upgrade to torch 2.5.1+cu124, restore unsloth for Qwen3

ddfe351

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

fix(training): drop unsloth, use bitsandbytes+PEFT for SFT

68277e2

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): A10G-optimised pipeline — auto train.py, Dockerfile.train, GH Action sync

11f97d8

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

perf(trajectory_gen): parallel workers + JSONL checkpoint for resume

ef5ead6

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

fix(trajectory_gen): add --scenarios-dir flag for server-free generation

9fa7302

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

fix(submission): Dockerfile, wire-format fixes, scenario loading, real-scenario MockEnvClient

ba93ec0

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): Phase C6 — ablations, training curves, readme finalization

e46f00b

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): Phase C5 — evaluation harness, baselines, plots, readme table

93e68bc

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): Phase C4 — GRPO training, SFT warmstart, rollout, custom trainer

3e29c8b

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(training): Phase C3 — SFT trajectory generator, env clients, mock env

5ae5581

Prasham.Jain Claude Sonnet 4.6 committed 14 days ago

feat(phase-0): foundation — uv project, schemas, mocks, manifest, CI

19e2683

Prasham.Jain Claude Opus 4.7 (1M context) committed 14 days ago
