Commit History

final submission
5a79ca8

Prasham.Jain committed on

fix(sft): move max_seq_length + dataset_text_field to SFTTrainer
0ced196

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): follow unsloth's Qwen3 guide exactly
c49a155

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): switch from LoRA to QLoRA per mentor recommendation
8580936

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): upgrade to torch 2.5.1+cu124, restore unsloth for Qwen3
ddfe351

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): drop unsloth, use bitsandbytes+PEFT for SFT
68277e2

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): A10G-optimised pipeline - auto train.py, Dockerfile.train, GH Action sync
11f97d8

Prasham.Jain Claude Sonnet 4.6 committed on

perf(trajectory_gen): parallel workers + JSONL checkpoint for resume
ef5ead6

Prasham.Jain Claude Sonnet 4.6 committed on

fix(trajectory_gen): add --scenarios-dir flag for server-free generation
9fa7302

Prasham.Jain Claude Sonnet 4.6 committed on

fix(submission): Dockerfile, wire-format fixes, scenario loading, real-scenario MockEnvClient
ba93ec0

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C6 - ablations, training curves, readme finalization
e46f00b

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C5 - evaluation harness, baselines, plots, readme table
93e68bc

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C4 - GRPO training, SFT warmstart, rollout, custom trainer
3e29c8b

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C3 - SFT trajectory generator, env clients, mock env
5ae5581

Prasham.Jain Claude Sonnet 4.6 committed on

feat(phase-0): foundation - uv project, schemas, mocks, manifest, CI
19e2683

Prasham.Jain Claude Opus 4.7 (1M context) committed on