Commit History

chore(spaces): add HF Space config for training [skip ci]
bd95811

Prasham.Jain committed on

final submission
5a79ca8

Prasham.Jain committed on

fix(sft): move max_seq_length + dataset_text_field to SFTTrainer
0ced196

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): follow unsloth's Qwen3 guide exactly
c49a155

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): upgrade base image to torch 2.6.0+cu126
15e36fe

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): switch from LoRA to QLoRA per mentor recommendation
8580936

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): pin torchao==0.5.0 after unsloth to fix torch.int1 error
4647df7

Prasham.Jain Claude Sonnet 4.6 committed on

fix(spaces): switch training Space to A10G Small, tune notebook for 24 GB
421885d

Prasham.Jain Claude Sonnet 4.6 committed on

fix(spaces): add app_port: 8000 to env Space YAML
b78f85d

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): upgrade to torch 2.5.1+cu124, restore unsloth for Qwen3
ddfe351

Prasham.Jain Claude Sonnet 4.6 committed on

fix(training): drop unsloth, use bitsandbytes+PEFT for SFT
68277e2

Prasham.Jain Claude Sonnet 4.6 committed on

fix(docker): pin torchao==0.5.0 + transformers/trl/peft before unsloth install
e3da0da

Prasham.Jain committed on

fix(notebook): Cell 3 load_dataset per split + write JSON (was snapshot_download finding 0 .json files)
127ef78

Prasham.Jain committed on

fix(docker): COPY README.md into build context (hatchling metadata requires it)
54e5bb5

Prasham.Jain committed on

fix(spaces): orange→yellow (not a valid HF colorFrom value)
f81c2bd

Prasham.Jain committed on

fix(spaces): inject HF Space YAML frontmatter into README on push
23c20a6

Prasham.Jain Claude Sonnet 4.6 committed on

chore: add push_to_hf.sh - one-shot deploy to both HF Spaces
042c15f

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): A10G-optimised pipeline - auto train.py, Dockerfile.train, GH Action sync
11f97d8

Prasham.Jain Claude Sonnet 4.6 committed on

parallelized the data generation for openai
1134123

Prasham.Jain committed on

perf(trajectory_gen): parallel workers + JSONL checkpoint for resume
ef5ead6

Prasham.Jain Claude Sonnet 4.6 committed on

fix(trajectory_gen): add --scenarios-dir flag for server-free generation
9fa7302

Prasham.Jain Claude Sonnet 4.6 committed on

fix(submission): Dockerfile, wire-format fixes, scenario loading, real-scenario MockEnvClient
ba93ec0

Prasham.Jain Claude Sonnet 4.6 committed on

merge(branch-b): data pipeline + rewards + training - B1-B5, C1-C6
5305230

Prasham.Jain committed on

merge(branch-a): env core β€” FastAPI server, 11 tool routes, episode lifecycle, replay visualizer (A1–A5)
dec324b

Prasham.Jain committed on

feat(training): Phase C6 - ablations, training curves, readme finalization
e46f00b

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C5 - evaluation harness, baselines, plots, readme table
93e68bc

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C4 - GRPO training, SFT warmstart, rollout, custom trainer
3e29c8b

Prasham.Jain Claude Sonnet 4.6 committed on

feat(training): Phase C3 - SFT trajectory generator, env clients, mock env
5ae5581

Prasham.Jain Claude Sonnet 4.6 committed on

feat(rewards): Phase C2 - composite reward, replay verifier, frozen weights
3553258

Prasham.Jain Claude Sonnet 4.6 committed on

feat(rewards): Phase C1 - all 9 reward components implemented
d11066d

Prasham.Jain Claude Sonnet 4.6 committed on

feat(data): Phase B5 - corpus instantiation, HF publish, annotations
18a3fbf

Prasham.Jain Claude Sonnet 4.6 committed on

feat(data): Phase B4 - 7 ScenarioFamilyGenerators with archetype loading
7a658b7

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(data): Phase B3 - failure clustering and archetype extraction
cd61817

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-b): B2 GitHub Actions log mining - gh-CLI scraper, anonymizer, throttle, cache
a4ff035

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-b): B1 public dataset ingest - DeFlaker, iDFlakies, FlakeFlagger, LogHub loaders + CLI
54627d8

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-a): A5 replay visualizer - static HTML/JS viewer mounted under /viz
e1814ef

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-a): A3 episode lifecycle - budget enforcement, terminal handling, payload truncation, trace serialization
ed51f28

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-a): A2 tool implementations - route 11 handlers to scenario.tool_outputs with cost charging
272a052

Prasham.Jain Claude Opus 4.7 (1M context) committed on

refactor(branch-a): A1 - adopt openenv-core MCPEnvironment + create_app for canonical OpenEnv compliance
4c1ad51

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(branch-a): A1 server scaffold - FastAPI /reset /step /state /mcp + 11 stub tool handlers
8be6018

Prasham.Jain Claude Opus 4.7 (1M context) committed on

feat(phase-0): foundation - uv project, schemas, mocks, manifest, CI
19e2683

Prasham.Jain Claude Opus 4.7 (1M context) committed on

chore: import planning artifacts
b506996

Prasham.Jain committed on