training/TRAINING.md: add "Quick start — just run the .sh" subsection 96d773b Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
training/TRAINING.md: add upfront "what the .sh launchers do" section e51b5ef Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
training/TRAINING.md: fix .sh / .py flag names so the recipe actually runs 8ac18d8 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
Dockerfile: enable openenv web UI at /web (fixes Space 404) a185317 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add multi-step training curves to README + BLOG_POST 125b737 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: clarify ambiguous "80% of 94-token human-prompt accuracy" in hook 6a82df5 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
Remove stale root TRAINING.md cf7a609 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: stronger 3-paragraph hook, move Prior Work above the fold, escape unsafe single tildes 5c9e0a4 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: use escaped \~ for single-tilde approximations 4a5fd24 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: refresh BLOG_POST stale 87-task numbers; finish README ~ cleanup 802278c Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
Add training/TRAINING.md — end-to-end reproduction recipe 6206e8a Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: replace ~ with ≈ in intro to fix accidental strikethrough dec12b4 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add Prior Work section + replace Training/thinking-mode A/B with multi-step setup fc2c034 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: surface multi-turn in How-it-works, defer training-process notes 4023371 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: consolidate Results section — single master table + per-category + examples 31ce013 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add multi-step variant section to README + BLOG_POST e1e3cbe Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo(new-tab): expose the raw chat-templated string sent to target da41c85 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo(new-tab): also run target with verbose description e8bf76c Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo: add 'Try a new task' tab 82e3e94 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: drop 37× MSN policy compression callout 5f71cca Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: drop policy-tasks-as-headline-workload framing 86d20a9 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
docs: add demo screenshots for blog post 8d3be14 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: integrate user revisions 70ae05c Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: drop 37× compression claims 3f46c24 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
BLOG_POST: full rewrite — research framing, 10 sections, citations, image placeholders 4ea12d8 Don Rishabh committed 12 days ago
build_before_after_csv: --min-verbose-accuracy flag ea78734 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo: filter tasks dead on target (verbose=0 AND trained=0) 86be5e0 Don Rishabh committed 12 days ago
README: add Scorers section (21 scorers grouped by family) 433bfad Don Rishabh committed 12 days ago
docs: drop the misleading 37× compression anecdote (0-accuracy task) 9867aa7 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README + BLOG: drop multi-step + Llama-self mentions (in-progress runs) c3e14ba Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
remove untested Colab notebook + link training/ folder in README a56bede Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
trackio: post-hoc replay of train_metrics.jsonl into a HF Space dashboard 3724e90 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo CSVs: add reward_advantage_vs_verbose + accuracy_delta_vs_verbose 7dafc94 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
README: rewrite for hackathon submission — links-first, plots inline, kill verbose sections a1b7a09 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
demo: sample test input dropdown (per-task examples in CSV) bdd9948 Don Rishabh committed 12 days ago
demo: apply chat template to target (fix rambling completion-mode outputs) 7d8d47c Don Rishabh committed 12 days ago
multistep: gradient checkpointing + tighter memory defaults 7ca042f Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
space-demo: pin Python 3.11 + exact gradio + jinja2<3.1.5 (defuse 3.13 fallout) 837c5e2 Don Rishabh committed 12 days ago
space-demo: pin huggingface_hub<1.0 + cap transformers/gradio majors 805f4c4 Don Rishabh committed 12 days ago
space-demo: fix short_description length (HF Spaces 60-char cap) c968b24 Don Rishabh committed 12 days ago
space-demo: bundle for HF Spaces Gradio demo cc1bf10 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
inference.py: enumerate all task banks, drop stale TASK_NAMES gate 34b5069 Don Rishabh Claude Opus 4.7 (1M context) committed 12 days ago
ui: local Gradio demo app — verbose / untrained / trained side-by-side 1aee0c3 Don Rishabh Claude Opus 4.7 (1M context) committed 13 days ago
tasks_policy: long-context policy-compression tasks e8ef5c3 Don Rishabh Claude Opus 4.7 (1M context) committed 13 days ago
notebooks: minimal Colab training demo 7eae9f5 Don Rishabh Claude Opus 4.7 (1M context) committed 13 days ago