Commit History

training: use -- and bash -c to bypass hf CLI typer flag stealing
bcc27a5

anuragredbus committed on

training: default Space to ycwhencpp/final-iteration
6279175

anuragredbus committed on

training: split remote runner into in-repo script
0980a17

anuragredbus committed on

Merge branch 'main' of github.com:VaibhavKhandare/viral-posts-env
360c721

vaibhav12332112312 committed on

Merge branch 'main' of https://github.com/VaibhavKhandare/viral-posts-env
b7ef274

anuragredbus committed on

Update hf_mini_blog.md
034a807

anuragredbus committed on

training: default flavor a10g-largex4 (4xA10G, 96GB VRAM)
ef79012

vaibhav12332112312 committed on

training: smoke-mode + hardcoded peak hint + valid tool IDs
1f72457

vaibhav12332112312 committed on

Upload folder using huggingface_hub
e52d302
verified

vaibhavkhandare committed on

Merge branch 'main' of github.com:VaibhavKhandare/viral-posts-env
037fe15

vaibhav12332112312 committed on

added more scenarios
1a2a407

anuragredbus committed on

Upload folder using huggingface_hub
1d8435e
verified

vaibhavkhandare committed on

Strip heatmap leak from prompt; let model discover peak hours via tools
e82b235

vaibhav12332112312 committed on

Upload folder using huggingface_hub
3419724
verified

vaibhavkhandare committed on

Inject peak hours + history + post-mandate, run SFT every round
30614d3

vaibhav12332112312 committed on

Upload folder using huggingface_hub
1dc66ef
verified

vaibhavkhandare committed on

ReAct two-pass per day so model sees current-day tool results
b1c1732

vaibhav12332112312 committed on

Strip leaked peak-hour info from observation, force tool discovery
afbf541

vaibhav12332112312 committed on

Mandate tool calls in system prompt to debug zero-tool collapse
4299c91

vaibhav12332112312 committed on

Upload folder using huggingface_hub
9fac734
verified

vaibhavkhandare committed on

Match eval sampling to training, log all I/O, single round
271bf42

vaibhav12332112312 committed on

Upload folder using huggingface_hub
ad5d3b3
verified

vaibhavkhandare committed on

train(grpo): unified hint prompt, no-history chat, positive-advantage filter
3326716

vaibhav12332112312 committed on

Upload folder using huggingface_hub
e955a2d
verified

vaibhavkhandare committed on

fix: align notebook with 15-day horizon, drop unused replies field
f7b5241

vaibhav12332112312 committed on

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train
21edd7d

vaibhav12332112312 committed on

train: batched parallel rollouts on Qwen2.5-3B + parser hardening
a6b8df0

vaibhav12332112312 committed on

Stop tracking plots/*.png with Git LFS; use small inline PNGs for HF Hub.
81cdb34

anuragredbus committed on

Default repo clone branch to main for training notebooks and HF script.
ad48770

anuragredbus committed on

Set TASK_HORIZON to 15 days and align graders, UI, and training prompts.
99717c2

anuragredbus committed on

Upload folder using huggingface_hub
302be2b
verified

vaibhavkhandare committed on

fix(env): tolerate malformed predict_engagement scheduled_actions
4bfe286

vaibhav12332112312 committed on

train: shrink to weekly horizon + bounded steps
abe4587

vaibhav12332112312 committed on

train: default HF Job flavor l4x1 -> l40sx1 (48GB VRAM)
76b19bd

vaibhav12332112312 committed on

train: per-step credit + drop replies + larger batches
9ee7a09

vaibhav12332112312 committed on

fix(notebook): py3.11 f-string backslash error in format_obs
56f70b1

vaibhav12332112312 committed on

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train
383294c

vaibhav12332112312 committed on

fix(notebook): pin typing_extensions>=4.13.0 to fix pydantic Sentinel ImportError
b1bd9cc

vaibhav12332112312 committed on