Space: ycwhencpp/final-iteration

Commit History

train_grpo: prebuilt flash-attn wheel + verbose training rollouts

1d82571

anuragredbus committed 12 days ago

training/train_grpo.ipynb: add Kaggle (/kaggle/working) fresh-clone branch

9536a33

anuragredbus committed 12 days ago

training: use -- and bash -c to bypass hf CLI typer flag stealing

bcc27a5

anuragredbus committed 12 days ago

training: default Space to ycwhencpp/final-iteration

6279175

anuragredbus committed 12 days ago

training: split remote runner into in-repo script

0980a17

anuragredbus committed 12 days ago

update

225cdfe

vaibhav12332112312 committed 12 days ago

Merge branch 'main' of github.com:VaibhavKhandare/viral-posts-env

360c721

vaibhav12332112312 committed 12 days ago

Merge branch 'main' of https://github.com/VaibhavKhandare/viral-posts-env

b7ef274

anuragredbus committed 12 days ago

Update hf_mini_blog.md

034a807

anuragredbus committed 12 days ago

training: default flavor a10g-largex4 (4xA10G, 96GB VRAM)

ef79012

vaibhav12332112312 committed 12 days ago

training: smoke-mode + hardcoded peak hint + valid tool IDs

1f72457

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

e52d302
verified

vaibhavkhandare committed 12 days ago

Add blog post; readme tweak

a402a82

vaibhav12332112312 committed 12 days ago

merge HF run-output artifacts

9bb1116

vaibhav12332112312 committed 12 days ago

Merge branch 'main' of github.com:VaibhavKhandare/viral-posts-env

037fe15

vaibhav12332112312 committed 12 days ago

pounteradds

8970072

vaibhav12332112312 committed 12 days ago

changes

7a5c462

anuragredbus committed 12 days ago

added more scenaiors

1a2a407

anuragredbus committed 12 days ago

update

f0a8734

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

1d8435e
verified

vaibhavkhandare committed 13 days ago

Strip heatmap leak from prompt; let model discover peak hours via tools

e82b235

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

3419724
verified

vaibhavkhandare committed 13 days ago

Merge HF run-output upload

e299415

vaibhav12332112312 committed 13 days ago

Inject peak hours + history + post-mandate, run SFT every round

30614d3

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

1dc66ef
verified

vaibhavkhandare committed 13 days ago

ReAct two-pass per day so model sees current-day tool results

b1c1732

vaibhav12332112312 committed 13 days ago

Strip leaked peak-hour info from observation, force tool discovery

afbf541

vaibhav12332112312 committed 13 days ago

Mandate tool calls in system prompt to debug zero-tool collapse

4299c91

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

9fac734
verified

vaibhavkhandare committed 13 days ago

Match eval sampling to training, log all I/O, single round

271bf42

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

ad5d3b3
verified

vaibhavkhandare committed 13 days ago

train(grpo): unified hint prompt, no-history chat, positive-advantage filter

3326716

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

e955a2d
verified

vaibhavkhandare committed 13 days ago

fix: align notebook with 15-day horizon, drop unused replies field

f7b5241

vaibhav12332112312 committed 13 days ago

update

5459ec8

vaibhav12332112312 committed 13 days ago

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train

21edd7d

vaibhav12332112312 committed 13 days ago

train: batched parallel rollouts on Qwen2.5-3B + parser hardening

a6b8df0

vaibhav12332112312 committed 13 days ago

Stop tracking plots/*.png with Git LFS; use small inline PNGs for HF Hub.

81cdb34

anuragredbus committed 13 days ago

Default repo clone branch to main for training notebooks and HF script.

ad48770

anuragredbus committed 13 days ago

Set TASK_HORIZON to 15 days and align graders, UI, and training prompts.

99717c2

anuragredbus committed 13 days ago

update

c3e9b69

vaibhav12332112312 committed 13 days ago

update

f9880dd

vaibhav12332112312 committed 13 days ago

Upload folder using huggingface_hub

302be2b
verified

vaibhavkhandare committed 13 days ago

fix(env): tolerate malformed predict_engagement scheduled_actions

4bfe286

vaibhav12332112312 committed 13 days ago

train: shrink to weekly horizon + bounded steps

abe4587

vaibhav12332112312 committed 13 days ago

train: default HF Job flavor l4x1 -> l40sx1 (48GB VRAM)

76b19bd

vaibhav12332112312 committed 13 days ago

train: per-step credit + drop replies + larger batches

9ee7a09

vaibhav12332112312 committed 13 days ago

fix(notebook): py3.11 f-string backslash error in format_obs

56f70b1

vaibhav12332112312 committed 13 days ago

Track PNGs with LFS

3e5148a

vaibhav12332112312 committed 13 days ago

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train

383294c

vaibhav12332112312 committed 13 days ago