Spaces: ycwhencpp/final-iteration

Commit History

Strip heatmap leak from prompt; let model discover peak hours via tools

e82b235

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

3419724
verified

vaibhavkhandare committed 12 days ago

Merge HF run-output upload

e299415

vaibhav12332112312 committed 12 days ago

Inject peak hours + history + post-mandate, run SFT every round

30614d3

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

1dc66ef
verified

vaibhavkhandare committed 12 days ago

ReAct two-pass per day so model sees current-day tool results

b1c1732

vaibhav12332112312 committed 12 days ago

Strip leaked peak-hour info from observation, force tool discovery

afbf541

vaibhav12332112312 committed 12 days ago

Mandate tool calls in system prompt to debug zero-tool collapse

4299c91

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

9fac734
verified

vaibhavkhandare committed 12 days ago

Match eval sampling to training, log all I/O, single round

271bf42

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

ad5d3b3
verified

vaibhavkhandare committed 12 days ago

train(grpo): unified hint prompt, no-history chat, positive-advantage filter

3326716

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

e955a2d
verified

vaibhavkhandare committed 12 days ago

fix: align notebook with 15-day horizon, drop unused replies field

f7b5241

vaibhav12332112312 committed 12 days ago

update

5459ec8

vaibhav12332112312 committed 12 days ago

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train

21edd7d

vaibhav12332112312 committed 12 days ago

train: batched parallel rollouts on Qwen2.5-3B + parser hardening

a6b8df0

vaibhav12332112312 committed 12 days ago

Stop tracking plots/*.png with Git LFS; use small inline PNGs for HF Hub.

81cdb34

anuragredbus committed 12 days ago

Default repo clone branch to main for training notebooks and HF script.

ad48770

anuragredbus committed 12 days ago

Set TASK_HORIZON to 15 days and align graders, UI, and training prompts.

99717c2

anuragredbus committed 12 days ago

update

c3e9b69

vaibhav12332112312 committed 12 days ago

update

f9880dd

vaibhav12332112312 committed 12 days ago

Upload folder using huggingface_hub

302be2b
verified

vaibhavkhandare committed 12 days ago

fix(env): tolerate malformed predict_engagement scheduled_actions

4bfe286

vaibhav12332112312 committed 12 days ago

train: shrink to weekly horizon + bounded steps

abe4587

vaibhav12332112312 committed 12 days ago

train: default HF Job flavor l4x1 -> l40sx1 (48GB VRAM)

76b19bd

vaibhav12332112312 committed 12 days ago

train: per-step credit + drop replies + larger batches

9ee7a09

vaibhav12332112312 committed 12 days ago

fix(notebook): py3.11 f-string backslash error in format_obs

56f70b1

vaibhav12332112312 committed 13 days ago

Track PNGs with LFS

3e5148a

vaibhav12332112312 committed 13 days ago

Merge branch 'main' of https://huggingface.co/spaces/vaibhavkhandare/train-bhai-train

383294c

vaibhav12332112312 committed 13 days ago

update

4419350

vaibhav12332112312 committed 13 days ago

fix(notebook): pin typing_extensions>=4.13.0 to fix pydantic Sentinel ImportError

b1bd9cc

vaibhav12332112312 committed 13 days ago

fix: add missing metadata on display_data outputs

6573551

vaibhav12332112312 committed 13 days ago

fix: add missing 'name' to stream outputs

4e45ebc

vaibhav12332112312 committed 13 days ago

add: viraltest code (server, models, inference)

9c3eab8

vaibhav12332112312 committed 13 days ago

add: train_grpo notebook

b9165e0

vaibhav12332112312 committed 13 days ago

initial commit

04c5c43
unverified

vaibhavkhandare committed 13 days ago

fix: restore parse_model_output exception parity with original bare except

aeedd8d

anuragredbus committed 13 days ago

chore: align train_grpo.ipynb with smoke/syntax patterns for Colab

0587f05

anuragredbus committed 13 days ago

add training/syntax_only.ipynb — kernel + Python syntax only (no project logic)

0e50d91

anuragredbus committed 13 days ago

add train_grpo_smoke notebook; quote pip versions in train_grpo

b55c1ff

anuragredbus committed 13 days ago

fix: notebook loads Qwen without bitsandbytes on Mac; optional training deps

eb1d764

anuragredbus committed 13 days ago

fix: robust notebook setup (no magic shell) + local CWD auto-detect

8d09986

anuragredbus committed 13 days ago

update

a1be3fe

vaibhav12332112312 committed 13 days ago

Merge branch 'hack1' of github.com:VaibhavKhandare/viral-posts-env into hack1

6c01076

vaibhav12332112312 committed 13 days ago

update

97ee7e7

vaibhav12332112312 committed 13 days ago

fix: rewrite training notebook for real LoRA fine-tuning on Colab

4a29e22

anuragredbus committed 13 days ago

la la la --123

e2c547b

anuragredbus committed 13 days ago

firstiteration

fc3950d

vaibhav12332112312 committed 13 days ago

reduced steps to fit out free tier

fcfbc38

anuragredbus committed 30 days ago