Commit History

train: higher LR=1e-5, more SFT=20, lower temp=1.0->0.7
23c3a1b

YashashMathur committed on

fix: C-4 reward clamp, C-6 HF_TOKEN, W-2 citation floor, W-9 dup import
b022bda
verified

YashashMathur committed on

Upload hf_training/train.py with huggingface_hub
c51cef7
verified

YashashMathur committed on

Upload folder using huggingface_hub
206f794
verified

YashashMathur committed on

Upload hf_training
165a05f
verified

YashashMathur committed on