rollback: revert to last working Dockerfile and train.py e30d685 unverified Jayant-Kernel committed 12 days ago
fix: proper GRPO with trl 0.12.2 no-deps + force hub downgrade 0efac4a unverified Jayant-Kernel committed 12 days ago
fix: custom training loop without TRL dependency 5232a98 unverified Jayant-Kernel committed 12 days ago
fix: trl 0.12.2 has GRPOTrainer, pin all deps before trl install 430098b unverified Jayant-Kernel committed 12 days ago
fix: try multiple import paths for GRPOConfig 2cdce1f unverified Jayant-Kernel committed 12 days ago
fix: trl 0.11.4 + transformers 4.46.0 + processing_class e8f541c unverified Jayant-Kernel committed 12 days ago
fix: trl 0.9.4 + transformers 4.41.2 compatible versions e48f580 unverified Jayant-Kernel committed 12 days ago
fix: tokenizer not processing_class, torch cu121 for GPU 56567fd unverified Jayant-Kernel committed 12 days ago
improve: abstention penalty, better prompt, mixed curriculum, more steps 253d1ff Jayant-Kernel committed 13 days ago
update: 500 steps L1 + 300 steps L2, higher lr for 1.5B f788873 Jayant-Kernel committed 13 days ago
fix: copy data to multiple locations, fallback path for level2 d75e720 Jayant-Kernel committed 13 days ago
fix: batch size 4 to match num_generations 4 42f691c unverified Jayant-Kernel committed 13 days ago
fix: replace unsloth with standard transformers+peft, no version conflicts 09c2a70 unverified Jayant-Kernel committed 13 days ago
fix: install deps in Dockerfile build, not runtime 3470129 unverified Jayant-Kernel committed 13 days ago
upgrade: Qwen 1.5B model, 150 L1 + 80 L2 steps 8e853cb unverified Jayant-Kernel committed 13 days ago
Add health server on port 7860 for HF Spaces keep-alive 32b9179 unverified Jayant-Kernel committed 13 days ago
update: 200 steps L1 + 100 steps L2 training c5e3205 unverified Jayant-Kernel committed 13 days ago
fix: download datasets from GitHub at runtime instead of relying on package data 0592f6a unverified Jayant-Kernel committed 13 days ago
fix: install unsloth_zoo and nest-asyncio properly 2a3f319 unverified Jayant-Kernel committed 13 days ago