Add prototype semantic evaluator for O1 NRM and A1 policy 02caf44 verified nraptisss committed 38 minutes ago
Update paper tables with zero-shot baseline 045a049 verified nraptisss committed about 20 hours ago
Add zero-shot vs fine-tuned baseline summary 0e636fc verified nraptisss committed about 20 hours ago
Restore and update project journal with zero-shot baseline d7752eb verified nraptisss committed about 20 hours ago
Record stage2 evaluation results and decision not to promote 3241031 verified nraptisss committed 2 days ago
Update project journal with stage2 run evidence and success criteria 3e07f1e verified nraptisss committed 3 days ago
Update nohup evaluator defaults for faster resumable batched generation f4beb76 verified nraptisss committed 4 days ago
Speed up and resume OOD evaluation with batched dynamic generation 6f5475f verified nraptisss committed 4 days ago
Harden Trackio Space validation to avoid startup crash 5a23de5 verified nraptisss committed 7 days ago
Fix TRL conversational dataset detection and remove warmup_ratio deprecation 608f732 verified nraptisss committed 7 days ago
Ensure CUDA GPU preflight and RTX 6000 Ada install path 91d636a verified nraptisss committed 7 days ago
Add nohup run management and resumable checkpoint support a896ecd verified nraptisss committed 8 days ago
Add RTX 6000 Ada QLoRA training and evaluation repo d9ba941 verified nraptisss committed 8 days ago