Add zero-shot baseline mode: --adapter_path none skips adapter loading, --no_think suppresses Qwen3 thinking (a2801bb, verified; nraptisss committed 9 days ago)
Add evaluate_v3.py: stratified sampling, layer-aware max tokens, incremental saves, resume support (734da09, verified; nraptisss committed 9 days ago)