koutch/short_paper_llama_1.json_train_dpo_v3_train_no_think • Text Generation • 8B • Updated Jan 12 • 12
koutch/short_paper_llama_1.json_train_dpo_v2_train_no_think • Text Generation • 8B • Updated Jan 12 • 8
koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think • Text Generation • 8B • Updated Jan 12 • 3
koutch/short_paper_llama_0.json_train_dpo_v4_train_no_think • Text Generation • 8B • Updated Jan 11 • 1
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train_think • Text Generation • 4B • Updated Jan 9 • 8