Qwen 2.5 7B Instruct fine-tuned with GRPO on synthetic "ARC-AGI-like" tasks.
Weights & Biases run: https://wandb.ai/graphcore/huggingface/runs/pe4km5hb/workspace?nw=nwusertompollak
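For readers unfamiliar with GRPO, the core idea is that each completion's reward is normalized against the other completions sampled for the same prompt. The snippet below is a minimal sketch of that normalization only, not the training code used for this model. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the 0/1 reward values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Assumption: population standard deviation is used here; some
    implementations use the sample standard deviation instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled completions of one task,
# e.g. 1.0 when the predicted grid matches the target and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive a positive advantage and the rest a negative one. A group where every completion earns the same reward yields all-zero advantages and so contributes no learning signal.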