Qwen 2.5 7B Instruct, fine-tuned with GRPO (Group Relative Policy Optimization) on synthetic "ARC-AGI-like" tasks.

Weights & Biases run: https://wandb.ai/graphcore/huggingface/runs/pe4km5hb/workspace?nw=nwusertompollak
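
For readers unfamiliar with GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to the others in its group, rather than against a learned value function. The snippet below is a minimal sketch of that normalization step only, not the training code used for this model. The function name, the `eps` constant, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    `eps` is a hypothetical small constant that avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled completions of one grid task
# (1.0 = output grid judged correct, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean get a positive advantage and those below get a negative one. When all completions in a group receive the same reward, every advantage is zero and that group contributes no learning signal.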

Format: Safetensors. Model size: 7B params. Tensor type: F32.