@alibidaran on Hugging Face: "🧠 Introducing Qwen2.5 — Cognitive Reasoning Mode I fine-tuned Qwen2.5 with…"

alibidaran

posted an update 18 days ago

🧠 Introducing Qwen2.5 — Cognitive Reasoning Mode

I fine-tuned Qwen2.5 with GRPO so that it reasons through a task before it answers, instead of only pattern-matching.

Most LLMs mimic reasoning. This one builds a real cognitive path:

📌 Plan → understand the task
🔍 Monitor → reason step by step
✅ Evaluate → verify before answering

Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.

The model self-checks its own structure. If a section is missing or malformed → the response is invalid.
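For anyone who wants to check outputs on their own side, here is a minimal sketch of a structure validator in Python. It is not the model's internal self-check or the training reward. It assumes each section is closed with a matching tag (`</planning>`, `</monitoring>`, `</evaluation>`, `</output>`), which the post does not spell out, and the function name `validate_response` is hypothetical.

```python
import re

# Section names taken from the post; closing tags are an assumption.
SECTIONS = ("planning", "monitoring", "evaluation")


def validate_response(text: str) -> list[str]:
    """Return a list of structural problems; an empty list means valid."""
    # The whole response must be <think>...</think> followed by <output>...</output>.
    match = re.fullmatch(
        r"\s*<think>(.*?)</think>\s*<output>(.*?)</output>\s*",
        text,
        re.DOTALL,
    )
    if match is None:
        return ["expected <think>...</think> followed by <output>...</output>"]

    think, output = match.group(1), match.group(2)
    problems = []

    # Sections must appear inside <think>, in order, and be non-empty.
    pos = 0
    for name in SECTIONS:
        pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
        section = pattern.search(think, pos)
        if section is None:
            problems.append(f"missing or out-of-order <{name}> section")
            continue
        if not section.group(1).strip():
            problems.append(f"empty <{name}> section")
        pos = section.end()

    # The output must exist and must not contain reasoning tags.
    if not output.strip():
        problems.append("empty <output>")
    if any(f"<{name}>" in output for name in SECTIONS + ("think",)):
        problems.append("reasoning tags leaked into <output>")

    return problems
```

Calling `validate_response(generated_text)` on a decoded completion returns an empty list when the structure matches, so a caller could retry or discard any response for which the list is non-empty.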

This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.

🔗 Full README + inference code below 👇
alibidaran/Qwen_COG_Thinker_Merged

#AI #LLM #Qwen #ReasoningModels #GRPO #OpenSource

inflatebot

16 days ago

Thanks GPT-4o

alibidaran

16 days ago

Why GPT-4o?