alibidaran posted an update 18 days ago
🧠 Introducing Qwen2.5: Cognitive Reasoning Mode

I fine-tuned Qwen2.5 with GRPO to actually think before it answers, not just pattern-match.

Most LLMs mimic reasoning. This one builds a real cognitive path:

📌 Plan → understand the task
🔍 Monitor → reason step by step
✅ Evaluate → verify before answering

Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.

The model self-checks its own structure. If a section is missing or malformed → the response is invalid.
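
For anyone who wants to enforce the same rule outside the model, here is a minimal sketch of a structure check in Python. It is not taken from the model's README: the post only shows opening tags for the inner sections, so the closing tags (`</planning>`, `</monitoring>`, `</evaluation>`, `</output>`) and the function name `validate_response` are assumptions made for illustration.

```python
import re
from typing import Tuple

# Assumed layout: each inner section has a matching closing tag.
# The post itself only shows the opening tags, so adjust to the real format.
SECTIONS = ("planning", "monitoring", "evaluation")

ENVELOPE = re.compile(
    r"\s*<think>(.*?)</think>\s*<output>(.*?)</output>\s*", re.DOTALL
)


def validate_response(text: str) -> Tuple[bool, str]:
    """Return (is_valid, reason) for a single model response."""
    match = ENVELOPE.fullmatch(text)
    if match is None:
        return False, "expected <think>...</think> followed by <output>...</output>"
    reasoning, answer = match.group(1), match.group(2)

    # Sections must be present, non-empty, and in plan/monitor/evaluate order.
    pos = 0
    for name in SECTIONS:
        section = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL).search(reasoning, pos)
        if section is None:
            return False, f"missing or out-of-order <{name}> section"
        if not section.group(1).strip():
            return False, f"empty <{name}> section"
        pos = section.end()

    # The output must be non-empty and free of reasoning tags.
    if not answer.strip():
        return False, "empty <output>"
    if "<think>" in answer or any(f"<{name}>" in answer for name in SECTIONS):
        return False, "<output> contains reasoning tags"
    return True, "ok"
```

Calling `validate_response` on a response with all three sections in order returns `(True, "ok")`; dropping or reordering a section returns `False` with a short reason.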

This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.
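
To make "baked in via RL" a bit more concrete, here is a rough sketch of how a format reward can feed GRPO's group-relative advantages. The reward actually used for this model is not described in the post, so `format_reward` and its tag list are hypothetical. The normalization step (reward minus group mean, divided by group standard deviation) is the general GRPO idea; the use of population standard deviation and the `eps` value are choices made here, and implementations differ on both.

```python
from statistics import mean, pstdev
from typing import List

# Tags in the order the protocol expects them (taken from the format line in the post).
TAGS = ("<think>", "<planning>", "<monitoring>", "<evaluation>", "</think>", "<output>")


def format_reward(completion: str) -> float:
    """Hypothetical reward: fraction of protocol tags found when scanning left to right."""
    pos, hits = 0, 0
    for tag in TAGS:
        idx = completion.find(tag, pos)
        if idx == -1:
            continue  # tag missing after the current position: no credit for it
        hits += 1
        pos = idx + len(tag)
    return hits / len(TAGS)


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of completions sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: three sampled completions for one prompt.
group = [
    "<think><planning>a<monitoring>b<evaluation>c</think><output>d",
    "<think><planning>a</think><output>d",
    "just an answer",
]
advantages = group_relative_advantages([format_reward(c) for c in group])
```

Completions that follow the protocol more closely than their group average get a positive advantage and are reinforced; those that follow it less closely get a negative one.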

🔗 Full README + inference code below 👇
alibidaran/Qwen_COG_Thinker_Merged

#AI #LLM #Qwen #ReasoningModels #GRPO #OpenSource

Thanks GPT-4o

Why GPT-4o?