Improve model card: Add library, usage, tags, and links

by nielsr HF Staff - opened Aug 12, 2025

←

This PR enhances the model card for Qwen2.5-3B-GRPO-MATH-1EPOCH by:

Adding library_name: transformers to the metadata, enabling the "Use in Transformers" button and improving integration.
Including specific tags like mathematical-reasoning, code-generation, reinforcement-learning, and reasoning for better discoverability and categorization.
Updating the model description to provide comprehensive context from the paper's abstract about the "Intuitor" project and "Reinforcement Learning from Internal Feedback (RLIF)".
Adding a direct link to the official GitHub repository for easy access to the code.
Providing a practical Python code snippet for sample usage with the transformers library, demonstrating how to load and perform text generation, particularly for mathematical reasoning.
Explicitly linking to the paper: Learning to Reason without External Rewards.

These updates will make the model more accessible, informative, and aligned with Hugging Face's model card best practices.

Xuandong changed pull request status to merged Aug 13, 2025

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment