Improve model card: Add library, usage, tags, and links

#1
by nielsr HF Staff - opened

This PR enhances the model card for Qwen2.5-3B-GRPO-MATH-1EPOCH by:

  • Adding library_name: transformers to the metadata, enabling the "Use in Transformers" button and improving integration.
  • Including specific tags like mathematical-reasoning, code-generation, reinforcement-learning, and reasoning for better discoverability and categorization.
  • Updating the model description to provide comprehensive context from the paper's abstract about the "Intuitor" project and "Reinforcement Learning from Internal Feedback (RLIF)".
  • Adding a direct link to the official GitHub repository for easy access to the code.
  • Providing a practical Python code snippet for sample usage with the transformers library, demonstrating how to load and perform text generation, particularly for mathematical reasoning.
  • Explicitly linking to the paper: Learning to Reason without External Rewards.

These updates will make the model more accessible, informative, and aligned with Hugging Face's model card best practices.

Xuandong changed pull request status to merged

Sign up or log in to comment