Improve model card: Add metadata, paper link, and correct GitHub link

#1 opened by nielsr (HF Staff)

This PR enhances the model card for TMLR-Group-HF/Self-Certainty-Qwen3-1.7B-Base by:

  • Adding pipeline_tag: text-generation for improved discoverability on the Hub, as it's a causal language model.
  • Adding library_name: transformers to enable the automated "How to use" widget, based on its Qwen3ForCausalLM architecture and transformers_version in config.json.
  • Including relevant tags such as qwen3, fine-tuning, reinforcement-learning, reasoning, mathematical-reasoning, and self-supervised-learning.
  • Integrating a direct link to the paper: Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
  • Correcting the GitHub repository link to https://github.com/tmlr-group/Co-rewarding, which was previously outdated.
  • Clarifying the model's description, noting its role as a baseline model trained with the Self-Certainty method within the context of the Co-rewarding paper.
  • In line with the specified guidelines, no sample usage is included, since no direct inference snippet was found in the provided GitHub README content.
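The metadata changes listed above correspond to the YAML frontmatter at the top of the model card's README.md. A minimal sketch of the resulting frontmatter (field names and values are taken from the bullets above; the exact ordering and any other existing fields are assumed):

```yaml
---
# Enables the text-generation pipeline widget and Hub filtering
pipeline_tag: text-generation
# Enables the automated "How to use" widget for Transformers models
library_name: transformers
tags:
- qwen3
- fine-tuning
- reinforcement-learning
- reasoning
- mathematical-reasoning
- self-supervised-learning
---
```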
resistz changed pull request status to merged
