Improve model card: Add metadata, paper link, and correct GitHub link
#1
by nielsr (HF Staff) · opened
This PR enhances the model card for TMLR-Group-HF/Self-Certainty-Qwen3-1.7B-Base by:
- Adding `pipeline_tag: text-generation` for improved discoverability on the Hub, as it's a causal language model.
- Adding `library_name: transformers` to enable the automated "How to use" widget, based on its `Qwen3ForCausalLM` architecture and the `transformers_version` in `config.json`.
- Including relevant `tags` such as `qwen3`, `fine-tuning`, `reinforcement-learning`, `reasoning`, `mathematical-reasoning`, and `self-supervised-learning`.
- Integrating a direct link to the paper: Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
- Correcting the GitHub repository link to https://github.com/tmlr-group/Co-rewarding, which was previously outdated.
- Clarifying the model's description, noting that it serves as a baseline trained with the Self-Certainty method in the context of the Co-rewarding paper.
- No sample usage is included, as no direct inference snippet was found in the provided GitHub README content, adhering to the specified guidelines.
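The metadata changes listed above can be sketched as YAML front matter at the top of the model card's README. This is a non-authoritative sketch of the fields named in this PR; the exact ordering and any additional fields in the merged card are assumptions.

```yaml
---
pipeline_tag: text-generation
library_name: transformers
tags:
  - qwen3
  - fine-tuning
  - reinforcement-learning
  - reasoning
  - mathematical-reasoning
  - self-supervised-learning
---
```

With `pipeline_tag` and `library_name` set, the Hub can surface the model under text-generation filters and render the automated "How to use" widget.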
resistz changed pull request status to merged