Improve model card: Update pipeline tag, add license, and enhance content with usage and results

#1
by nielsr HF Staff - opened

This PR significantly improves the model card for the visual prompt checkpoints by:

  • Updating the pipeline_tag from visual-question-answering to image-text-to-text to accurately reflect the model's functionality (image + text query -> text output), improving discoverability on the Hugging Face Hub.
  • Adding the license tag (apache-2.0) to the metadata for clarity. No explicit license was found in the provided source material, so this choice rests on common practice and a majority vote from colleagues.
  • Enriching the model card content with:
    • Informative badges from the GitHub README.
    • The full paper abstract, offering a comprehensive overview.
    • A "Key Features" section to highlight the model's capabilities.
    • The "Method Overview" image for better visual understanding.
    • Detailed "Usage" instructions covering environment setup, dataset preparation, checkpoint download, and running inference with the provided scripts (e.g., tester.py). In keeping with the "do not make up code yourself" disclaimer, all commands are quoted directly from the GitHub README.
    • Extensive "Results" tables, providing quantitative performance metrics.
    • The "Citation" and "Acknowledgments" sections for proper attribution.
  • Correcting a placeholder GitHub repository link in the "Available Checkpoints" section.
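
For readers unfamiliar with how these tags are stored: the pipeline_tag and license values live in the YAML front matter at the top of the model card's README.md. The sketch below illustrates the metadata change described in the first two bullets using only the Python standard library. The helper name set_front_matter, the sample card text, and its "language: en" line are hypothetical and not taken from this repository, and the helper only handles flat "key: value" entries. In practice the huggingface_hub library's card utilities would likely be a better fit.

```python
def set_front_matter(readme: str, updates: dict) -> str:
    """Set top-level `key: value` pairs in a README's YAML front matter.

    Hypothetical helper for illustration only. It handles flat scalar keys;
    nested YAML would need a real YAML parser.
    """
    lines = readme.split("\n")
    if lines and lines[0].strip() == "---":
        # Locate the closing delimiter of the existing front matter.
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
        header, body = lines[1:end], lines[end + 1:]
    else:
        # No front matter yet: the whole file is body text.
        header, body = [], lines

    remaining = dict(updates)
    new_header = []
    for line in header:
        key = line.split(":", 1)[0].strip()
        # Only rewrite top-level keys, not indented or list-item lines.
        if key in remaining and not line.startswith((" ", "-")):
            new_header.append(f"{key}: {remaining.pop(key)}")
        else:
            new_header.append(line)
    # Keys that were not already present are appended at the end.
    new_header += [f"{k}: {v}" for k, v in remaining.items()]
    return "\n".join(["---", *new_header, "---", *body])


# Assumed "before" state of the card; the real file may differ.
old_card = """---
language: en
pipeline_tag: visual-question-answering
---
# Model card (placeholder body)
"""

new_card = set_front_matter(
    old_card,
    {"pipeline_tag": "image-text-to-text", "license": "apache-2.0"},
)
print(new_card)
```

The existing pipeline_tag line is rewritten in place, while license is appended because it was not present before.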

These enhancements make the model card more informative, user-friendly, and compliant with Hugging Face Hub best practices.

Thank you for your contribution!

yahya007 changed pull request status to merged
