Improve model card for llava-extract-qwen3-1.7B: Add metadata, links, and usage

#1 by nielsr HF Staff - opened

This PR improves the model card for markendo/llava-extract-qwen3-1.7B by adding Hub metadata and comprehensive usage instructions.

Key improvements include:

  • Adding pipeline_tag: image-text-to-text to accurately reflect its multimodal capabilities (image + text input, text output) and improve discoverability.
  • Specifying library_name: transformers based on config.json and usage patterns, enabling the automated "how to use" widget on the Hub.
  • Including descriptive tags (e.g., multimodal, vision-language-model, reasoning, small-language-model) and base_model information for better context and searchability.
  • Linking directly to the associated paper, project page, and GitHub repository for easy access to more information and code.
  • Adding a comprehensive "Usage" section with setup instructions and code snippets directly from the GitHub README, demonstrating how to use the Extract+Think framework for visual extraction and reasoning.

These additions will make the model more discoverable and user-friendly on the Hugging Face Hub.
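As a rough illustration of the kind of usage snippet the "Usage" section adds, here is a minimal sketch using the transformers `image-text-to-text` pipeline. The prompt structure and image URL are placeholders, not the canonical Extract+Think invocation; consult the project's GitHub README for the exact workflow:

```python
def build_messages(image_url: str, question: str) -> list:
    """Assemble a chat-style multimodal turn (one image plus one text query)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


if __name__ == "__main__":
    # Model download happens only when run directly.
    from transformers import pipeline

    pipe = pipeline(
        "image-text-to-text",
        model="markendo/llava-extract-qwen3-1.7B",
    )
    messages = build_messages(
        "https://example.com/chart.png",  # placeholder image
        "What information can be extracted from this image?",
    )
    out = pipe(text=messages, max_new_tokens=256)
    print(out[0]["generated_text"])
```

The `build_messages` helper is a hypothetical convenience wrapper; the underlying chat-message format follows the transformers multimodal chat-template convention.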

markendo changed pull request status to merged
