nvidia/llama-nemotron-rerank-vl-1b-v2 · Add vLLM usage examples

by nvidia-oliver-holworthy - opened 15 days ago

base: refs/heads/main ← from: refs/pr/8

+118 -0

nvidia-oliver-holworthy (NVIDIA org) 15 days ago • edited 15 days ago

Add vLLM usage instructions for online serving and offline Python API.

The model requires a score template, passed via --chat-template, to format each query-document pair as "question:... passage:...". Without it, vLLM does not apply the prompt template and the scores it produces are incorrect.
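
To make the pair layout concrete, here is a minimal sketch in plain Python of what the template is doing. It is not the template file from this PR and does not call vLLM: format_pair and format_pairs are hypothetical helpers, and the single space between the two fields is an assumption, since the exact separators are defined by the model's own score template.

```python
from typing import List


def format_pair(query: str, document: str) -> str:
    """Hypothetical helper: render one query-document pair in the
    question/passage layout the score template is described as producing.

    Assumption: a single space separates the two fields. The model's real
    score template defines the exact text and whitespace.
    """
    return f"question:{query} passage:{document}"


def format_pairs(query: str, documents: List[str]) -> List[str]:
    """Render one prompt per candidate document for a single query."""
    return [format_pair(query, doc) for doc in documents]


# Illustrative inputs only; they are not taken from the PR.
prompts = format_pairs(
    "What is vLLM?",
    ["vLLM is an inference engine.", "Bananas are yellow."],
)
for prompt in prompts:
    print(prompt)
```

If the template is skipped, the model would see the raw query and document text without these field markers, which is consistent with the incorrect scores described above.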

Includes examples for text-only and multimodal (image + text) document reranking.

Add vLLM usage examples with required score template (ab3c9e12)

nvidia-oliver-holworthy changed pull request status to open 15 days ago

BoLiu changed pull request status to merged 15 days ago
