Add vLLM usage examples
#8
by nvidia-oliver-holworthy - opened
Add vLLM usage instructions for online serving and the offline Python API.
The model requires a score template (passed via `--chat-template`) to correctly format query-document pairs as `question:... passage:...`. Without it, vLLM does not apply the prompt template and produces incorrect scores.
Includes examples for text-only and multimodal (image + text) document reranking.
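For readers skimming this thread, here is a minimal client-side sketch of the online-serving path for text-only documents. It assumes a vLLM server is already running with the score template applied and that it exposes a rerank endpoint at `/v1/rerank` on `localhost:8000`. The model name, the helper names, and the response field names (`results`, `index`, `relevance_score`) are assumptions for illustration and are not taken from this PR; check them against the examples added to the model card.

```python
import json
import urllib.request

# Hypothetical values for illustration; neither is stated in this PR.
BASE_URL = "http://localhost:8000"
MODEL_NAME = "your-reranker-model"


def build_rerank_payload(query, documents, model=MODEL_NAME):
    """Build the JSON body for a rerank request (assumed schema)."""
    return {"model": model, "query": query, "documents": list(documents)}


def rank_results(results):
    """Return (index, score) pairs sorted by descending relevance score."""
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(r["index"], r["relevance_score"]) for r in ordered]


def rerank(query, documents, base_url=BASE_URL, model=MODEL_NAME):
    """POST the query and documents to the server and rank the results."""
    body = json.dumps(build_rerank_payload(query, documents, model)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/rerank",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        results = json.load(response)["results"]
    return rank_results(results)
```

Calling `rerank("your query", ["first passage", "second passage"])` against a running server would return the document indices ordered from most to least relevant. The scores are only meaningful if the server was launched with the score template described above.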
nvidia-oliver-holworthy changed pull request status to open
BoLiu changed pull request status to merged