vLLM Factory — Encoder Serving - a doubledsbv Collection

updated 16 days ago

Production inference for encoder models with vLLM plugins: ColBERT, ColPali, GLiNER, GLiNER2, and others. Repo: github.com/ddickmann/vllm-factory

fastino/gliner2-large-v1

Updated about 20 hours ago • 157k • 65
knowledgator/gliner-x-large

Token Classification • Updated Feb 19 • 478 • 43
Note: Supported in vLLM Factory for production NER serving. Benchmark highlight: MT5/GLiNER family reached up to 11.7x throughput on RTX A5000 with parity preserved. Repo: https://github.com/ddickmann/vllm-factory
LiquidAI/LFM2-ColBERT-350M

Sentence Similarity • 0.4B • Updated 19 days ago • 63.5k • 128
Note: Supported in vLLM Factory for ColBERT-style retrieval serving. Benchmark highlight: 8.6x throughput on RTX A5000 with parity preserved. Repo: https://github.com/ddickmann/vllm-factory
VAGOsolutions/SauerkrautLM-Multi-Reason-ModernColBERT

Sentence Similarity • 0.1B • Updated Aug 3, 2025 • 812 • 11
Note: Supported in vLLM Factory for retrieval serving. Benchmark highlight: ModernColBERT reached 3.3x throughput on RTX A5000 with parity preserved. Repo: https://github.com/ddickmann/vllm-factory
VAGOsolutions/SauerkrautLM-ColLFM2-450M-v0.1

Image-Text-to-Text • Updated Dec 14, 2025 • 217 • 9
knowledgator/gliner-linker-rerank-v1.0

Token Classification • Updated Feb 24 • 88 • 6
VAGOsolutions/SauerkrautLM-ColQwen3-1.7b-Turbo-v0.1

Image-Text-to-Text • Updated Dec 14, 2025 • 95 • 3
VAGOsolutions/SauerkrautLM-GLiNER

Token Classification • Updated Nov 19, 2025 • 549 • 14
urchade/gliner_small-v2.1

Token Classification • Updated Apr 10, 2024 • 6.72k • 9
nvidia/nemotron-colembed-vl-4b-v2

Visual Document Retrieval • 5B • Updated Feb 21 • 41k • 36
unsloth/embeddinggemma-300m

Sentence Similarity • 0.3B • Updated Jan 22 • 13.4k • 9
knowledgator/gliner-linker-large-v1.0

Token Classification • Updated Feb 24 • 99 • 8
