Integrate with Sentence Transformers v5.4
#24
by tomaarsen HF Staff - opened
Hello!
Pull Request overview
- Integrate this model as a Sentence Transformers `CrossEncoder`
Details
This PR adds the configuration files and code changes needed to load this model directly as a CrossEncoder via Sentence Transformers, with support for all four modality combinations: text-to-text, text-to-image, image-to-text, and image-to-image reranking.
The model's JinaVLForRanking architecture has several quirks that required handling:
- The model requires a special score token (ID 100) appended after tokenization. The `forward()` method now auto-appends it if not already present, so both `compute_score` (which appended it manually) and Sentence Transformers (which doesn't) work correctly.
- The model expects `**Document**:\n{doc}\n**Query**:\n{query}` formatting rather than the standard Qwen2-VL chat template. A new `chat_template.jinja` replaces the original template to produce this format from Sentence Transformers' structured query/document messages.
- A custom `JinaRerankerTransformer` module swaps image-image pairs before preprocessing so the processor extracts images in doc-first order, matching the chat template's rendering order.
- Moved the sigmoid+bias score normalization (`sigmoid(logit - 2.65)`) from `compute_score` into `forward()`, so Sentence Transformers gets normalized [0, 1] scores directly.
- Fixed `config.hidden_size` access (now via `text_config`) and `tie_word_embeddings` (disabled before init to avoid `nn.Identity` `lm_head` issues), and extended `mm_token_type_ids` alongside `input_ids`/`attention_mask` when appending the score token.
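The score-token handling described above can be sketched in plain Python. This is only an illustration, not the code in `modeling.py`: the helper name `append_score_token` is hypothetical, the inputs are plain lists rather than tensors, and the values used for the appended position (attention `1`, token type `0`) are assumptions. Only the token ID 100 comes from the description.

```python
SCORE_TOKEN_ID = 100  # score token ID stated in the PR description


def append_score_token(input_ids, attention_mask, mm_token_type_ids):
    """Append the score token to each sequence unless it is already last.

    Assumptions (not taken from the PR): inputs are lists of lists, sequences
    are not right-padded, and the appended position gets attention 1 and
    token type 0 (text).
    """
    out_ids, out_mask, out_types = [], [], []
    for ids, mask, types in zip(input_ids, attention_mask, mm_token_type_ids):
        ids, mask, types = list(ids), list(mask), list(types)
        if not ids or ids[-1] != SCORE_TOKEN_ID:
            # Extend all three sequences together so their lengths stay aligned.
            ids.append(SCORE_TOKEN_ID)
            mask.append(1)
            types.append(0)
        out_ids.append(ids)
        out_mask.append(mask)
        out_types.append(types)
    return out_ids, out_mask, out_types
```

Making the append idempotent is what would let a caller that already added the token and a caller that did not share the same `forward()` path.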
Added files:
- `modules.json`: pipeline with a single custom `JinaRerankerTransformer` module
- `sentence_bert_config.json`: `feature-extraction` task (to load via `AutoModel`), structured message format, custom `method_output_name: null` to capture raw forward output as scores
- `config_sentence_transformers.json`: `CrossEncoder` model type with `Identity` activation (model already returns normalized scores)
- `custom_transformer.py`: `JinaRerankerTransformer` that swaps image-image pair order in `preprocess` to fix image extraction ordering
- `chat_template.jinja`: reranking-specific template with query/document roles and vision token support
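For text-only pairs, the prompt layout that the new chat template is described as producing can be approximated as follows. This is a rough sketch: `render_rerank_prompt` is a hypothetical helper, and the real `chat_template.jinja` also handles vision tokens and may add surrounding special tokens that are not reproduced here.

```python
def render_rerank_prompt(query: str, document: str) -> str:
    """Render a text-only pair in document-first order.

    Follows the `**Document**:\\n{doc}\\n**Query**:\\n{query}` layout from the
    PR description; everything else about the real template is omitted.
    """
    return f"**Document**:\n{document}\n**Query**:\n{query}"


if __name__ == "__main__":
    print(render_rerank_prompt("slm markdown", "We present ReaderLM-v2..."))
```

Note that the document comes before the query, which is the ordering the image-image swap in `custom_transformer.py` is meant to match.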
Changed files:
- `modeling.py`: auto-append score token in `forward()`, move sigmoid normalization from `compute_score` to `forward()`, fix `hidden_size`/`tie_word_embeddings` compatibility, extend `mm_token_type_ids` on score token append, add explicit vision params to `forward()` signature
- `preprocessor_config.json`: set `max_pixels` to 602112 to match `compute_score`'s processor settings
- `tokenizer_config.json`: point `chat_template` to new `chat_template.jinja`
- `README.md`: added `sentence-transformers` tag, added usage section showing all four modality combinations (text-to-text, text-to-image, image-to-text, image-to-image)
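The normalization that moved into `forward()` is `sigmoid(logit - 2.65)`. A scalar sketch of that formula is below; the names `SCORE_BIAS` and `normalize_logit` are illustrative, and the actual model applies the same idea to tensors.

```python
import math

SCORE_BIAS = 2.65  # bias from the PR description: sigmoid(logit - 2.65)


def normalize_logit(logit: float) -> float:
    """Map a raw ranking logit into (0, 1) with a bias-shifted sigmoid."""
    x = logit - SCORE_BIAS
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)
```

Because the model output is already normalized this way, the `Identity` activation in `config_sentence_transformers.json` avoids applying a second sigmoid on top.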
Once the Sentence Transformers v5.4 release is out, the model can be used immediately like so:
from sentence_transformers import CrossEncoder

model = CrossEncoder("jinaai/jina-reranker-m0", trust_remote_code=True, revision="refs/pr/24")

query = "slm markdown"
documents = [
    "We present ReaderLM-v2, a compact 1.5 billion parameter language model...",
    "数据提取么?为什么不用正则啊,你用正则不就全解决了么?",
    "During the California Gold Rush, some merchants made more money selling supplies to miners than the miners made finding gold.",
]

# Text-only reranking
rankings = model.rank(query, documents)
print(rankings)
# [{'corpus_id': 0, 'score': 0.6875}, {'corpus_id': 2, 'score': 0.5938}, {'corpus_id': 1, 'score': 0.4434}]

# Text-to-image reranking
image_docs = [
    "https://raw.githubusercontent.com/jina-ai/multimodal-reranker-test/main/paper-11.png",
    "https://raw.githubusercontent.com/jina-ai/multimodal-reranker-test/main/handelsblatt-preview.png",
]
scores = model.predict([(query, doc) for doc in image_docs])
print(scores)
# [0.7813 0.4980]
After merging, the `revision` argument can be dropped.
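To relate the two calls in the snippet above: `rank` returns entries sorted by score, while `predict` returns one score per pair in input order. A minimal sketch of that relationship, using a hypothetical helper `rank_from_scores` (not part of Sentence Transformers) and the text-only scores listed above rearranged into document order:

```python
def rank_from_scores(scores):
    """Turn per-document scores into rank-style entries, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


if __name__ == "__main__":
    # Scores for documents 0, 1, 2, taken from the text-only output above.
    print(rank_from_scores([0.6875, 0.4434, 0.5938]))
```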
Note that none of the old behaviour is affected or changed; feel free to double-check this. This PR only adds an additional way to run the model in a familiar and common format.
- Tom Aarsen
tomaarsen changed pull request status to open
numb3r3 changed pull request status to merged