# cross-encoder/ms-marco-TinyBERT-L-2-v2 - LiteRT Optimized

This is a LiteRT (formerly TensorFlow Lite) export of `cross-encoder/ms-marco-TinyBERT-L-2-v2`, optimized for on-device inference on mobile and edge platforms (Android, iOS, embedded).

## Model Details

| Attribute | Value |
|---|---|
| Task | Ultra-Fast Reranking |
| Format | `.tflite` (Float32) |
| File Size | 16.8 MB |
| Input Length | 512 tokens |
| Output Dim | 1 |
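The fixed 512-token input length means every query-document pair must be padded or truncated to that exact shape before it reaches the interpreter; the tokenizer's `padding="max_length"` option in the Usage section handles this automatically. As a minimal sketch of what that padding does (`pad_ids` is an illustrative helper, not part of this export, and `0` is assumed to be BERT's `[PAD]` id):

```python
import numpy as np

MAX_LEN = 512

def pad_ids(token_ids, pad_id=0, max_len=MAX_LEN):
    """Right-pad (or truncate) a token-id list to the model's fixed input length.

    Returns (input_ids, attention_mask) as int64 arrays of shape (1, max_len),
    mirroring what tokenizer(..., padding="max_length", truncation=True) produces.
    """
    ids = list(token_ids)[:max_len]
    # 1 for real tokens, 0 for padding positions.
    attention_mask = [1] * len(ids) + [0] * (max_len - len(ids))
    ids = ids + [pad_id] * (max_len - len(ids))
    return np.array([ids], dtype=np.int64), np.array([attention_mask], dtype=np.int64)
```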

## Usage

```python
import numpy as np
from ai_edge_litert.interpreter import Interpreter
from transformers import AutoTokenizer

model_path = "cross-encoder_ms-marco-TinyBERT-L-2-v2.tflite"
interpreter = Interpreter(model_path=model_path)
interpreter.allocate_tensors()

tokenizer = AutoTokenizer.from_pretrained("cross-encoder/ms-marco-TinyBERT-L-2-v2")

def compute_score(query, doc):
    # Tokenize the pair: [CLS] query [SEP] doc [SEP]
    inputs = tokenizer(query, doc, max_length=512, padding="max_length",
                       truncation=True, return_tensors="np")

    # Match inputs by tensor name rather than by position, since the input
    # order of a converted model is not guaranteed.
    for detail in interpreter.get_input_details():
        if "input_ids" in detail["name"]:
            interpreter.set_tensor(detail["index"], inputs["input_ids"].astype(np.int64))
        elif "attention_mask" in detail["name"]:
            interpreter.set_tensor(detail["index"], inputs["attention_mask"].astype(np.int64))

    interpreter.invoke()

    # The output is a single relevance score (raw logit); higher means more relevant.
    output_details = interpreter.get_output_details()
    return float(interpreter.get_tensor(output_details[0]["index"])[0][0])

score = compute_score("What is python?", "Python is a programming language.")
print(f"Relevance Score: {score}")
```
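Since the model emits one raw logit per query-document pair, the typical pattern is to score every candidate and sort. The helpers below are an illustrative sketch (`rerank` and `logit_to_probability` are names introduced here, not part of the export); you would pass the `compute_score` function from above as `score_fn`:

```python
import math

def logit_to_probability(logit):
    # Sigmoid maps the raw cross-encoder logit to a 0-1 relevance probability.
    return 1.0 / (1.0 + math.exp(-logit))

def rerank(query, docs, score_fn):
    """Score each (query, doc) pair with score_fn and sort by descending relevance."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

Note that cross-encoder logits are not comparable across different queries, so apply the sigmoid only if you need a bounded score; the raw logit is sufficient for ranking within one query.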

Converted by Bombek1 using litert-torch
