deepset-mxbai-embed-de-large-v1-onnx-fp16

ONNX FP16 export of mixedbread-ai/deepset-mxbai-embed-de-large-v1.

Usage

import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "thomasht86/deepset-mxbai-embed-de-large-v1-onnx-fp16"

# Load the tokenizer and download the ONNX weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model_path = hf_hub_download(repo_id=repo_id, filename="model.onnx")

session = ort.InferenceSession(model_path, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])

# Tokenize the input and run the model; the first output is used as the embedding.
inputs = dict(tokenizer("Your text here", return_tensors="np"))
outputs = session.run(None, inputs)
embeddings = outputs[0]
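
For an ONNX-exported encoder, outputs[0] is usually the token-level hidden states with shape (batch, seq_len, hidden), so a pooling step is needed to get one vector per text. The snippet below is a minimal sketch that assumes this layout and uses mean pooling over the attention mask followed by L2 normalization; check the base model card for the pooling mode the model was actually trained with (CLS vs. mean).

import numpy as np

def mean_pool(last_hidden_state: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # Zero out padding positions, average the remaining token vectors, then L2-normalize.
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)   # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(axis=1)                    # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), a_min=1e-9, a_max=None)         # avoid division by zero
    pooled = summed / counts
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

sentence_embeddings = mean_pool(outputs[0], inputs["attention_mask"])

After normalization, cosine similarity between two texts reduces to a dot product of their embedding vectors.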