Clean up some documentation
README.md CHANGED

@@ -57,8 +57,7 @@ extra_gated_fields:
 Deployable INT8-quantized version of [`sentence-transformers/all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2),
 optimized with [embedl-deploy](https://github.com/embedl/embedl-deploy)
 for low-latency NVIDIA TensorRT inference on edge GPUs. Produces
-the same L2-normalised sentence embedding as the upstream encoder
-in ~1/n the runtime.
+the same L2-normalised sentence embedding as the upstream encoder.
 
 ## Upstream Model
 
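For readers unfamiliar with the "L2-normalised sentence embedding" the README refers to, here is a minimal sketch of that property in plain NumPy. The function name `l2_normalize` is illustrative, not part of embedl-deploy or sentence-transformers; it only shows what it means for the quantized and upstream encoders to agree on unit-norm output vectors.

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a raw embedding vector to unit L2 norm (zero vectors pass through)."""
    norm = np.linalg.norm(v)
    return v if norm == 0.0 else v / norm

# A toy "raw" embedding; after normalisation its L2 norm is 1,
# so cosine similarity between two such embeddings reduces to a dot product.
raw = np.array([3.0, 4.0])
unit = l2_normalize(raw)
```

Because both the quantized and upstream encoders emit unit-norm vectors, downstream similarity search can compare them with a plain dot product.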