Enable Baseten-Embeddings-Inference

#11

Enables improved batching and TensorRT-LLM compilation with the following config:

model_metadata:
  example_model_input:
    encoding_format: float
    input: text string
    model: model
model_name: BEI-Bert-pplx-embed
resources:
  accelerator: H100
  cpu: '1'
  memory: 10Gi
  use_gpu: true
trt_llm:
  build:
    base_model: encoder_bert
    checkpoint_repository:
      repo: perplexity-ai/pplx-embed-v1-0.6b
      revision: refs/pr/11
      source: HF
    max_num_tokens: 32768
  runtime:
    webserver_default_route: /v1/embeddings
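
Once deployed, the runtime serves an OpenAI-compatible embeddings endpoint at the `webserver_default_route` above. A minimal client sketch follows; the deployment URL, API key, and the `Api-Key` auth header are placeholders/assumptions, not part of this PR:

```python
import json
import urllib.request


def build_embedding_request(texts, model="model"):
    """Build an OpenAI-compatible /v1/embeddings payload.

    Mirrors the example_model_input fields from the config above:
    encoding_format, input, and model.
    """
    return {
        "encoding_format": "float",
        "input": texts,
        "model": model,
    }


def embed(base_url, api_key, texts):
    """POST the payload to the deployment's /v1/embeddings route."""
    payload = build_embedding_request(texts)
    req = urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Auth header format is an assumption; check your deployment's docs.
            "Authorization": f"Api-Key {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Replace with your actual deployment URL and key before calling embed().
    print(build_embedding_request(["text string"]))
```

The request shape matches the `example_model_input` block in the config, so the same payload can be used for smoke-testing the deployment.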
