Upload TRT model_l40s_fp16.plan for Nvidia L40S (batch_size=1024)

by sarah-cisco - opened 14 days ago

base: refs/heads/main ← from: refs/pr/7

Files changed: +14 -14

sarah-cisco (Robust Intelligence org), 14 days ago

---- Resolved TRT Profile ----
MIN_BATCH=1
OPT_BATCH=1024
MAX_BATCH=1024
MIN_SEQ_LEN=1
OPT_SEQ_LEN=512
MAX_SEQ_LEN=512
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
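
For readers who want to sanity-check these values, here is a minimal sketch that parses the KEY=VALUE lines above and converts the workspace size to GiB. The `parse_profile` helper is hypothetical and not part of this PR or of TensorRT; it only assumes the log format shown in this comment.

```python
# Hypothetical helper (not part of this PR): parse the KEY=VALUE lines
# of the "Resolved TRT Profile" block copied from the comment above.
PROFILE_TEXT = """\
MIN_BATCH=1
OPT_BATCH=1024
MAX_BATCH=1024
MIN_SEQ_LEN=1
OPT_SEQ_LEN=512
MAX_SEQ_LEN=512
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
"""


def parse_profile(text):
    """Return a dict of profile settings; numeric values become ints."""
    profile = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and section headers
        key, value = line.split("=", 1)
        profile[key] = int(value) if value.isdigit() else value
    return profile


profile = parse_profile(PROFILE_TEXT)

# 1 GiB = 2**30 bytes
workspace_gib = profile["WORKSPACE_SIZE"] / 2**30
print(f"workspace: {workspace_gib:.1f} GiB")
```

With the values in this log, the workspace limit works out to exactly 23 GiB.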

==== TensorRT Engine ====
Name: Unnamed Network 0 | Explicit Batch Engine

---- 2 Engine Input(s) ----
{input_ids [dtype=int64, shape=(-1, -1)],
attention_mask [dtype=int64, shape=(-1, -1)]}

---- 1 Engine Output(s) ----
{logits [dtype=float32, shape=(-1, 12)]}

---- Memory ----
Device Memory: 22011999744 bytes

---- 1 Profile(s) (3 Tensor(s) Each) ----

Profile: 0
Tensor: input_ids (Input), Index: 0 | Shapes: min=(1, 1), opt=(1024, 512), max=(1024, 512)
Tensor: attention_mask (Input), Index: 1 | Shapes: min=(1, 1), opt=(1024, 512), max=(1024, 512)
Tensor: logits (Output), Index: 2 | Shape: (-1, 12)
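
Since the engine has a single profile, inputs outside the min/max shapes listed above would presumably be rejected at runtime. A minimal sketch of a pre-flight check follows; `fits_profile` is a hypothetical helper, and the bounds are copied from Profile 0 rather than read from the engine.

```python
# Bounds copied from Profile 0 above: (batch, sequence length).
MIN_SHAPE = (1, 1)
MAX_SHAPE = (1024, 512)


def fits_profile(shape, min_shape=MIN_SHAPE, max_shape=MAX_SHAPE):
    """Return True if every dimension lies within the profile bounds."""
    if len(shape) != len(min_shape):
        return False  # wrong rank, e.g. a missing batch dimension
    return all(lo <= dim <= hi
               for dim, lo, hi in zip(shape, min_shape, max_shape))


print(fits_profile((32, 128)))    # inside the profile
print(fits_profile((2048, 512)))  # batch exceeds the maximum of 1024
```

Both `input_ids` and `attention_mask` share the same bounds, so one check covers either tensor.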

---- 453 Layer(s) ----

Commit 6be4390d: Upload TRT model_l40s_fp16.plan for Nvidia L40S (batch_size=1024)

Cannot merge

This branch has merge conflicts in the following files:

model_l40s_fp16.plan
trt_engine_layer_summary_l40s_fp16.txt
