Upload TRT FP16 engine and layer summary for Nvidia RTX PRO 6000

by sarah-cisco - opened 25 days ago

base: refs/heads/main ← from: refs/pr/4


+513 -0

sarah-cisco

Robust Intelligence org 25 days ago

---- Resolved TRT Profile ----
MIN_BATCH=1
OPT_BATCH=3
MAX_BATCH=12
MIN_SEQ_LEN=1
OPT_SEQ_LEN=1024
MAX_SEQ_LEN=1024
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
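
For anyone consuming this engine, a minimal sketch of how the resolved profile above could be read and used to check whether a request shape fits before calling the engine. The helper names (`parse_profile`, `shape_in_profile`) are hypothetical and not part of this repository; only the KEY=VALUE values are taken from the log.

```python
# Values copied verbatim from the "Resolved TRT Profile" block above.
PROFILE_TEXT = """\
MIN_BATCH=1
OPT_BATCH=3
MAX_BATCH=12
MIN_SEQ_LEN=1
OPT_SEQ_LEN=1024
MAX_SEQ_LEN=1024
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
"""


def parse_profile(text):
    """Parse KEY=VALUE lines into a dict; purely numeric values become int."""
    profile = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        profile[key] = int(value) if value.isdigit() else value
    return profile


def shape_in_profile(profile, batch, seq_len):
    """Return True if (batch, seq_len) lies within the profile's min/max bounds."""
    return (
        profile["MIN_BATCH"] <= batch <= profile["MAX_BATCH"]
        and profile["MIN_SEQ_LEN"] <= seq_len <= profile["MAX_SEQ_LEN"]
    )


profile = parse_profile(PROFILE_TEXT)
workspace_gib = profile["WORKSPACE_SIZE"] / 2**30  # bytes to GiB

print(shape_in_profile(profile, 12, 1024))  # largest shape the profile allows
print(shape_in_profile(profile, 13, 1024))  # batch exceeds MAX_BATCH
print(workspace_gib)
```

Inputs outside these bounds would need to be split into smaller batches or truncated to MAX_SEQ_LEN before inference, since the engine was built with a single profile.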

==== TensorRT Engine ====
Name: Unnamed Network 0 | Explicit Batch Engine

---- 2 Engine Input(s) ----
{input_ids [dtype=int64, shape=(-1, -1)],
attention_mask [dtype=int64, shape=(-1, -1)]}

---- 1 Engine Output(s) ----
{logits [dtype=float32, shape=(-1, 12)]}

---- Memory ----
Device Memory: 705179136 bytes

---- 1 Profile(s) (3 Tensor(s) Each) ----

Profile: 0
Tensor: input_ids (Input), Index: 0 | Shapes: min=(1, 1), opt=(3, 1024), max=(12, 1024)
Tensor: attention_mask (Input), Index: 1 | Shapes: min=(1, 1), opt=(3, 1024), max=(12, 1024)
Tensor: logits (Output), Index: 2 | Shape: (-1, 12)

---- 478 Layer(s) ----
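
The summary above lists the binding dtypes and the profile's maximum shapes, which is enough to estimate how large the input and output buffers need to be. A rough sketch follows, assuming the dynamic first dimension of `logits` is the batch size and that these I/O buffers are allocated by the caller separately from the reported Device Memory (about 672.5 MiB). The function names are hypothetical.

```python
import math

# Element sizes for the dtypes reported in the engine summary.
DTYPE_BYTES = {"int64": 8, "float32": 4}


def tensor_bytes(shape, dtype):
    """Bytes needed for a dense tensor of the given shape and dtype."""
    return math.prod(shape) * DTYPE_BYTES[dtype]


def io_buffer_bytes(batch, seq_len, num_labels=12):
    """Per-binding buffer sizes for one inference call.

    num_labels=12 comes from the logits shape (-1, 12); treating the -1
    as the batch dimension is an assumption.
    """
    return {
        "input_ids": tensor_bytes((batch, seq_len), "int64"),
        "attention_mask": tensor_bytes((batch, seq_len), "int64"),
        "logits": tensor_bytes((batch, num_labels), "float32"),
    }


# Size the buffers once for the profile maximum (12, 1024) and reuse them.
max_buffers = io_buffer_bytes(batch=12, seq_len=1024)
total_bytes = sum(max_buffers.values())

for name, size in max_buffers.items():
    print(f"{name}: {size} bytes")
print(f"total: {total_bytes} bytes")
```

Allocating for the maximum shape once avoids reallocating whenever the batch size or sequence length changes between calls.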

Upload TRT FP16 engine and layer summary for Nvidia RTX PRO 6000 18cddf59

sarah-cisco changed pull request status to merged 25 days ago
