Upload TRT FP16 engine and layer summary for Nvidia RTX PRO 6000
#4
by sarah-cisco - opened
---- Resolved TRT Profile ----
MIN_BATCH=1
OPT_BATCH=3
MAX_BATCH=12
MIN_SEQ_LEN=1
OPT_SEQ_LEN=1024
MAX_SEQ_LEN=1024
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
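For readers who want to sanity-check the resolved profile before reusing it, the following is a minimal sketch that parses the `KEY=VALUE` lines above and verifies that each dimension satisfies min <= opt <= max. The helper names (`parse_profile`, `check_ordering`) are hypothetical and not part of the build script used for this upload; the values are copied from the log.

```python
# Settings copied verbatim from the "Resolved TRT Profile" section of the log.
PROFILE_TEXT = """\
MIN_BATCH=1
OPT_BATCH=3
MAX_BATCH=12
MIN_SEQ_LEN=1
OPT_SEQ_LEN=1024
MAX_SEQ_LEN=1024
WORKSPACE_SIZE=24696061952
BUILDER_OPTIMIZATION_LEVEL=3
PRECISION=fp16
"""


def parse_profile(text):
    """Parse KEY=VALUE lines into a dict, converting numeric values to int."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and section headers
        key, value = line.split("=", 1)
        settings[key] = int(value) if value.isdigit() else value
    return settings


def check_ordering(settings, dim):
    """Return True if MIN_<dim> <= OPT_<dim> <= MAX_<dim>."""
    lo, opt, hi = (settings[f"{prefix}_{dim}"] for prefix in ("MIN", "OPT", "MAX"))
    return lo <= opt <= hi


profile = parse_profile(PROFILE_TEXT)
workspace_gib = profile["WORKSPACE_SIZE"] / 2**30  # bytes to GiB

for dim in ("BATCH", "SEQ_LEN"):
    print(dim, "ordering ok:", check_ordering(profile, dim))
print(f"workspace: {workspace_gib:.1f} GiB")
```

With these values, `WORKSPACE_SIZE` works out to exactly 23 GiB, and the sequence-length profile has `OPT_SEQ_LEN` equal to `MAX_SEQ_LEN`.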
==== TensorRT Engine ====
Name: Unnamed Network 0 | Explicit Batch Engine
---- 2 Engine Input(s) ----
{input_ids [dtype=int64, shape=(-1, -1)],
attention_mask [dtype=int64, shape=(-1, -1)]}
---- 1 Engine Output(s) ----
{logits [dtype=float32, shape=(-1, 12)]}
---- Memory ----
Device Memory: 705179136 bytes
---- 1 Profile(s) (3 Tensor(s) Each) ----
- Profile: 0
Tensor: input_ids (Input), Index: 0 | Shapes: min=(1, 1), opt=(3, 1024), max=(12, 1024)
Tensor: attention_mask (Input), Index: 1 | Shapes: min=(1, 1), opt=(3, 1024), max=(12, 1024)
Tensor: logits (Output), Index: 2 | Shape: (-1, 12)
---- 478 Layer(s) ----
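Because both inputs are dynamic, a request only runs on this engine if its shape falls inside the profile's min/max bounds. The sketch below illustrates that check in plain Python, with the bounds copied from Profile 0 above. It does not call TensorRT, and the names `PROFILE_BOUNDS`, `fits_profile`, and `check_request` are hypothetical helpers introduced for illustration only.

```python
# Bounds copied from "Profile: 0" in the engine summary.
# Each shape is (batch, sequence_length).
PROFILE_BOUNDS = {
    "input_ids": {"min": (1, 1), "opt": (3, 1024), "max": (12, 1024)},
    "attention_mask": {"min": (1, 1), "opt": (3, 1024), "max": (12, 1024)},
}


def fits_profile(shape, bounds):
    """Return True if every dimension of shape lies within [min, max]."""
    if len(shape) != len(bounds["min"]):
        return False  # rank mismatch
    return all(
        lo <= dim <= hi
        for dim, lo, hi in zip(shape, bounds["min"], bounds["max"])
    )


def check_request(batch, seq_len):
    """Check a (batch, seq_len) request against every input tensor's bounds."""
    shape = (batch, seq_len)
    return all(fits_profile(shape, bounds) for bounds in PROFILE_BOUNDS.values())


print(check_request(3, 1024))   # the opt shape
print(check_request(13, 512))   # batch exceeds max of 12
print(check_request(1, 2048))   # sequence length exceeds max of 1024
```

Shapes inside the bounds but far from the opt shape `(3, 1024)` should still run, although the engine was tuned for the opt shape, so performance at other shapes may differ.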
sarah-cisco changed pull request status to merged