embedl/mobilevit-small-quantized

Image Classification


dann-od committed 4 days ago

Commit 30d278b · verified · 1 Parent(s): ed1810b

Add QAT note

Files changed (1)

README.md +3 -0


@@ -49,6 +49,9 @@ for low-latency NVIDIA TensorRT inference on edge GPUs.
   semantics.
 - **Validated accuracy** within 3.30 pp of the FP32
   baseline on ImageNet (see Accuracy table below).
+- **Quantization-aware training (QAT)** further recovers accuracy
+  lost in INT8 conversion by fine-tuning the model with simulated
+  quantization in the forward pass.
 - **Matches the latency of `trtexec --best`** on supported NVIDIA
   hardware while preserving INT8 accuracy (see Performance table
   below).
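
For readers unfamiliar with the "simulated quantization" mentioned in the added QAT bullet, the following is a minimal sketch of the idea and not this model's training code. It assumes a symmetric, per-tensor INT8 scheme; the function name `fake_quantize` and the sample weights are hypothetical. During QAT, a quantize-then-dequantize step like this is typically applied in the forward pass so the network learns to tolerate rounding error, while gradients are commonly passed through the rounding unchanged (the straight-through estimator).

```python
def fake_quantize(values, num_bits=8):
    """Round values onto a signed integer grid, then map them back to floats.

    Assumption: symmetric, per-tensor scaling. Real toolchains may use
    per-channel scales or calibrated ranges instead.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    dequantized = []
    for v in values:
        q = round(v / scale)                  # quantize: nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        dequantized.append(q * scale)         # dequantize: back to float
    return dequantized, scale


# Hypothetical weights, used only to illustrate the rounding error.
weights = [0.02, -0.5, 0.731, 1.27, -1.0]
simulated, scale = fake_quantize(weights)
max_error = max(abs(w - s) for w, s in zip(weights, simulated))
print(f"scale={scale:.4f}, max rounding error={max_error:.4f}")
```

Because no value is clipped in this scheme, the rounding error per value is at most half the scale; fine-tuning with this noise present is what lets QAT recover accuracy relative to converting an FP32 model after training.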