nvidia/Gemma-4-31B-IT-NVFP4

Text Generation

Model Optimizer

JamesShenNV committed 20 days ago

Commit 1365cf7 (verified) · 1 Parent(s): d4b34f9

Update Model Optimizer name

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ tags:
 # Model Overview
 ## Description:
-Gemma 4 31B IT is an open multimodal model built by Google DeepMind that handles text and image inputs, can process video as sequences of frames, and generates text output. It is designed to deliver frontier-level performance for reasoning, agentic workflows, coding, and multimodal understanding on consumer GPUs and workstations, with a 256K-token context window and support for over 140 languages. The model uses a hybrid attention mechanism that interleaves local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE (p-RoPE) to support long-context performance. The NVIDIA Gemma 4 31B IT NVFP4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
+Gemma 4 31B IT is an open multimodal model built by Google DeepMind that handles text and image inputs, can process video as sequences of frames, and generates text output. It is designed to deliver frontier-level performance for reasoning, agentic workflows, coding, and multimodal understanding on consumer GPUs and workstations, with a 256K-token context window and support for over 140 languages. The model uses a hybrid attention mechanism that interleaves local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE (p-RoPE) to support long-context performance. The NVIDIA Gemma 4 31B IT NVFP4 model is quantized with [NVIDIA Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
 This model is ready for commercial/non-commercial use.  <br>