How to run on A100?

#15
by mark2000 - opened

We tried to run this on 2x A100 80GB, but it failed.

Same here, it failed with nightly SGLang and vLLM. I got the vLLM version to work, but it produced gibberish.

Use llama.cpp.
