
Value error, The checkpoint you are trying to load has model type `nemotron_h` but Transformers does not recognize this architecture.

#2
by kshinoda - opened

Thank you for your hard work in releasing such a great model!!

Unfortunately, the following error occurred when using this FP8 model with vLLM.

Value error, The checkpoint you are trying to load has model type `nemotron_h` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

Even though I have updated transformers to 4.52.4 as the error message suggests, the error persists.

Is this error due to the missing `modeling_nemotron_h.py` file, which the non-FP8 Nemotron-H-47B-Reasoning-128K repository includes?
I was able to use Nemotron-H-47B-Reasoning-128K (the non-FP8 version) with vLLM successfully.
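As a way to narrow this down (a sketch, not an official fix): checkpoints with architectures that Transformers does not know natively usually ship their own modeling file and declare it in `config.json` under `auto_map`, which is what `trust_remote_code` relies on. A minimal stdlib-only check of a downloaded `config.json` might look like this; `diagnose_config` is a hypothetical helper name, and the sample configs below are illustrative, not copied from the actual repositories:

```python
import json

def diagnose_config(config_text: str) -> str:
    """Report whether a checkpoint config declares a custom (remote-code)
    architecture via auto_map, which Transformers needs when it does not
    recognize the model_type natively."""
    cfg = json.loads(config_text)
    model_type = cfg.get("model_type", "<missing>")
    if "auto_map" in cfg:
        return f"model_type={model_type}: custom code is mapped via auto_map"
    return (f"model_type={model_type}: no auto_map, so Transformers must "
            "support this architecture natively")

# Illustrative config resembling the failing FP8 checkpoint (no auto_map):
fp8_like = '{"model_type": "nemotron_h"}'
print(diagnose_config(fp8_like))

# Illustrative config that ships its own modeling file:
remote_like = ('{"model_type": "nemotron_h", "auto_map": '
               '{"AutoModelForCausalLM": '
               '"modeling_nemotron_h.NemotronHForCausalLM"}}')
print(diagnose_config(remote_like))
```

If the FP8 repository's `config.json` turns out to lack the `auto_map` entry (and the `modeling_nemotron_h.py` file) that the non-FP8 repository has, that would be consistent with the error above.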

The error occurred in the following environment:

  • vllm == 0.9.1
  • transformers == 4.52.4
  • CUDA == 12.6
  • torch == 2.7.0

Thank you in advance for your support.
