
Correct `get_decoder`/`set_decoder`

#40
No description provided.

All *ForCausalLM models have `set_decoder` and `get_decoder` methods which point to the actual decoder of the underlying transformer. Typically `get_decoder` returns the `self.model` attribute; however, for `NemotronHForCausalLM` no such attribute exists, because the module functioning as the decoder is named `backbone` in this model. It would be useful if `get_decoder` pointed to `backbone` by default, as that would maintain the consistent interface laid out by the *ForCausalLM Transformers models.
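A minimal sketch of the proposed fix, assuming the layout described above (the decoder stored on `self.backbone` rather than `self.model`). The class body is simplified for illustration and omits the rest of the model's implementation:

```python
class NemotronHForCausalLM:
    """Simplified stand-in illustrating the proposed accessor change."""

    def __init__(self, backbone):
        # In the real model, `backbone` is the decoder stack of the
        # underlying transformer (what other models call `self.model`).
        self.backbone = backbone

    def get_decoder(self):
        # Point at `backbone` instead of the nonexistent `self.model`,
        # matching the *ForCausalLM interface.
        return self.backbone

    def set_decoder(self, decoder):
        self.backbone = decoder
```

With this change, generic code that calls `model.get_decoder()` works the same way it does for other *ForCausalLM models.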

kylemylonakisprotopia changed pull request status to open


hi @kylemylonakisprotopia - slightly off-topic: I'm trying to reach out to you about https://github.com/huggingface/transformers/pull/42901 - would you be able to reply there and provide info?

Ready to merge
This branch is ready to be merged automatically.
