Update modeling_baichuan.py to handle empty Transformers cache objects in generation

#20

Recent Transformers versions can pass a Cache object into generate() before any key/value tensors have actually been populated. The Baichuan remote code still assumes the legacy tuple-style past_key_values and treats any truthy cache as populated.

This causes two problems during generation:

  • BaichuanModel.forward() reads past_key_values[0][0].shape[2] without checking whether the first cached key tensor actually exists.
  • prepare_inputs_for_generation() slices input_ids down to the last token whenever past_key_values is truthy, even if the cache is only an empty placeholder.
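One way to address both problems is to compute the cached length through a helper that treats None and empty cache objects as length 0. This is a hedged sketch, not the actual patch: `past_length` and `should_slice_to_last_token` are hypothetical helper names, and the stand-in objects below only mimic the two cache shapes involved (new-style Cache objects expose a get_seq_length() method in recent Transformers versions; the legacy format is a tuple of per-layer (key, value) pairs).

```python
# Hedged sketch of an empty-cache guard; helper names are hypothetical.
from types import SimpleNamespace

def past_length(past_key_values):
    """Number of tokens already cached; 0 for None or an empty cache."""
    if past_key_values is None:
        return 0
    # New-style Cache objects are truthy even when unpopulated, but
    # get_seq_length() returns 0 until KV tensors are appended.
    if hasattr(past_key_values, "get_seq_length"):
        return past_key_values.get_seq_length()
    # Legacy format: tuple of per-layer (key, value) pairs, where the key
    # is shaped (batch, num_heads, seq_len, head_dim).
    return past_key_values[0][0].shape[2]

def should_slice_to_last_token(past_key_values):
    """prepare_inputs_for_generation should only keep the last token
    when the cache actually contains earlier positions."""
    return past_length(past_key_values) > 0

# Stand-ins for demonstration (no torch/transformers dependency):
empty_cache = SimpleNamespace(get_seq_length=lambda: 0)
legacy_cache = ((SimpleNamespace(shape=(1, 8, 5, 64)), None),)  # 5 tokens

print(past_length(None))                        # 0
print(past_length(empty_cache))                 # 0
print(past_length(legacy_cache))                # 5
print(should_slice_to_last_token(empty_cache))  # False
```

With a guard like this, forward() can skip the shape[2] read when the length is 0, and prepare_inputs_for_generation() only truncates input_ids once the cache genuinely holds earlier positions.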
