fix: embed chat_template in tokenizer_config.json

#21

by NERDDISCO - opened 5 days ago

base: refs/heads/main

←

from: refs/pr/21


NERDDISCO

5 days ago

The chat_template field is missing from tokenizer_config.json. The template exists as a separate chat_template.jinja file, but AutoTokenizer.from_pretrained() only reads from tokenizer_config.json. This causes apply_chat_template() to fail in transformers.js and other non-Python tooling.

Gemma 2 and Gemma 3 models include this field correctly. This PR embeds the existing chat_template.jinja content into tokenizer_config.json so tokenizers can find it without needing a separate file loader.
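For anyone who wants to apply the same change to a local checkout of another model, the embedding step can be sketched roughly as below. This is a minimal sketch, not the exact script used for this PR: the helper name `embed_chat_template` is hypothetical, and it assumes both `tokenizer_config.json` and `chat_template.jinja` sit in the same directory.

```python
import json
from pathlib import Path


def embed_chat_template(model_dir: str) -> bool:
    """Copy chat_template.jinja into tokenizer_config.json if the field is absent.

    Hypothetical helper for illustration. Returns True if the config was
    modified, False if a chat_template field was already present.
    """
    root = Path(model_dir)
    config_path = root / "tokenizer_config.json"
    template_path = root / "chat_template.jinja"

    config = json.loads(config_path.read_text(encoding="utf-8"))
    if "chat_template" in config:
        return False  # already embedded; leave the existing value alone

    # Store the raw Jinja source as a single JSON string.
    config["chat_template"] = template_path.read_text(encoding="utf-8")
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

Calling `embed_chat_template("path/to/local/model")` leaves every other key in the config untouched. Note that re-serializing with `json.dumps` may change whitespace or key formatting relative to the original file, so the resulting diff is worth reviewing before committing.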

Discovered while building wandler, an OpenAI-compatible inference server powered by transformers.js.

See also: https://huggingface.co/google/gemma-4-E2B-it/discussions/8 (same fix for E2B by @piero-atelico)

fix: embed chat_template in tokenizer_config.json (65dfb12c)

w4nderlust

5 days ago

I think this fix, like mine, should just be applied to all models. It seems the default is now the Jinja-file approach (https://github.com/huggingface/transformers/issues/45205), but this change is not fully documented, and it has been very confusing to a bunch of people (you and me included).


Ready to merge

This branch is ready to get merged automatically.
