Instructions to use tencent/Hy3-preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use tencent/Hy3-preview with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="tencent/Hy3-preview")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tencent/Hy3-preview")
model = AutoModelForCausalLM.from_pretrained("tencent/Hy3-preview")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use tencent/Hy3-preview with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "tencent/Hy3-preview"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tencent/Hy3-preview",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
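
The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library. It assumes the vLLM server from the previous step is listening on localhost:8000; the helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed address: the port used in the curl example above.
request = build_chat_request(
    "http://localhost:8000", "tencent/Hy3-preview", "What is the capital of France?"
)
```

Once the server is running, `send_chat_request(request)` returns the decoded response; in the OpenAI-compatible schema the reply text is normally found under `["choices"][0]["message"]["content"]`.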

Use Docker

docker model run hf.co/tencent/Hy3-preview

SGLang

How to use tencent/Hy3-preview with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "tencent/Hy3-preview" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tencent/Hy3-preview",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "tencent/Hy3-preview" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "tencent/Hy3-preview",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use tencent/Hy3-preview with Docker Model Runner:
```
docker model run hf.co/tencent/Hy3-preview
```

Special Token Disaster: Your Tech Lead Has Zero Design Taste

by ytgui - opened 13 days ago

Discussion

ytgui

13 days ago

chat_template.jinja

{#- ----------‑‑‑ special token variables ‑‑‑---------- -#}
{%- set bos_token = '<｜hy_begin▁of▁sentence｜>' %}
{%- set pad_token = '<｜hy_▁pad▁｜>' %}
{%- set user_token = '<｜hy_User｜>' %}

Look at <｜hy_begin▁of▁sentence｜> and <｜hy_▁pad▁｜>: they use ｜ instead of | and mix _ with ▁.
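
For readers who cannot tell the characters apart on screen, here is a small sketch that prints the code point and Unicode name of each one. The characters are copied from the template excerpt above; the helper name `describe` is only illustrative.

```python
import unicodedata

# ASCII characters paired with the lookalikes used in the Hy3-preview tokens.
LOOKALIKES = [("|", "｜"), ("_", "▁")]


def describe(char: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for ascii_char, wide_char in LOOKALIKES:
    print(f"{ascii_char!r}: {describe(ascii_char)}")
    print(f"{wide_char!r}: {describe(wide_char)}")
```

The fullwidth pipe is FULLWIDTH VERTICAL LINE and the block is LOWER ONE EIGHTH BLOCK, so neither can be typed with the ASCII | or _ keys.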

The special token design for Hunyuan is a visual disaster that screams "zero design taste" from leadership. Using a bloated mix of fullwidth pipes (｜) and obscure geometric blocks (▁) doesn't make the model look high-tech; it makes it look like a corrupted encoding error.
A tech leader with an actual eye for polish understands that functional infrastructure should be clean and harmonious. Instead, we got a syntax that creates jagged, uneven visual noise.

Please take back your shit.

0xSero

13 days ago

What the hell, lol. Thank you for sharing such a nice model <3 Don't listen to this.

yiqichen01

Tencent org 12 days ago

Hi, thanks for the feedback!

The special token design using fullwidth pipes (｜) and block characters (▁) is actually an intentional engineering decision rather than an oversight. During pretraining and continual training, the model is trained on massive, diverse corpora where conventional special tokens such as <|im_start|> frequently appear as plain text. These collisions make it ambiguous whether a token is a genuine control signal or just content, which can degrade model behavior. Using visually distinctive Unicode characters significantly reduces collision probability and ensures a clean separation between control tokens and content.
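
The collision argument can be illustrated with a toy check. This is only a sketch: the corpus snippet is invented, and `count_plain_text_hits` is a hypothetical helper, not something from the Hy3-preview tooling.

```python
# Invented corpus snippet: a scraped tutorial that talks about chat markup.
corpus_sample = (
    "To start a turn you write <|im_start|>user and then the message. "
    "Many tutorials show <|im_start|>assistant as well."
)

CANDIDATE_TOKENS = {
    "ascii_style": "<|im_start|>",  # conventional token named in this reply
    "hy_style": "<｜hy_User｜>",  # token from chat_template.jinja above
}


def count_plain_text_hits(text: str, tokens: dict) -> dict:
    """Count how often each candidate control token appears as ordinary text."""
    return {name: text.count(token) for name, token in tokens.items()}


hits = count_plain_text_hits(corpus_sample, CANDIDATE_TOKENS)
print(hits)
```

In this sample the ASCII-style token appears twice as ordinary content and the fullwidth variant never appears. A real measurement would have to run over the actual training corpus.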

It's also worth noting that these special tokens are handled internally by the tokenizer and chat template, so they should be completely transparent to end users and developers during normal usage — you won't need to type or deal with them directly.

That said, we completely understand the ergonomic concerns. The current token set in Hy3-preview prioritizes robustness, but we're actively working on an optimized version in a future release that better balances collision resistance with readability and developer experience. Stay tuned!

Thanks again for the candid feedback — it's genuinely appreciated.

ytgui

11 days ago


Totally unconvincing. When you see "<|im_start|>" in your pre-training corpus, you should parse that data and convert it to a conversational format.
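
A rough sketch of what that parsing step might look like, assuming the raw text follows the common ChatML layout of `<|im_start|>ROLE`, a newline, the content, then `<|im_end|>`. The end marker and the function name `chatml_to_messages` are assumptions; neither appears in this thread.

```python
import re

# Assumed ChatML turn layout: <|im_start|>ROLE\nCONTENT<|im_end|>
CHATML_TURN = re.compile(r"<\|im_start\|>(\w+)\n(.*?)<\|im_end\|>", re.DOTALL)


def chatml_to_messages(raw: str) -> list[dict[str, str]]:
    """Extract well-formed ChatML turns as a list of role/content messages."""
    return [
        {"role": role, "content": content.strip()}
        for role, content in CHATML_TURN.findall(raw)
    ]


raw_document = (
    "<|im_start|>user\nWho are you?<|im_end|>\n"
    "<|im_start|>assistant\nI am a language model.<|im_end|>\n"
)
messages = chatml_to_messages(raw_document)
print(messages)
```

Text that only mentions the marker, without a complete turn, produces no messages here, so a pipeline would still need a rule for those cases.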

Semisol

10 days ago

edited 10 days ago

@yiqichen01
It is possible to rename special tokens without impacting the model at all, by modifying the tokenizer.json files! It is the token ID that matters, as this is a special token without any merges.
As long as any downstream users do not use their own chat template, this is a non-breaking change.
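
A minimal sketch of such a rename, run on a toy dictionary rather than the real file. It assumes the usual tokenizer.json layout with an `added_tokens` list and, for some tokenizers, a `model.vocab` mapping. The token ID 7 and the new name `<|hy_user|>` are made up for illustration.

```python
import copy


def rename_special_token(tokenizer_json: dict, old: str, new: str) -> dict:
    """Return a copy of a tokenizer.json dict with one special token renamed.

    Only the surface string changes; the token ID is left untouched.
    """
    updated = copy.deepcopy(tokenizer_json)
    found = False
    for entry in updated.get("added_tokens", []):
        if entry.get("content") == old:
            entry["content"] = new
            found = True
    # Some tokenizers also list special tokens in model.vocab; keep it in sync.
    vocab = updated.get("model", {}).get("vocab")
    if isinstance(vocab, dict) and old in vocab:
        vocab[new] = vocab.pop(old)
        found = True
    if not found:
        raise KeyError(f"token {old!r} not found")
    return updated


# Toy stand-in for tokenizer.json; the ID and the new name are hypothetical.
toy = {
    "added_tokens": [{"id": 7, "content": "<｜hy_User｜>", "special": True}],
    "model": {"vocab": {"<｜hy_User｜>": 7, "hello": 8}},
}
patched = rename_special_token(toy, "<｜hy_User｜>", "<|hy_user|>")
print(patched["added_tokens"])
```

The chat_template.jinja quoted at the top of this thread hard-codes the token strings, so it would presumably need the same rename, as would any other config file that spells the token out.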

It may be best to consider this for the next preview or the full release of the model, rather than breaking the model in a new revision.
