Zhiyu Cheng

zhiyucheng

nvidia

AI & ML interests

None yet

Recent Activity

new activity about 14 hours ago

nvidia/Gemma-4-31B-IT-NVFP4:Update chat_template.jinja

new activity about 14 hours ago

nvidia/Gemma-4-31B-IT-NVFP4:Update tokenizer_config.json

updated a model 3 days ago

nvidia/MiniMax-M2.5-NVFP4

New activity in nvidia/Gemma-4-31B-IT-NVFP4 about 14 hours ago

Update chat_template.jinja

#11 opened 1 day ago by

Update tokenizer_config.json

#12 opened 1 day ago by

New activity in nvidia/Gemma-4-31B-IT-NVFP4 3 days ago

Update README.md

#10 opened 3 days ago by

tstarkey-nvidia

New activity in nvidia/GLM-5-NVFP4 8 days ago

Update README.md

#5 opened 12 days ago by

New activity in nvidia/Kimi-K2.5-NVFP4 24 days ago

Update model card with evaluation results

#6 opened 24 days ago by

New activity in nvidia/Kimi-K2.5-NVFP4 about 2 months ago

Fix: add .model after language_model in quantization ignore/exclude_modules

#5 opened about 2 months ago by

Fix: add .model after language_model in quantization ignore/exclude_modules

#4 opened about 2 months ago by

New activity in nvidia/Kimi-K2-Thinking-NVFP4 2 months ago

Transformers v5 support

#3 opened 2 months ago by

New activity in nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4 4 months ago

update config for exclude modules

#3 opened 4 months ago by

New activity in nvidia/Llama-4-Scout-17B-16E-Instruct-FP8 4 months ago

update config for exclude modules

#1 opened 4 months ago by

New activity in nvidia/Qwen2.5-VL-7B-Instruct-FP8 4 months ago

update config for exclude modules

#3 opened 4 months ago by

New activity in nvidia/Qwen2.5-VL-7B-Instruct-NVFP4 4 months ago

Use actual module path in ignore

#2 opened 4 months ago by

New activity in nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-FP8 6 months ago

Update README.md

#3 opened 6 months ago by

New activity in nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-NVFP4-QAD 6 months ago

Update README.md

#2 opened 6 months ago by

New activity in nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 6 months ago

Update README.md

#2 opened 6 months ago by

New activity in nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1-FP4-QAD 7 months ago

Update README.md

#1 opened 7 months ago by

New activity in nvidia/Llama-3.3-70B-Instruct-FP8 12 months ago

Update README.md

#2 opened 12 months ago by

RestingCodeFace

Update README.md

#1 opened 12 months ago by

New activity in nvidia/DeepSeek-R1-NVFP4 about 1 year ago

Request for Detailed Benchmarking Setup with TensorRT-LLM on B200

#6 opened about 1 year ago by

Benchmark results compared to orig fp8 / int4 quants etc?

#1 opened about 1 year ago by