wenhua cheng
wenhuach
AI & ML interests
Model Compression, CV
Recent Activity
new activity 1 day ago: Intel/gemma-4-31B-it-int4-AutoRound: Installation Video and Testing - Step by Step
updated a model 1 day ago: Intel/gemma-4-31B-it-int4-AutoRound
new activity 1 day ago: Intel/gemma-4-26B-A4B-it-int4-mixed-AutoRound: GGUF version
Organizations
Installation Video and Testing - Step by Step
2 reactions · 4 replies
#1 opened 2 days ago by fahdmirzac
GGUF version
1 reply
#1 opened 2 days ago by limcheekin
Performance indicators
2 reactions · 4 replies
#1 opened 19 days ago by dehnhaide
This model always predicts a few nonsense sequences
8 replies
#1 opened about 1 month ago by CharlesChen2023
Does the A100 work?
12 replies
#1 opened about 1 month ago by xz123321
Thanks! And MTP key question
11 replies
#1 opened about 1 month ago by seanthomaswilliams
Convert to gguf-q2ks-mixed-AutoRound?
2 reactions · 4 replies
#2 opened 3 months ago by limcheekin
Qwen/Qwen3-Next-80B-A3B-Thinking reports MMLU_PRO 82.7, but you get 0.7271
3 replies
#2 opened 7 months ago by hlxxxxxx
AutoRound request: GLM-4.5-Air
1 reply
#1 opened 3 months ago by babytifa
2507 Thinking model release
11 replies
#4 opened 6 months ago by anjeysapkovski
How to use this kernel?
#1 opened 3 months ago by wenhuach
Has the Thinking version been deleted?
1 reply
#2 opened 4 months ago by reswewr
Improve model card: Add pipeline tag, library name, and update paper/citation
1 reaction
#1 opened 4 months ago by nielsr
Could we get more w2a16, w3a16, and w4a16 AutoRound models?
1 reaction · 1 reply
#1 opened 5 months ago by twhitworth
Practical performance feedback
1 reply
#2 opened 5 months ago by maigonis
Works well with vLLM, just no tool calling
1 reply
#1 opened 8 months ago by Ununnilium
Inference with llama.cpp + Open WebUI gives repeating `?`
4 replies
#1 opened 6 months ago by whoisjeremylam
Adding `transformers` as the library tag
#3 opened 7 months ago by ariG23498
CPU only?
4 replies
#2 opened 7 months ago by jujutechnology