Hannu Varjoranta
varjoranta
AI & ML interests
Weight and KV cache compression for production LLM serving. Building turboquant-plus-vllm.
Recent Activity
new activity about 13 hours ago
varjosoft/GLM-5.1-Open-TQ3: Pending GPU & vLLM validation
updated a model 5 days ago
varjosoft/GLM-4.7-Flash-TQ3
published a model 5 days ago
varjosoft/GLM-4.7-Flash-TQ3