Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation Paper • 2604.10098 • Published 7 days ago • 74
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 5 days ago • 135
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20, 2025 • 28
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Paper • 2402.02750 • Published Feb 5, 2024 • 5
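The KIVI entry names asymmetric low-bit (2-bit) quantization of the KV cache. As a rough illustration only, a minimal sketch of asymmetric quantization with a per-group scale and zero point is shown below; the function names and the per-token grouping are hypothetical choices for the example, not the paper's actual implementation.

```python
import numpy as np

def asym_quantize(x, bits=2, axis=-1):
    """Illustrative asymmetric quantization: map each group of x onto
    integers in [0, 2**bits - 1] using a per-group scale and zero point."""
    qmax = 2 ** bits - 1
    x_min = x.min(axis=axis, keepdims=True)
    x_max = x.max(axis=axis, keepdims=True)
    scale = (x_max - x_min) / qmax
    scale = np.where(scale == 0, 1.0, scale)      # avoid division by zero for flat groups
    q = np.clip(np.round((x - x_min) / scale), 0, qmax).astype(np.uint8)
    return q, scale, x_min                        # keep scale and zero point for dequantization

def asym_dequantize(q, scale, zero_point):
    return q.astype(np.float32) * scale + zero_point

# Toy KV-cache slice: (num_tokens, head_dim), quantized per token here
kv = np.random.randn(8, 16).astype(np.float32)
q, s, z = asym_quantize(kv, bits=2, axis=-1)
err = np.abs(kv - asym_dequantize(q, s, z)).mean()
print(f"mean abs reconstruction error: {err:.4f}")
```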
Token Warping Helps MLLMs Look from Nearby Viewpoints Paper • 2604.02870 • Published 15 days ago • 33
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 28
Nemotron Speech Collection Open, state-of-the-art, production‑ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 12 items • Updated 3 days ago • 46
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025 • 124
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality comparable to half precision while using 3x less memory. • 29 items • Updated Aug 14, 2025 • 32
VideoChat-R1 Collection VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning • 4 items • Updated Sep 28, 2025 • 9
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality comparable to half precision while using 3x less memory. • 15 items • Updated Mar 12 • 218
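As a back-of-the-envelope sketch of where a memory saving like the "3x less memory" claim can come from, the arithmetic below compares 16-bit weights against 4-bit quantized weights with per-group scales. All numbers (parameter count, group size) are hypothetical and not official Gemma 3 figures; the weight-only reduction lands near 4x, and end-to-end savings are smaller once activations and KV cache are counted.

```python
# Hypothetical figures, for illustration only.
params = 27e9                                           # assumed parameter count
bf16_bytes = params * 2                                 # 2 bytes per bf16 parameter
group_size = 128                                        # assumed quantization group size
int4_bytes = params * 0.5 + (params / group_size) * 2   # 0.5 byte/param + one fp16 scale per group

print(f"bf16 weights:          {bf16_bytes / 1e9:.1f} GB")
print(f"int4 weights + scales: {int4_bytes / 1e9:.1f} GB")
print(f"weight-only reduction: {bf16_bytes / int4_bytes:.1f}x")
```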