Collections
Discover the best community collections!
Collections including paper arxiv:2506.07900
-
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Paper • 2508.07785 • Published • 29 -
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
Paper • 2508.05257 • Published • 13 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 96
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 18 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 108 -
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Paper • 2412.17739 • Published • 41 -
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Paper • 2412.15443 • Published • 10
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 60 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 64
-
FAN: Fourier Analysis Networks
Paper • 2410.02675 • Published • 29 -
Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published • 90 -
Scalable-Softmax Is Superior for Attention
Paper • 2501.19399 • Published • 25 -
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 9
-
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper • 2402.17485 • Published • 194 -
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Paper • 2403.15377 • Published • 29 -
nvidia/parakeet-tdt-0.6b-v2
Automatic Speech Recognition • Updated • 171k • 1.46k -
google/gemma-3n-E4B-it-litert-preview
Image-Text-to-Text • Updated • 1.48k
-
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Paper • 2508.07785 • Published • 29 -
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
Paper • 2508.05257 • Published • 13 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 96
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 18 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 108 -
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Paper • 2412.17739 • Published • 41 -
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Paper • 2412.15443 • Published • 10
-
FAN: Fourier Analysis Networks
Paper • 2410.02675 • Published • 29 -
Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published • 90 -
Scalable-Softmax Is Superior for Attention
Paper • 2501.19399 • Published • 25 -
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 9
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 60 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 64
-
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper • 2402.17485 • Published • 194 -
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Paper • 2403.15377 • Published • 29 -
nvidia/parakeet-tdt-0.6b-v2
Automatic Speech Recognition • Updated • 171k • 1.46k -
google/gemma-3n-E4B-it-litert-preview
Image-Text-to-Text • Updated • 1.48k