Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2504.19874

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Compression Research

PolarQuant: Quantizing KV Caches with Polar Transformation

Paper • 2502.02617 • Published Feb 4, 2025 • 1
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 53
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Paper • 2410.16144 • Published Oct 21, 2024 • 5

Week1 by CP-axtra

Time series task

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

Quantization techniques

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models

Paper • 2506.19697 • Published Jun 24, 2025 • 44
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Paper • 2509.23873 • Published Sep 28, 2025 • 68
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1, 2025 • 42
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

Paper • 2509.22944 • Published Sep 26, 2025 • 81

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Quantization techniques

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

Compression Research

PolarQuant: Quantizing KV Caches with Polar Transformation

Paper • 2502.02617 • Published Feb 4, 2025 • 1
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 53
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Paper • 2410.16144 • Published Oct 21, 2024 • 5

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models

Paper • 2506.19697 • Published Jun 24, 2025 • 44
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Paper • 2509.23873 • Published Sep 28, 2025 • 68
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1, 2025 • 42
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

Paper • 2509.22944 • Published Sep 26, 2025 • 81

Week1 by CP-axtra

Time series task

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Paper • 2504.19874 • Published Apr 28, 2025 • 32

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs