- SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
  Paper • 2502.18137 • Published • 60
- XAttention: Block Sparse Attention with Antidiagonal Scoring
  Paper • 2503.16428 • Published • 15
- On the Benefits of Rank in Attention Layers
  Paper • 2407.16153 • Published
- Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition
  Paper • 2504.20938 • Published
Collections including paper arxiv:2405.09673
- Attention Is All You Need
  Paper • 1706.03762 • Published • 120
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
- RoFormer: Enhanced Transformer with Rotary Position Embedding
  Paper • 2104.09864 • Published • 17
- LoRA Learns Less and Forgets Less
  Paper • 2405.09673 • Published • 91
- MotionLLM: Understanding Human Behaviors from Human Motions and Videos
  Paper • 2405.20340 • Published • 20
- Spectrally Pruned Gaussian Fields with Neural Compensation
  Paper • 2405.00676 • Published • 10
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First
  Paper • 2404.18212 • Published • 30
- LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
  Paper • 2405.00732 • Published • 122
- LoRA Learns Less and Forgets Less
  Paper • 2405.09673 • Published • 91
- LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
  Paper • 2405.00732 • Published • 122
- PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
  Paper • 2309.10400 • Published • 26
- Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
  Paper • 2410.18967 • Published • 1
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 71
- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 29
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 29
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 41