- CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
  Paper • 2503.01874 • Published • 1
- Functionality-Oriented LLM Merging on the Fisher-Rao Manifold
  Paper • 2603.04972 • Published • 3
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
  Paper • 2406.11617 • Published • 10
- FuseChat: Knowledge Fusion of Chat Models
  Paper • 2408.07990 • Published • 15

Collections
Collections including paper arxiv:2403.19522
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 58
- Model Stock: All we need is just a few fine-tuned models
  Paper • 2403.19522 • Published • 14
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
  Paper • 2405.01535 • Published • 124

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 18
- SELF: Language-Driven Self-Evolution for Large Language Model
  Paper • 2310.00533 • Published • 2
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 61
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 46

- Arcee's MergeKit: A Toolkit for Merging Large Language Models
  Paper • 2403.13257 • Published • 22
- Model Stock: All we need is just a few fine-tuned models
  Paper • 2403.19522 • Published • 14
- Mergenetic: a Simple Evolutionary Model Merging Library
  Paper • 2505.11427 • Published • 14
- Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
  Paper • 2410.01335 • Published • 5

- Demystifying CLIP Data
  Paper • 2309.16671 • Published • 20
- Model Stock: All we need is just a few fine-tuned models
  Paper • 2403.19522 • Published • 14
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
  Paper • 2404.01367 • Published • 22
- On the Scalability of Diffusion-based Text-to-Image Generation
  Paper • 2404.02883 • Published • 19

- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 191
- Visual Instruction Tuning
  Paper • 2304.08485 • Published • 21
- Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
  Paper • 2403.09622 • Published • 17
- Lumiere: A Space-Time Diffusion Model for Video Generation
  Paper • 2401.12945 • Published • 86