Collections

Collections including paper arxiv:2401.18058

Xtra-Computing/XtraGPT-14B

Text Generation • Updated Dec 8, 2025 • 1.26k • 3
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

Paper • 2601.11077 • Published Jan 16 • 67
Molecular Contrastive Learning with Chemical Element Knowledge Graph

Paper • 2112.00544 • Published Dec 1, 2021 • 1
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models

Paper • 2404.00884 • Published Apr 1, 2024 • 1

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11, 2024 • 22
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21, 2024 • 116
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18, 2024 • 19
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24

Data-efficient LLMs

dataset pruning for advancing the capabilities of LLMs

Effective pruning of web-scale datasets based on complexity of concept clusters

Paper • 2401.04578 • Published Jan 9, 2024
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15, 2024 • 43
A Survey on Data Selection for LLM Instruction Tuning

Paper • 2402.05123 • Published Feb 4, 2024 • 3
LESS: Selecting Influential Data for Targeted Instruction Tuning

Paper • 2402.04333 • Published Feb 6, 2024 • 3

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Paper • 2402.00159 • Published Jan 31, 2024 • 65
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 86
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21
Transfer Learning for Text Diffusion Models

Paper • 2401.17181 • Published Jan 30, 2024 • 17

Alignment and Unlearning

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 90
Aligning Teacher with Student Preferences for Tailored Training Data Generation

Paper • 2406.19227 • Published Jun 27, 2024 • 25
Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1, 2024 • 28
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Paper • 2404.03820 • Published Apr 4, 2024 • 25

Interesting things.

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1, 2024 • 14
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 628
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Paper • 2402.16840 • Published Feb 26, 2024 • 25
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21, 2024 • 116

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

Paper • 2401.06951 • Published Jan 13, 2024 • 26
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2, 2024 • 27
Extending LLMs' Context Window with 100 Samples

Paper • 2401.07004 • Published Jan 13, 2024 • 16
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24

LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24

My (Denis Gordeev) collection of mostly NLP papers. You can message me at t.me/nlp_party

LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 24
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 17
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024 • 23

