Daily papers worth reading in detail later
Paper
• 2402.13144
• Published • 100
Genie: Generative Interactive Environments
Paper
• 2402.15391
• Published • 72
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper
• 2402.17177
• Published • 87
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper
• 2403.00522
• Published • 46
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper
• 2403.03206
• Published • 71
Stealing Part of a Production Language Model
Paper
• 2403.06634
• Published • 91
Gemma: Open Models Based on Gemini Research and Technology
Paper
• 2403.08295
• Published • 50
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Paper
• 2403.12015
• Published • 70
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper
• 2404.02258
• Published • 108
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper
• 2404.07143
• Published • 111
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper
• 2404.14219
• Published • 260
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Paper
• 2404.13208
• Published • 40
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper
• 2404.16710
• Published • 81
What matters when building vision-language models?
Paper
• 2405.02246
• Published • 104
RLHF Workflow: From Reward Modeling to Online RLHF
Paper
• 2405.07863
• Published • 71
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper
• 2405.09818
• Published • 134
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper
• 2405.12981
• Published • 33
To Believe or Not to Believe Your LLM
Paper
• 2406.02543
• Published • 35
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Paper
• 2406.04325
• Published • 74
Long Context Transfer from Language to Vision
Paper
• 2406.16852
• Published • 33
LongIns: A Challenging Long-context Instruction-based Exam for LLMs
Paper
• 2406.17588
• Published • 23
PaliGemma: A versatile 3B VLM for transfer
Paper
• 2407.07726
• Published • 72