Interesting Papers - a pencaharlangit Collection

updated Mar 3

ReZero: Enhancing LLM search ability by trying one-more-time

Paper • 2504.11001 • Published Apr 15, 2025 • 16
FonTS: Text Rendering with Typography and Style Controls

Paper • 2412.00136 • Published Nov 28, 2024 • 1
GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 98
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 163
An Empirical Study of GPT-4o Image Generation Capabilities

Paper • 2504.05979 • Published Apr 8, 2025 • 64
OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8, 2025 • 186
DreamO: A Unified Framework for Image Customization

Paper • 2504.16915 • Published Apr 23, 2025 • 24
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning

Paper • 2504.14509 • Published Apr 20, 2025 • 53
Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published Apr 22, 2025 • 57
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Paper • 2504.15585 • Published Apr 22, 2025 • 14
Personalized Text-to-Image Generation with Auto-Regressive Models

Paper • 2504.13162 • Published Apr 17, 2025 • 18
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published Apr 24, 2025 • 124
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing

Paper • 2505.02823 • Published May 5, 2025 • 5
Style Customization of Text-to-Vector Generation with Image Diffusion Priors

Paper • 2505.10558 • Published May 15, 2025 • 16
InstanceGen: Image Generation with Instance-level Instructions

Paper • 2505.05678 • Published May 8, 2025 • 7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis

Paper • 2505.09358 • Published May 14, 2025 • 27
SageAttention2++: A More Efficient Implementation of SageAttention2

Paper • 2505.21136 • Published May 27, 2025 • 45
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Paper • 2505.18445 • Published May 24, 2025 • 63
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback

Paper • 2505.17908 • Published May 23, 2025 • 3
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals

Paper • 2505.21062 • Published May 27, 2025 • 4
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26, 2025 • 45
Jodi: Unification of Visual Generation and Understanding via Joint Modeling

Paper • 2505.19084 • Published May 25, 2025 • 20
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Paper • 2505.23758 • Published May 29, 2025 • 22
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Paper • 2505.18600 • Published May 24, 2025 • 49
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 58
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Paper • 2506.02528 • Published Jun 3, 2025 • 16
Native-Resolution Image Synthesis

Paper • 2506.03131 • Published Jun 3, 2025 • 18
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation

Paper • 2506.04225 • Published Jun 4, 2025 • 28
Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4, 2025 • 24
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Paper • 2506.05301 • Published Jun 5, 2025 • 59
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights

Paper • 2506.02865 • Published Jun 3, 2025 • 34
FlexPainter: Flexible and Multi-View Consistent Texture Generation

Paper • 2506.02620 • Published Jun 3, 2025 • 14
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Paper • 2506.00830 • Published Jun 1, 2025 • 7
MARBLE: Material Recomposition and Blending in CLIP-Space

Paper • 2506.05313 • Published Jun 5, 2025 • 2
Text-Aware Image Restoration with Diffusion Models

Paper • 2506.09993 • Published Jun 11, 2025 • 45
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Paper • 2506.10540 • Published Jun 12, 2025 • 37
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

Paper • 2506.10741 • Published Jun 12, 2025 • 27
DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers

Paper • 2506.10568 • Published Jun 12, 2025 • 8
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models

Paper • 2506.16054 • Published Jun 19, 2025 • 60
Align Your Flow: Scaling Continuous-Time Flow Map Distillation

Paper • 2506.14603 • Published Jun 17, 2025 • 19
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Paper • 2506.09482 • Published Jun 11, 2025 • 45
Auto-Regressively Generating Multi-View Consistent Images

Paper • 2506.18527 • Published Jun 23, 2025 • 8
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17, 2025 • 59
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs

Paper • 2507.08616 • Published Jul 11, 2025 • 15
Neural-Driven Image Editing

Paper • 2507.05397 • Published Jul 7, 2025 • 27
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models

Paper • 2508.12880 • Published Aug 18, 2025 • 48
DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 305
Thyme: Think Beyond Images

Paper • 2508.11630 • Published Aug 15, 2025 • 81
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14, 2025 • 146
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 142
VertexRegen: Mesh Generation with Continuous Level of Detail

Paper • 2508.09062 • Published Aug 12, 2025 • 38
Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 127
