pencaharlangit's Collections: Interesting Papers
ReZero: Enhancing LLM search ability by trying one-more-time
Paper
• 2504.11001
• Published • 16
FonTS: Text Rendering with Typography and Style Controls
Paper
• 2412.00136
• Published • 1
GenEx: Generating an Explorable World
Paper
• 2412.09624
• Published • 98
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for
Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper
• 2412.13663
• Published • 163
An Empirical Study of GPT-4o Image Generation Capabilities
Paper
• 2504.05979
• Published • 64
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Paper
• 2504.06263
• Published • 186
DreamO: A Unified Framework for Image Customization
Paper
• 2504.16915
• Published • 24
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via
Triplet ID Group Learning
Paper
• 2504.14509
• Published • 53
Tina: Tiny Reasoning Models via LoRA
Paper
• 2504.15777
• Published • 57
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training
and Deployment
Paper
• 2504.15585
• Published • 14
Personalized Text-to-Image Generation with Auto-Regressive Models
Paper
• 2504.13162
• Published • 18
Paper2Code: Automating Code Generation from Scientific Papers in Machine
Learning
Paper
• 2504.17192
• Published • 124
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset
via Attention Routing
Paper
• 2505.02823
• Published • 5
Style Customization of Text-to-Vector Generation with Image Diffusion
Priors
Paper
• 2505.10558
• Published • 16
InstanceGen: Image Generation with Instance-level Instructions
Paper
• 2505.05678
• Published • 7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for
Image Analysis
Paper
• 2505.09358
• Published • 27
SageAttention2++: A More Efficient Implementation of SageAttention2
Paper
• 2505.21136
• Published • 45
OmniConsistency: Learning Style-Agnostic Consistency from Paired
Stylization Data
Paper
• 2505.18445
• Published • 63
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and
Reactive Feedback
Paper
• 2505.17908
• Published • 3
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images
from Clothed Individuals
Paper
• 2505.21062
• Published • 4
ARM: Adaptive Reasoning Model
Paper
• 2505.20258
• Published • 45
Jodi: Unification of Visual Generation and Understanding via Joint
Modeling
Paper
• 2505.19084
• Published • 20
D-AR: Diffusion via Autoregressive Models
Paper
• 2505.23660
• Published • 34
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with
Rectified Flow Transformers
Paper
• 2505.23758
• Published • 22
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and
Preference Alignment
Paper
• 2505.18600
• Published • 49
UniWorld: High-Resolution Semantic Encoders for Unified Visual
Understanding and Generation
Paper
• 2506.03147
• Published • 58
RelationAdapter: Learning and Transferring Visual Relation with
Diffusion Transformers
Paper
• 2506.02528
• Published • 16
Native-Resolution Image Synthesis
Paper
• 2506.03131
• Published • 18
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable
3D Scene Generation
Paper
• 2506.04225
• Published • 28
Image Editing As Programs with Diffusion Models
Paper
• 2506.04158
• Published • 24
SeedVR2: One-Step Video Restoration via Diffusion Adversarial
Post-Training
Paper
• 2506.05301
• Published • 59
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
Paper
• 2506.02865
• Published • 34
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Paper
• 2506.02620
• Published • 14
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video
Diffusion Transformers
Paper
• 2506.00830
• Published • 7
MARBLE: Material Recomposition and Blending in CLIP-Space
Paper
• 2506.05313
• Published • 2
Text-Aware Image Restoration with Diffusion Models
Paper
• 2506.09993
• Published • 45
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven
Clip Generation
Paper
• 2506.10540
• Published • 37
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a
Unified Framework
Paper
• 2506.10741
• Published • 27
DreamActor-H1: High-Fidelity Human-Product Demonstration Video
Generation via Motion-designed Diffusion Transformers
Paper
• 2506.10568
• Published • 8
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and
Quantized Attention in Visual Generation Models
Paper
• 2506.16054
• Published • 60
Align Your Flow: Scaling Continuous-Time Flow Map Distillation
Paper
• 2506.14603
• Published • 19
Marrying Autoregressive Transformer and Diffusion with Multi-Reference
Autoregression
Paper
• 2506.09482
• Published • 45
Auto-Regressively Generating Multi-View Consistent Images
Paper
• 2506.18527
• Published • 8
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos
with Spatio-Temporal Diffusion Models
Paper
• 2507.13344
• Published • 59
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
Paper
• 2507.08616
• Published • 15
Neural-Driven Image Editing
Paper
• 2507.05397
• Published • 27
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of
Diffusion Models
Paper
• 2508.12880
• Published • 48
Paper
• 2508.10104
• Published • 305
Thyme: Think Beyond Images
Paper
• 2508.11630
• Published • 81
NextStep-1: Toward Autoregressive Image Generation with Continuous
Tokens at Scale
Paper
• 2508.10711
• Published • 146
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
• 2508.05748
• Published • 142
VertexRegen: Mesh Generation with Continuous Level of Detail
Paper
• 2508.09062
• Published • 38
Large Language Diffusion Models
Paper
• 2502.09992
• Published • 127