Collections
Collections including paper arxiv:2402.17177
-
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Paper • 2305.06131 • Published • 2 -
Perpetual Humanoid Control for Real-time Simulated Avatars
Paper • 2305.06456 • Published • 1 -
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Paper • 2305.10973 • Published • 39 -
LDM3D: Latent Diffusion Model for 3D
Paper • 2305.10853 • Published • 13
-
Brain2Music: Reconstructing Music from Human Brain Activity
Paper • 2307.11078 • Published • 42 -
Decoding speech from non-invasive brain recordings
Paper • 2208.12266 • Published • 4 -
Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals
Paper • 2308.02510 • Published • 23 -
DreamDiffusion: Generating High-Quality Images from Brain EEG Signals
Paper • 2306.16934 • Published • 32
-
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 87 -
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper • 2402.17485 • Published • 194 -
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 46 -
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Paper • 2403.04692 • Published • 40
-
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 87 -
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Paper • 2403.13248 • Published • 78 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models
Paper • 2409.20551 • Published • 14