Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2212.09748

2026 - Reading AI Research Papers with Ajinkya

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
A Survey on Diffusion Language Models

Paper • 2508.10875 • Published Aug 14, 2025 • 34
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

kaupane/DiT-Wikiart-Large

Text-to-Image • Updated Nov 1, 2025 • 10 • 1
kaupane/DiT-Wikiart-Small

Text-to-Image • Updated Nov 1, 2025 • 4
kaupane/DiT-Wikiart-Base

Text-to-Image • Updated Nov 1, 2025 • 2
Running on Zero

Agents

3

Diffusion

🔥

3

Custom DiT models trained on Wikiart dataset.

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 697k • • 12.7k
Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Feb 6, 2025 • 1.41M • 1.27k
zer0int/CLIP-GmP-ViT-L-14

Zero-Shot Image Classification • Updated Jan 25 • 24.5k • 516

Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17

Applied Machine Learning Papers

Reading List (Mainly Focused of VLM's and Diffusion Models)

Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

Paper • 2311.15127 • Published Nov 25, 2023 • 15
Learning Transferable Visual Models From Natural Language Supervision

Paper • 2103.00020 • Published Feb 26, 2021 • 21
U-Net: Convolutional Networks for Biomedical Image Segmentation

Paper • 1505.04597 • Published May 18, 2015 • 18

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2, 2025 • 87
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 85
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16, 2025 • 29

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published Sep 11, 2024 • 21
Generating 3D-Consistent Videos from Unposed Internet Photos

Paper • 2411.13549 • Published Nov 20, 2024
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 56
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Paper • 2412.12093 • Published Dec 16, 2024

2023 (and before) Papers of the Year

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Paper • 2306.00989 • Published Jun 1, 2023 • 1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Matryoshka Representation Learning

Paper • 2205.13147 • Published May 26, 2022 • 25

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 150
Elucidating the Design Space of Diffusion-Based Generative Models

Paper • 2206.00364 • Published Jun 1, 2022 • 18
GLU Variants Improve Transformer

Paper • 2002.05202 • Published Feb 12, 2020 • 5
StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 156

Diffusion Models

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8, 2024 • 45
Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation

Paper • 2303.00848 • Published Mar 1, 2023
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
High-Resolution Image Synthesis with Latent Diffusion Models

Paper • 2112.10752 • Published Dec 20, 2021 • 17

2026 - Reading AI Research Papers with Ajinkya

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
A Survey on Diffusion Language Models

Paper • 2508.10875 • Published Aug 14, 2025 • 34
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2, 2025 • 87
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 85
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16, 2025 • 29

kaupane/DiT-Wikiart-Large

Text-to-Image • Updated Nov 1, 2025 • 10 • 1
kaupane/DiT-Wikiart-Small

Text-to-Image • Updated Nov 1, 2025 • 4
kaupane/DiT-Wikiart-Base

Text-to-Image • Updated Nov 1, 2025 • 2
Running on Zero

Agents

3

Diffusion

🔥

3

Custom DiT models trained on Wikiart dataset.

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published Sep 11, 2024 • 21
Generating 3D-Consistent Videos from Unposed Internet Photos

Paper • 2411.13549 • Published Nov 20, 2024
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 56
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Paper • 2412.12093 • Published Dec 16, 2024

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 697k • • 12.7k
Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Feb 6, 2025 • 1.41M • 1.27k
zer0int/CLIP-GmP-ViT-L-14

Zero-Shot Image Classification • Updated Jan 25 • 24.5k • 516

2023 (and before) Papers of the Year

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Paper • 2306.00989 • Published Jun 1, 2023 • 1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Matryoshka Representation Learning

Paper • 2205.13147 • Published May 26, 2022 • 25

Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 150
Elucidating the Design Space of Diffusion-Based Generative Models

Paper • 2206.00364 • Published Jun 1, 2022 • 18
GLU Variants Improve Transformer

Paper • 2002.05202 • Published Feb 12, 2020 • 5
StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 156

Applied Machine Learning Papers

Reading List (Mainly Focused of VLM's and Diffusion Models)

Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

Paper • 2311.15127 • Published Nov 25, 2023 • 15
Learning Transferable Visual Models From Natural Language Supervision

Paper • 2103.00020 • Published Feb 26, 2021 • 21
U-Net: Convolutional Networks for Biomedical Image Segmentation

Paper • 1505.04597 • Published May 18, 2015 • 18

Diffusion Models

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8, 2024 • 45
Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation

Paper • 2303.00848 • Published Mar 1, 2023
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 17
High-Resolution Image Synthesis with Latent Diffusion Models

Paper • 2112.10752 • Published Dec 20, 2021 • 17

Previous
1
2
3
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs