Commit 943ab10 by krystv · verified · 1 Parent(s): 9d75008

Upload README.md

Files changed (1)
  1. README.md +236 -0 (ADDED)
# LiquidFlow: Liquid Neural Network × Mamba-2 SSD Image Generator

**A lightweight, physics-informed image generator combining Liquid Neural Networks (CfC) with Mamba-2 State Space Duality: trainable on the Google Colab free tier, deployable on mobile devices.**

[![Model on HF](https://img.shields.io/badge/🤗-LiquidFlow--Gen-blue)](https://huggingface.co/krystv/LiquidFlow-Gen)
[![Paper: CfC](https://img.shields.io/badge/📄-CfC_Nature_MI_2022-green)](https://arxiv.org/abs/2106.13898)
[![Paper: Mamba-2](https://img.shields.io/badge/📄-Mamba2_2024-orange)](https://arxiv.org/abs/2405.21060)
[![Paper: PINN Diffusion](https://img.shields.io/badge/📄-PINN_Diff_ICLR_2025-red)](https://arxiv.org/abs/2403.14404)

## Architecture

```
                    ┌────────────────────────────────┐
Image [128×128]  →  │           TAESD VAE            │  →  Latent [16×16×4]
                    │         (< 1M params)          │
                    └────────────────────────────────┘
                                    ↓
                    ┌────────────────────────────────┐
                    │      LiquidFlow Backbone       │
                    │                                │
                    │  ┌──────────────────────────┐  │
                    │  │  LiquidMamba Block (×N)  │  │
                    │  │                          │  │
                    │  │  Input → CfC Gate        │  │
                    │  │            ↓             │  │
                    │  │       Mamba-2 SSD        │  │
                    │  │     (multi-dir scan)     │  │
                    │  │            ↓             │  │
                    │  │  CfC Gate → Output       │  │
                    │  └──────────────────────────┘  │
                    │                                │
                    │  + Physics-Informed Loss       │
                    │  (TV + Spectral + Gradient)    │
                    └────────────────────────────────┘
                                    ↓
                             Predicted Noise
```

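
To make the shapes in the diagram concrete, here is a minimal sketch in plain Python. It assumes the VAE shrinks each spatial side by 8× into 4 channels (consistent with 128×128 → 16×16×4 above) and that "multi-dir scan" means flattening the latent grid in several traversal orders. The helper names `latent_shape` and `scan_orders`, and the choice of four directions, are illustrative assumptions rather than code from this repository.

```python
def latent_shape(image_size, downsample=8, channels=4):
    """Return (height, width, channels) of the latent for a square image."""
    if image_size % downsample:
        raise ValueError("image_size must be divisible by the downsample factor")
    side = image_size // downsample
    return (side, side, channels)


def scan_orders(height, width):
    """Return four traversal orders over a height x width grid.

    Each order is a list of flat indices (row * width + col):
    row-major, reversed row-major, column-major, reversed column-major.
    """
    row_major = [r * width + c for r in range(height) for c in range(width)]
    col_major = [r * width + c for c in range(width) for r in range(height)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


h, w, c = latent_shape(128)
orders = scan_orders(h, w)
print((h, w, c), len(orders["row_forward"]))  # (16, 16, 4) 256
```

Under these assumptions a 128×128 image becomes a sequence of 256 latent positions, and each direction presents the same positions to the state space layer in a different order.
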
### Core Innovations

1. **CfC (Closed-form Continuous-time) Liquid Neural Networks**
   - `h(t) = σ(-f(x,I)·t) ⊙ g(x,I) + (1-σ(-f(x,I)·t)) ⊙ h(x,I)`
   - No ODE solving: 100× faster than Neural ODEs
   - Time-continuous adaptive gating mechanism
   - From: Hasani et al., Nature Machine Intelligence (2022)

2. **Mamba-2 SSD (State Space Duality)**
   - `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t`
   - O(N) linear complexity (vs O(N²) attention)
   - Fully parallelizable via associative scan
   - Pure PyTorch: no CUDA kernels needed
   - From: Dao & Gu, "Transformers are SSMs" (2024)

3. **Physics-Informed Regularization**
   - Total Variation + Spectral + Gradient constraints
   - Training-only regularizer: zero inference cost
   - Pattern from: Bastek & Sun, ICLR 2025

4. **TAESD VAE**
   - < 1M parameters, 84× smaller than SD VAE
   - Near-instant encoding/decoding
   - From: madebyollin/taesd

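
The CfC formula in item 1 can be read as a time-dependent blend between two candidate states. The sketch below evaluates it element-wise on plain Python lists. The lists `f`, `g`, and `h` stand in for the outputs of the three networks `f(x,I)`, `g(x,I)`, and `h(x,I)`; the function names are illustrative and not the API of `cfc_cell.py`.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cfc_gate(f, g, h, t):
    """Blend candidate states g and h element-wise with a time-dependent gate.

    Implements sigma(-f*t) * g + (1 - sigma(-f*t)) * h for equal-length lists.
    """
    out = []
    for f_i, g_i, h_i in zip(f, g, h):
        gate = sigmoid(-f_i * t)  # weight given to g at elapsed time t
        out.append(gate * g_i + (1.0 - gate) * h_i)
    return out


f = [1.0, 2.0]
g = [1.0, 1.0]
h = [0.0, 0.0]
print(cfc_gate(f, g, h, t=0.0))  # [0.5, 0.5]
```

At `t = 0` the gate is 0.5, so the output is the average of `g` and `h`. For positive `f` the gate decays toward 0 as `t` grows and the output approaches `h`, which is why no ODE solver is needed to advance the state in time.
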
## Model Variants

| Variant | Parameters | Hidden Dim | Stages | Blocks/Stage | T4 VRAM |
|---------|-----------|------------|--------|--------------|---------|
| **Tiny** | ~2M | 128 | 2 | 2 | < 2 GB |
| **Small** | ~8M | 256 | 4 | 4 | ~4 GB |
| **Base** | ~30M | 384 | 6 | 6 | ~8 GB |

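
For scripting around these variants, the table can be held as data. This is only a sketch: `VariantConfig` and `VARIANTS` are hypothetical names, the values are copied from the table, and the total block count assumes every stage holds the listed number of blocks. How `create_liquidflow` stores its configuration is not shown in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    hidden_dim: int
    stages: int
    blocks_per_stage: int

    @property
    def total_blocks(self):
        """Number of blocks if every stage has blocks_per_stage blocks."""
        return self.stages * self.blocks_per_stage


# Values copied from the table above; the class itself is hypothetical.
VARIANTS = {
    "tiny": VariantConfig(hidden_dim=128, stages=2, blocks_per_stage=2),
    "small": VariantConfig(hidden_dim=256, stages=4, blocks_per_stage=4),
    "base": VariantConfig(hidden_dim=384, stages=6, blocks_per_stage=6),
}

for name, cfg in VARIANTS.items():
    print(name, cfg.hidden_dim, cfg.total_blocks)
```
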
## Quick Start

### Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/krystv/LiquidFlow-Gen/blob/main/LiquidFlow_Colab.ipynb)

1. Open the notebook
2. Runtime → Change runtime type → **GPU (T4)**
3. Run all cells

### Local Training

```bash
# Clone
git clone https://huggingface.co/krystv/LiquidFlow-Gen
cd LiquidFlow-Gen

# Install
pip install torch torchvision diffusers tqdm pillow numpy

# Train (small model, 128px, CIFAR-10)
python train.py \
    --dataset cifar10 \
    --image_size 128 \
    --variant small \
    --batch_size 32 \
    --epochs 100 \
    --lr 2e-4

# Train (base model, 512px)
python train.py \
    --dataset cifar10 \
    --image_size 512 \
    --variant base \
    --batch_size 8 \
    --epochs 200 \
    --lr 1e-4
```

### Generate Samples

```python
import torch

from liquid_flow.generator import create_liquidflow
from liquid_flow.vae_wrapper import TAESDWrapper

# Load model weights onto the GPU and switch to inference mode
model = create_liquidflow(variant='small', image_size=128)
model.load_state_dict(torch.load('best_model.pt', map_location='cuda'))
model = model.cuda().eval()

# Load VAE
vae = TAESDWrapper.load('cuda')

# Generate latents with 50 DDIM steps, then decode them to images
with torch.no_grad():
    latents = model.sample(batch_size=16, steps=50, ddim=True)
    images = TAESDWrapper.decode(vae, latents)
```

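
The README does not show what `model.sample(..., ddim=True)` does internally. As a rough illustration of the general technique, the sketch below performs one deterministic DDIM update (eta = 0) on plain Python lists. The linear beta schedule, the 1000 training steps, and the function names are assumptions made for the example, not details taken from `generator.py`.

```python
import math


def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a list of values."""
    out = []
    for x, eps in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the predicted noise.
        x0 = (x - math.sqrt(1.0 - alpha_bar_t) * eps) / math.sqrt(alpha_bar_t)
        # Re-noise that estimate to the previous (less noisy) level.
        out.append(math.sqrt(alpha_bar_prev) * x0
                   + math.sqrt(1.0 - alpha_bar_prev) * eps)
    return out


alpha_bars = alpha_bar_schedule()
# 50 evenly spaced timesteps out of 1000, visited from noisy to clean.
timesteps = list(range(999, -1, -20))
```

A full sampler would start from Gaussian noise, call the network for `eps_pred` at each entry of `timesteps`, and apply `ddim_step` between consecutive entries, using `alpha_bar_prev = 1.0` for the final step so that it returns the clean estimate.
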
## Training Details

### Default Hyperparameters
- Optimizer: AdamW (β₁=0.9, β₂=0.999)
- LR: 2×10⁻⁴ (tiny/small), 1×10⁻⁴ (base)
- Weight Decay: 10⁻⁴
- LR Schedule: Cosine annealing
- Gradient Clipping: 1.0
- AMP: Enabled (when CUDA available)

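
Two of these settings are easy to state as formulas. The sketch below shows cosine annealing in the form used by PyTorch's `CosineAnnealingLR` and clipping by global L2 norm. It assumes a minimum learning rate of 0 and that "Gradient Clipping: 1.0" refers to a norm threshold; the README states neither, and the function names are illustrative.

```python
import math


def cosine_lr(step, total_steps, base_lr=2e-4, min_lr=0.0):
    """Cosine-annealed learning rate at a given step in [0, total_steps]."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a flat list of gradient values so their L2 norm is <= max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


for step in (0, 50, 100):
    print(step, cosine_lr(step, total_steps=100))
```

The learning rate starts at `base_lr`, passes through half of it at the midpoint, and reaches `min_lr` at the final step.
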
### Physics Regularization Weights
- TV (Total Variation): 0.01
- Conservation of Intensity: 0.001
- Spectral Regularizer: 0.01
- Gradient Penalty: 0.001

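
As a rough picture of how such weights enter training, the sketch below computes an anisotropic total variation term on a small 2D grid and adds weighted regularizer values to a denoising loss. Only the weights come from the list above. The exact definitions in `physics_loss.py` (for example sum versus mean, and how the spectral, conservation, and gradient terms are computed) are not given in this README, so the functions here are assumptions.

```python
def total_variation(img):
    """Anisotropic total variation of a 2D grid given as a list of rows."""
    tv = 0.0
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                tv += abs(img[r + 1][c] - img[r][c])  # vertical difference
            if c + 1 < cols:
                tv += abs(img[r][c + 1] - img[r][c])  # horizontal difference
    return tv


# Weights from the list above.
WEIGHTS = {"tv": 0.01, "conservation": 0.001, "spectral": 0.01, "gradient": 0.001}


def total_loss(denoising_loss, terms, weights=WEIGHTS):
    """Add weighted regularizer values to the main denoising loss."""
    return denoising_loss + sum(weights[name] * value for name, value in terms.items())


grid = [[0.0, 1.0], [0.0, 1.0]]
print(total_loss(1.0, {"tv": total_variation(grid)}))
```

Because the regularizers only appear in the training objective, dropping them at inference time changes nothing about the sampling cost.
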
### Datasets Supported
- CIFAR-10, CIFAR-100, STL-10
- CelebA, LSUN (requires download)
- ImageNet (provide path)

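## Mobile Deployment

LiquidFlow uses pure PyTorch, with **no custom CUDA kernels**:

```python
import torch

# Dummy inputs for tracing. The shapes are assumptions: one channels-first
# 16×16×4 latent (see Architecture) and one integer timestep.
model = model.cpu().eval()
x = torch.randn(1, 4, 16, 16)
t = torch.tensor([500])

# Export to ONNX
torch.onnx.export(model, (x, t), 'liquidflow.onnx',
                  input_names=['noisy_latent', 'timestep'],
                  output_names=['predicted_noise'],
                  opset_version=14)

# Convert to CoreML (iOS)
# coremltools.converters.onnx.convert(model='liquidflow.onnx')
# (recent coremltools releases drop this ONNX path; convert a traced PyTorch model there)

# Convert to TFLite (Android)
# onnx-tf convert -i liquidflow.onnx -o liquidflow.pb
# (then convert the resulting TensorFlow model with tf.lite.TFLiteConverter)
```
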
## Why This Works (Research Validation)

### DiMSUM (NeurIPS 2024)
Mamba-based diffusion beats DiT transformers on ImageNet generation (FID 2.11 vs 2.27). Mamba's O(N) complexity enables 3× faster convergence than attention-based models.

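
The O(N) cost comes from the linear recurrence `h_t = A_t·h_{t-1} + B_t·x_t` listed under Core Innovations. The sketch below uses a scalar state for readability (an assumption; the real layer uses matrix-valued states and an output projection `y_t = C_t^T·h_t`). It shows that the sequential loop and a parallel-friendly associative scan produce the same states. The function names are illustrative and not taken from `mamba2_ssd.py`.

```python
def sequential_scan(a, b, x):
    """Run h_t = a_t * h_{t-1} + b_t * x_t with h_{-1} = 0 (scalar state)."""
    h, states = 0.0, []
    for a_t, b_t, x_t in zip(a, b, x):
        h = a_t * h + b_t * x_t
        states.append(h)
    return states


def combine(left, right):
    """Compose two affine maps h -> a*h + u, applying 'left' first."""
    a1, u1 = left
    a2, u2 = right
    return (a2 * a1, a2 * u1 + u2)


def associative_scan(a, b, x):
    """Same states via an inclusive prefix scan over (a_t, b_t * x_t) pairs.

    Uses a doubling (Hillis-Steele) scheme: about log2(N) rounds, and the
    positions within each round are independent of one another.
    """
    elems = [(a_t, b_t * x_t) for a_t, b_t, x_t in zip(a, b, x)]
    shift = 1
    while shift < len(elems):
        elems = [
            combine(elems[i - shift], elems[i]) if i >= shift else elems[i]
            for i in range(len(elems))
        ]
        shift *= 2
    return [u for _, u in elems]


a = [0.9, 0.8, 0.7, 0.6, 0.5]
b = [1.0, 1.0, 1.0, 1.0, 1.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sequential_scan(a, b, x))
print(associative_scan(a, b, x))
```

The sequential loop does O(N) work in N dependent steps. The doubling scan shown here does O(N log N) total work but needs only about log2(N) dependent rounds, which is the property that makes the recurrence parallelizable; work-efficient scan variants bring the total back to O(N).
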
### PINNMamba (ICML 2025)
SSM + Physics constraints are compatible and synergistic. Mamba's selective scan naturally handles the spatio-temporal nature of PDE residuals.

### LiteVAE / TAESD
Wavelet-based and tiny VAEs provide sufficient latent quality for diffusion at < 1% of the parameter count of standard VAEs. TAESD is used by 100+ real-time diffusion demos on HF Spaces.

### DeepSeek V3 Insights
- Auxiliary-loss-free training (apply to physics weights)
- Multi-head architecture for efficiency
- DualPipe for overlapping computation

## Repository Structure

```
LiquidFlow-Gen/
├── liquid_flow/
│   ├── __init__.py           # Package init
│   ├── cfc_cell.py           # CfC Liquid NN implementation
│   ├── mamba2_ssd.py         # Mamba-2 SSD implementation
│   ├── liquid_flow_block.py  # Hybrid CfC+Mamba block
│   ├── generator.py          # Full diffusion generator
│   ├── vae_wrapper.py        # VAE interfaces
│   ├── physics_loss.py       # Physics regularizers
│   └── trainer.py            # Training utilities
├── train.py                  # CLI training script
├── LiquidFlow_Colab.ipynb    # Colab notebook
└── README.md                 # This file
```

## Citations

```bibtex
@article{hasani2022cfc,
  title={Closed-form continuous-time neural networks},
  author={Hasani, Ramin and Lechner, Mathias and Amini, Alexander and others},
  journal={Nature Machine Intelligence},
  year={2022}
}

@article{dao2024mamba2,
  title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
  author={Dao, Tri and Gu, Albert},
  journal={arXiv:2405.21060},
  year={2024}
}

@inproceedings{bastek2025physics,
  title={Physics-Informed Diffusion Models},
  author={Bastek, Jan-Hendrik and Sun, WaiChing and Kochmann, Dennis M.},
  booktitle={ICLR},
  year={2025}
}

@inproceedings{phung2024dimsum,
  title={DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation},
  author={Phung, Hao and others},
  booktitle={NeurIPS},
  year={2024}
}
```

## License
MIT