# LiquidFlow: Liquid Neural Network × Mamba-2 SSD Image Generator

**A lightweight, physics-informed image generator combining Liquid Neural Networks (CfC) with Mamba-2 State Space Duality, trainable on the Google Colab free tier and deployable on mobile devices.**

[Model on Hugging Face](https://huggingface.co/krystv/LiquidFlow-Gen)
[CfC paper (arXiv:2106.13898)](https://arxiv.org/abs/2106.13898)
[Mamba-2 SSD paper (arXiv:2405.21060)](https://arxiv.org/abs/2405.21060)
[Physics-Informed Diffusion Models (arXiv:2403.14404)](https://arxiv.org/abs/2403.14404)

## Architecture

```
                    ┌────────────────────────────────┐
Image [128×128] ──► │           TAESD VAE            │ ──► Latent [16×16×4]
                    │         (< 1M params)          │
                    └────────────────────────────────┘
                                    │
                                    ▼
                    ┌────────────────────────────────┐
                    │      LiquidFlow Backbone       │
                    │                                │
                    │  ┌──────────────────────────┐  │
                    │  │  LiquidMamba Block (×N)  │  │
                    │  │                          │  │
                    │  │   Input ──► CfC Gate     │  │
                    │  │               │          │  │
                    │  │         Mamba-2 SSD      │  │
                    │  │       (multi-dir scan)   │  │
                    │  │               │          │  │
                    │  │   CfC Gate ──► Output    │  │
                    │  └──────────────────────────┘  │
                    │                                │
                    │    + Physics-Informed Loss     │
                    │  (TV + Spectral + Gradient)    │
                    └────────────────────────────────┘
                                    │
                                    ▼
                            Predicted Noise
```
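
The backbone is trained as a latent denoiser. The sketch below only illustrates the data flow in the diagram; it assumes a standard DDPM-style corruption step and placeholder names (`vae.encode`, the cosine schedule), while the actual loop lives in `liquid_flow/trainer.py`.

```python
import torch
import torch.nn.functional as F

def denoising_step(model, vae, images, num_timesteps=1000):
    """Illustrative training objective: predict the noise added to TAESD latents."""
    with torch.no_grad():
        latents = vae.encode(images)                      # [B, 4, 16, 16] for 128px inputs

    # Sample one timestep and one noise tensor per example.
    t = torch.randint(0, num_timesteps, (latents.size(0),), device=latents.device)
    noise = torch.randn_like(latents)

    # Placeholder cosine noise schedule (the repo's actual schedule may differ).
    alpha_bar = torch.cos(t.float() / num_timesteps * torch.pi / 2).pow(2).view(-1, 1, 1, 1)
    noisy = alpha_bar.sqrt() * latents + (1 - alpha_bar).sqrt() * noise

    pred = model(noisy, t)          # LiquidFlow backbone predicts the injected noise
    return F.mse_loss(pred, noise)  # physics regularizers are added on top during training
```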

### Core Innovations

1. **CfC (Closed-form Continuous-time) Liquid Neural Networks**
   - `h(t) = σ(-f(x,I)·t) ⊙ g(x,I) + (1 - σ(-f(x,I)·t)) ⊙ h(x,I)` (sketched in code after this list)
   - No ODE solving: 100× faster than Neural ODEs
   - Time-continuous, adaptive gating mechanism
   - From: Hasani et al., Nature Machine Intelligence (2022)

2. **Mamba-2 SSD (State Space Duality)**
   - `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t` (see the sketch after this list)
   - O(N) linear complexity (vs. O(N²) attention)
   - Fully parallelizable via associative scan
   - Pure PyTorch: no custom CUDA kernels needed
   - From: Dao & Gu, "Transformers are SSMs" (2024)

3. **Physics-Informed Regularization**
   - Total Variation + Spectral + Gradient constraints
   - Training-only regularizer: zero inference cost
   - Pattern from: Bastek & Sun, ICLR 2025

4. **TAESD VAE**
   - < 1M parameters: 84× smaller than the SD VAE
   - Near-instant encoding/decoding
   - From: madebyollin/taesd
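
For intuition, the two core operations condense to a few lines. This is a simplified reference sketch, not the repository implementation (which lives in `liquid_flow/cfc_cell.py` and `liquid_flow/mamba2_ssd.py`); the tensor shapes and the sequential loop are chosen for readability.

```python
import torch

def cfc_update(f, g, h, t):
    """Closed-form CfC blend: sigma(-f*t) interpolates between g and h (all elementwise)."""
    gate = torch.sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h

def ssd_reference(A, B, C, x):
    """Sequential reference of h_t = A_t*h_{t-1} + B_t*x_t, y_t = C_t^T*h_t.
    Shapes: A [Bsz, L, 1] (scalar decay), B and C [Bsz, L, N], x [Bsz, L, 1].
    The actual Mamba-2 SSD computes the same outputs with a parallel associative scan."""
    Bsz, L, N = B.shape
    h = torch.zeros(Bsz, N, device=x.device, dtype=x.dtype)
    ys = []
    for step in range(L):
        h = A[:, step] * h + B[:, step] * x[:, step]       # state update
        ys.append((C[:, step] * h).sum(-1, keepdim=True))  # readout y_t
    return torch.stack(ys, dim=1)                          # [Bsz, L, 1]
```

Because the recurrence is linear in `h`, the same outputs can also be computed as one structured matrix multiplication; that equivalence is the "duality" in the SSD name.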

## Model Variants

| Variant   | Parameters | Hidden Dim | Stages | Blocks/Stage | T4 VRAM |
|-----------|------------|------------|--------|--------------|---------|
| **Tiny**  | ~2M        | 128        | 2      | 2            | < 2 GB  |
| **Small** | ~8M        | 256        | 4      | 4            | ~4 GB   |
| **Base**  | ~30M       | 384        | 6      | 6            | ~8 GB   |
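
To check which variant fits your GPU, you can instantiate each one and count its parameters; this assumes the same `create_liquidflow` signature used in the Quick Start below.

```python
from liquid_flow.generator import create_liquidflow

for variant in ("tiny", "small", "base"):
    model = create_liquidflow(variant=variant, image_size=128)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{variant}: {n_params / 1e6:.1f}M parameters")
```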

## Quick Start

### Google Colab

[Open in Colab](https://colab.research.google.com/github/krystv/LiquidFlow-Gen/blob/main/LiquidFlow_Colab.ipynb)

1. Open the notebook
2. Runtime → Change runtime type → **GPU (T4)**
3. Run all cells

### Local Training

```bash
# Clone
git clone https://huggingface.co/krystv/LiquidFlow-Gen
cd LiquidFlow-Gen

# Install
pip install torch torchvision diffusers tqdm pillow numpy

# Train (small model, 128px, CIFAR-10)
python train.py \
  --dataset cifar10 \
  --image_size 128 \
  --variant small \
  --batch_size 32 \
  --epochs 100 \
  --lr 2e-4

# Train (base model, 512px)
python train.py \
  --dataset cifar10 \
  --image_size 512 \
  --variant base \
  --batch_size 8 \
  --epochs 200 \
  --lr 1e-4
```

### Generate Samples

```python
import torch

from liquid_flow.generator import create_liquidflow
from liquid_flow.vae_wrapper import TAESDWrapper

# Load model
model = create_liquidflow(variant='small', image_size=128)
model.load_state_dict(torch.load('best_model.pt'))
model = model.cuda().eval()

# Load VAE
vae = TAESDWrapper.load('cuda')

# Generate
latents = model.sample(batch_size=16, steps=50, ddim=True)
images = TAESDWrapper.decode(vae, latents)
```

## Training Details

### Default Hyperparameters
- Optimizer: AdamW (β₁ = 0.9, β₂ = 0.999)
- LR: 2×10⁻⁴ (tiny/small), 1×10⁻⁴ (base)
- Weight Decay: 10⁻⁴
- LR Schedule: Cosine annealing
- Gradient Clipping: 1.0
- AMP: Enabled (when CUDA is available)
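
Wired together, these defaults correspond to a loop along the following lines. This is only a sketch (the real loop is in `liquid_flow/trainer.py`); `model`, `dataloader`, `num_epochs`, and `compute_loss` are placeholders.

```python
import torch

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4,
                              betas=(0.9, 0.999), weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
use_amp = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in range(num_epochs):
    for batch in dataloader:
        optimizer.zero_grad(set_to_none=True)
        with torch.autocast("cuda", enabled=use_amp):
            loss = compute_loss(model, batch)      # denoising MSE + physics terms
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)                 # unscale before clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()                               # cosine annealing once per epoch
```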

### Physics Regularization Weights
- TV (Total Variation): 0.01
- Conservation of Intensity: 0.001
- Spectral Regularizer: 0.01
- Gradient Penalty: 0.001
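
As a concrete example of how such terms enter the loss, here is a minimal sketch of a TV plus gradient-matching penalty using the weights above. It is illustrative only; the actual regularizers (including the spectral and intensity-conservation terms) live in `liquid_flow/physics_loss.py`.

```python
import torch

def total_variation(x):
    """Anisotropic TV: mean absolute difference between neighbouring positions."""
    dh = (x[..., 1:, :] - x[..., :-1, :]).abs().mean()
    dw = (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
    return dh + dw

def physics_regularizer(pred, target, tv_w=0.01, grad_w=0.001):
    """TV smoothness on the prediction plus a penalty matching spatial gradients."""
    tv = total_variation(pred)
    grad_diff = torch.gradient(pred, dim=-1)[0] - torch.gradient(target, dim=-1)[0]
    return tv_w * tv + grad_w * grad_diff.pow(2).mean()
```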

### Datasets Supported
- CIFAR-10, CIFAR-100, STL-10
- CelebA, LSUN (requires download)
- ImageNet (provide path)

## Mobile Deployment

LiquidFlow uses pure PyTorch, with **no custom CUDA kernels**:

```python
# Export to ONNX (x and t are dummy example inputs used only for tracing)
x = torch.randn(1, 4, 16, 16)      # dummy noisy latent [B, C, H, W]
t = torch.randint(0, 1000, (1,))   # dummy diffusion timestep
torch.onnx.export(model, (x, t), 'liquidflow.onnx',
                  input_names=['noisy_latent', 'timestep'],
                  output_names=['predicted_noise'],
                  opset_version=14)

# Convert to CoreML (iOS)
# coremltools.converters.onnx.convert(model='liquidflow.onnx')

# Convert to TFLite (Android)
# onnx-tf convert -i liquidflow.onnx -o liquidflow.pb
```
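
After export, you can sanity-check the ONNX graph with onnxruntime before converting further; the input names match the export call above, while the dtypes are assumptions that depend on how the model was traced.

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("liquidflow.onnx", providers=["CPUExecutionProvider"])
noisy = np.random.randn(1, 4, 16, 16).astype(np.float32)   # matches the 16×16×4 latent
timestep = np.array([500], dtype=np.int64)

(pred_noise,) = sess.run(None, {"noisy_latent": noisy, "timestep": timestep})
print(pred_noise.shape)   # expected to match the latent shape, e.g. (1, 4, 16, 16)
```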

## Why This Works (Research Validation)

### DiMSUM (NeurIPS 2024)
Mamba-based diffusion beats DiT transformers on ImageNet generation (FID 2.11 vs 2.27). Mamba's O(N) complexity enables 3× faster convergence than attention-based models.

### PINNMamba (ICML 2025)
SSM + Physics constraints are compatible and synergistic. Mamba's selective scan naturally handles the spatio-temporal nature of PDE residuals.

### LiteVAE / TAESD
Wavelet-based and tiny VAEs provide sufficient latent quality for diffusion at < 1% of the parameter count of standard VAEs. TAESD is used by 100+ real-time diffusion demos on HF Spaces.

### DeepSeek V3 Insights
- Auxiliary-loss-free training (applied here to the physics-loss weights)
- Multi-head architecture for efficiency
- DualPipe for overlapping computation

## Repository Structure

```
LiquidFlow-Gen/
├── liquid_flow/
│   ├── __init__.py           # Package init
│   ├── cfc_cell.py           # CfC Liquid NN implementation
│   ├── mamba2_ssd.py         # Mamba-2 SSD implementation
│   ├── liquid_flow_block.py  # Hybrid CfC+Mamba block
│   ├── generator.py          # Full diffusion generator
│   ├── vae_wrapper.py        # VAE interfaces
│   ├── physics_loss.py       # Physics regularizers
│   └── trainer.py            # Training utilities
├── train.py                  # CLI training script
├── LiquidFlow_Colab.ipynb    # Colab notebook
└── README.md                 # This file
```

## Citations

```bibtex
@article{hasani2022cfc,
  title={Closed-form continuous-time neural networks},
  author={Hasani, Ramin and Lechner, Mathias and Amini, Alexander and others},
  journal={Nature Machine Intelligence},
  year={2022}
}

@article{dao2024mamba2,
  title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
  author={Dao, Tri and Gu, Albert},
  journal={arXiv preprint arXiv:2405.21060},
  year={2024}
}

@inproceedings{bastek2025physics,
  title={Physics-Informed Diffusion Models},
  author={Bastek, Jan-Hendrik and Sun, WaiChing},
  booktitle={ICLR},
  year={2025}
}

@inproceedings{pham2024dimsum,
  title={DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation},
  author={Pham, Hao and others},
  booktitle={NeurIPS},
  year={2024}
}
```

## License
MIT