# LiquidFlow: Liquid Neural Network × Mamba-2 SSD Image Generator
**A lightweight, physics-informed image generator combining Liquid Neural Networks (CfC) with Mamba-2 State Space Duality. Trainable on the Google Colab free tier and deployable on mobile devices.**
[![Model on HF](https://img.shields.io/badge/🤗-LiquidFlow--Gen-blue)](https://huggingface.co/krystv/LiquidFlow-Gen)
[![Paper: CfC](https://img.shields.io/badge/📄-CfC_Nature_MI_2022-green)](https://arxiv.org/abs/2106.13898)
[![Paper: Mamba-2](https://img.shields.io/badge/📄-Mamba2_2024-orange)](https://arxiv.org/abs/2405.21060)
[![Paper: PINN Diffusion](https://img.shields.io/badge/📄-PINN_Diff_ICLR_2025-red)](https://arxiv.org/abs/2403.14404)
## Architecture
```
                     ┌────────────────────────────────┐
Image [128×128]  →   │           TAESD VAE            │  →  Latent [16×16×4]
                     │         (< 1M params)          │
                     └────────────────────────────────┘
                                     ↓
                     ┌────────────────────────────────┐
                     │      LiquidFlow Backbone       │
                     │                                │
                     │  ┌──────────────────────────┐  │
                     │  │  LiquidMamba Block (×N)  │  │
                     │  │                          │  │
                     │  │     Input → CfC Gate     │  │
                     │  │            ↓             │  │
                     │  │       Mamba-2 SSD        │  │
                     │  │     (multi-dir scan)     │  │
                     │  │            ↓             │  │
                     │  │    CfC Gate → Output     │  │
                     │  └──────────────────────────┘  │
                     │                                │
                     │    + Physics-Informed Loss     │
                     │   (TV + Spectral + Gradient)   │
                     └────────────────────────────────┘
                                     ↓
                               Predicted Noise
```
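One way to picture the "multi-dir scan" in the diagram is to flatten the 16×16 latent grid into several 1D token orders, one per direction, so that a causal sequence model sees each position with context from more than one side. The sketch below is illustrative only: the choice of four directions and the names `scan_orders` and `side` are assumptions, not taken from `mamba2_ssd.py`.

```python
def scan_orders(side):
    """Return four flattening orders for a side x side grid.

    Each order is a list of (row, col) positions. The four directions
    (row-major, column-major, and their reverses) are an assumed choice.
    """
    row_major = [(r, c) for r in range(side) for c in range(side)]
    col_major = [(r, c) for c in range(side) for r in range(side)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


orders = scan_orders(16)  # 16x16 latent, as in the diagram
for name, order in orders.items():
    print(name, order[:3], "...", len(order), "tokens")
```

Every order visits the same 256 positions; only the sequence in which the scan reaches them changes.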
### Core Innovations
1. **CfC (Closed-form Continuous-time) Liquid Neural Networks**
- `h(t) = σ(-f(x,I)·t) ⊙ g(x,I) + (1-σ(-f(x,I)·t)) ⊙ h(x,I)`
- No ODE solver required: about 100× faster than Neural ODEs
- Time-continuous adaptive gating mechanism
- From: Hasani et al., Nature Machine Intelligence (2022)
2. **Mamba-2 SSD (State Space Duality)**
- `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t`
- O(N) linear complexity (vs. O(N²) for attention)
- Fully parallelizable via associative scan
- Pure PyTorch: no CUDA kernels needed
- From: Dao & Gu, "Transformers are SSMs" (2024)
3. **Physics-Informed Regularization**
- Total Variation + Spectral + Gradient constraints
- Training-only regularizer β€” zero inference cost
- Pattern from: Bastek & Sun, ICLR 2025
4. **TAESD VAE**
- < 1M parameters, 84× smaller than the SD VAE
- Near-instant encoding/decoding
- From: madebyollin/taesd
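The CfC closed form from item 1 can be checked numerically with scalars. This is a minimal sketch, assuming the three branches `f`, `g`, `h` have already been evaluated (in the real cell they are network heads over the state and input); the function names here are hypothetical and not taken from `cfc_cell.py`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cfc_state(f, g, h, t):
    """Closed-form CfC state for scalar branch outputs f, g, h at time t.

    f, g, h stand in for the three heads f(x,I), g(x,I), h(x,I).
    """
    gate = sigmoid(-f * t)  # time-dependent gate, no ODE solver involved
    return gate * g + (1.0 - gate) * h


# With f > 0 the state starts at the midpoint of g and h (t = 0)
# and moves toward h as t grows.
for t in (0.0, 1.0, 10.0):
    print(t, round(cfc_state(f=1.0, g=2.0, h=-1.0, t=t), 4))
```

The gate is a plain sigmoid of `-f·t`, so evaluating the state at any time is a single forward computation rather than a numerical integration.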
## Model Variants
| Variant | Parameters | Hidden Dim | Stages | Blocks/Stage | T4 VRAM |
|---------|-----------|------------|--------|--------------|---------|
| **Tiny** | ~2M | 128 | 2 | 2 | < 2 GB |
| **Small** | ~8M | 256 | 4 | 4 | ~4 GB |
| **Base** | ~30M | 384 | 6 | 6 | ~8 GB |
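The table can be read as a small configuration record per variant. The sketch below simply restates it in code; `VariantConfig` and `VARIANTS` are hypothetical names, and the total block count assumes every stage holds the same number of blocks, which the table suggests but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    hidden_dim: int
    stages: int
    blocks_per_stage: int

    @property
    def total_blocks(self):
        # Assumes every stage holds the same number of LiquidMamba blocks.
        return self.stages * self.blocks_per_stage


# Values copied from the table above; the dict name is hypothetical.
VARIANTS = {
    "tiny": VariantConfig(hidden_dim=128, stages=2, blocks_per_stage=2),
    "small": VariantConfig(hidden_dim=256, stages=4, blocks_per_stage=4),
    "base": VariantConfig(hidden_dim=384, stages=6, blocks_per_stage=6),
}

for name, cfg in VARIANTS.items():
    print(name, cfg.hidden_dim, cfg.total_blocks, "blocks")
```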
## Quick Start
### Google Colab
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/krystv/LiquidFlow-Gen/blob/main/LiquidFlow_Colab.ipynb)
1. Open the notebook
2. Runtime → Change runtime type → **GPU (T4)**
3. Run all cells
### Local Training
```bash
# Clone
git clone https://huggingface.co/krystv/LiquidFlow-Gen
cd LiquidFlow-Gen
# Install
pip install torch torchvision diffusers tqdm pillow numpy
# Train (small model, 128px, CIFAR-10)
python train.py \
--dataset cifar10 \
--image_size 128 \
--variant small \
--batch_size 32 \
--epochs 100 \
--lr 2e-4
# Train (base model, 512px)
python train.py \
--dataset cifar10 \
--image_size 512 \
--variant base \
--batch_size 8 \
--epochs 200 \
--lr 1e-4
```
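The smaller batch size in the 512px command is easier to motivate once the sequence length is written out. The sketch below assumes the VAE downsamples each side by 8×, which is inferred from the 128 → 16 mapping in the architecture diagram; `latent_tokens` is a hypothetical helper, not part of `train.py`.

```python
def latent_tokens(image_size, downsample=8):
    """Number of latent positions the backbone scans per image.

    downsample=8 is inferred from the 128 -> 16 mapping in the diagram.
    """
    if image_size % downsample:
        raise ValueError("image_size must be divisible by the downsample factor")
    side = image_size // downsample
    return side * side


for size in (128, 512):
    print(size, "px ->", latent_tokens(size), "latent positions")
```

Under that assumption a 512px image yields 16 times as many latent positions as a 128px image, so per-sample activation memory grows accordingly.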
### Generate Samples
```python
import torch

from liquid_flow.generator import create_liquidflow
from liquid_flow.vae_wrapper import TAESDWrapper

# Load the trained backbone weights
model = create_liquidflow(variant='small', image_size=128)
model.load_state_dict(torch.load('best_model.pt', map_location='cuda'))
model = model.cuda().eval()

# Load the TAESD VAE
vae = TAESDWrapper.load('cuda')

# Sample latents with 50 DDIM steps, then decode them to images
with torch.no_grad():
    latents = model.sample(batch_size=16, steps=50, ddim=True)
    images = TAESDWrapper.decode(vae, latents)
```
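For readers unfamiliar with what `steps=50, ddim=True` implies, here is a scalar sketch of deterministic DDIM sampling. It is not the code behind `model.sample`: the 1000 training timesteps, the even timestep spacing, and the names `ddim_timesteps`, `ddim_step`, `abar_t`, and `abar_prev` (cumulative products of the noise schedule's alphas) are all assumptions made for illustration.

```python
import math


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly spaced, descending subset of the training timesteps."""
    stride = num_train_steps // num_sample_steps
    return list(range(num_train_steps - 1, -1, -stride))[:num_sample_steps]


def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0) for a scalar latent value."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(abar_t)
    # Re-noise that estimate to the (lower) noise level of the previous step.
    return math.sqrt(abar_prev) * x0_pred + math.sqrt(1.0 - abar_prev) * eps_pred


print(ddim_timesteps(1000, 50)[:5])

# Sanity check: given the true noise, one step lands on the less-noisy sample.
x0, eps = 0.7, -0.3
abar_t, abar_prev = 0.5, 0.8
x_t = math.sqrt(abar_t) * x0 + math.sqrt(1.0 - abar_t) * eps
x_prev = ddim_step(x_t, eps, abar_t, abar_prev)
print(x_prev, math.sqrt(abar_prev) * x0 + math.sqrt(1.0 - abar_prev) * eps)
```

In the actual sampler `eps_pred` would be the network's predicted noise for the current latent and timestep, applied elementwise over the whole latent tensor.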
## Training Details
### Default Hyperparameters
- Optimizer: AdamW (β₁=0.9, β₂=0.999)
- LR: 2×10⁻⁴ (tiny/small), 1×10⁻⁴ (base)
- Weight Decay: 10⁻⁴
- LR Schedule: Cosine annealing
- Gradient Clipping: 1.0
- AMP: Enabled (when CUDA available)
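The cosine annealing schedule listed above has a simple closed form, sketched here in plain Python. It assumes one scheduler step per epoch, a minimum learning rate of 0, and no warmup or restarts; none of these are confirmed by the list, and `cosine_lr` is a hypothetical helper rather than a function from `trainer.py`.

```python
import math


def cosine_lr(epoch, total_epochs, base_lr=2e-4, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch (no restarts)."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 25, 50, 75, 100):
    print(epoch, f"{cosine_lr(epoch, total_epochs=100):.2e}")
```

This is the same closed form that PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR` documents, with `T_max` playing the role of `total_epochs`.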
### Physics Regularization Weights
- TV (Total Variation): 0.01
- Conservation of Intensity: 0.001
- Spectral Regularizer: 0.01
- Gradient Penalty: 0.001
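To show how such weights combine into a single penalty, the sketch below implements two of the four terms on small nested-list images. The term definitions (anisotropic total variation, squared difference of mean intensity) are common choices assumed here for illustration; the actual formulas in `physics_loss.py` may differ, and the spectral and gradient terms are omitted.

```python
def total_variation(img):
    """Mean absolute difference between neighbouring pixels (anisotropic TV)."""
    rows, cols = len(img), len(img[0])
    horiz = [abs(img[r][c + 1] - img[r][c]) for r in range(rows) for c in range(cols - 1)]
    vert = [abs(img[r + 1][c] - img[r][c]) for r in range(rows - 1) for c in range(cols)]
    return (sum(horiz) + sum(vert)) / (len(horiz) + len(vert))


def mean_intensity(img):
    return sum(map(sum, img)) / (len(img) * len(img[0]))


def intensity_conservation(pred, target):
    """Squared difference between the mean intensities of two images."""
    return (mean_intensity(pred) - mean_intensity(target)) ** 2


# Weights from the list above; spectral and gradient terms are left out here.
WEIGHTS = {"tv": 0.01, "conservation": 0.001}


def physics_penalty(pred, target):
    return (WEIGHTS["tv"] * total_variation(pred)
            + WEIGHTS["conservation"] * intensity_conservation(pred, target))


flat = [[0.5, 0.5], [0.5, 0.5]]
checker = [[0.0, 1.0], [1.0, 0.0]]
print(physics_penalty(flat, flat), physics_penalty(checker, flat))
```

The flat image incurs no penalty, while the checkerboard is penalized only through the TV term because its mean intensity matches the target.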
### Datasets Supported
- CIFAR-10, CIFAR-100, STL-10
- CelebA, LSUN (requires download)
- ImageNet (provide path)
## Mobile Deployment
LiquidFlow uses pure PyTorch, with **no custom CUDA kernels**:
```python
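import torch

# Dummy inputs for tracing. Shapes are assumptions: one 4-channel 16×16 latent
# (channels-first) and one integer timestep.
x = torch.randn(1, 4, 16, 16)
t = torch.tensor([500])
# Export to ONNX
torch.onnx.export(model.cpu().eval(), (x, t), 'liquidflow.onnx',
                  input_names=['noisy_latent', 'timestep'],
                  output_names=['predicted_noise'],
                  opset_version=14)
# Convert to CoreML (iOS). Recent coremltools releases no longer ship the ONNX
# converter, so this call may require an older release.
# coremltools.converters.onnx.convert(model='liquidflow.onnx')
# Convert to a TensorFlow graph as a first step toward TFLite (Android)
# onnx-tf convert -i liquidflow.onnx -o liquidflow.pb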
```
## Why This Works (Research Validation)
### DiMSUM (NeurIPS 2024)
Mamba-based diffusion beats DiT transformers on ImageNet generation (FID 2.11 vs. 2.27). Mamba's O(N) complexity enables 3× faster convergence than attention-based models.
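The O(N) claim refers to the recurrence `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t` listed under Core Innovations. The sketch below uses a scalar state (a simplifying assumption; the real layer uses vector states per head) to show the "duality": the linear-time scan and a quadratic, attention-like sum over all earlier positions produce the same outputs. The function names are hypothetical.

```python
def ssm_recurrent(a, b, c, x):
    """Linear-time scan: h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t."""
    h, ys = 0.0, []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = a_t * h + b_t * x_t
        ys.append(c_t * h)
    return ys


def ssm_quadratic(a, b, c, x):
    """Attention-like form: y_t = sum_{s<=t} c_t * (a_{s+1} ... a_t) * b_s * x_s."""
    ys = []
    for t in range(len(x)):
        total, decay = 0.0, 1.0
        for s in range(t, -1, -1):
            total += c[t] * decay * b[s] * x[s]
            decay *= a[s]  # extend the product to cover the next, earlier s
        ys.append(total)
    return ys


a = [0.9, 0.8, 0.7, 0.6]
b = [1.0, 0.5, -1.0, 2.0]
c = [1.0, 1.0, 0.5, -1.0]
x = [1.0, 2.0, 3.0, 4.0]
print(ssm_recurrent(a, b, c, x))
print(ssm_quadratic(a, b, c, x))
```

The recurrent form touches each position once, while the quadratic form revisits every earlier position for each output, which is the cost profile of attention.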
### PINNMamba (ICML 2025)
SSM + Physics constraints are compatible and synergistic. Mamba's selective scan naturally handles the spatio-temporal nature of PDE residuals.
### LiteVAE / TAESD
Wavelet-based and tiny VAEs provide sufficient latent quality for diffusion at < 1% of the parameter count of standard VAEs. TAESD is used by 100+ real-time diffusion demos on HF Spaces.
### DeepSeek V3 Insights
- Auxiliary-loss-free training (apply to physics weights)
- Multi-head architecture for efficiency
- DualPipe for overlapping computation
## Repository Structure
```
LiquidFlow-Gen/
├── liquid_flow/
│   ├── __init__.py              # Package init
│   ├── cfc_cell.py              # CfC Liquid NN implementation
│   ├── mamba2_ssd.py            # Mamba-2 SSD implementation
│   ├── liquid_flow_block.py     # Hybrid CfC+Mamba block
│   ├── generator.py             # Full diffusion generator
│   ├── vae_wrapper.py           # VAE interfaces
│   ├── physics_loss.py          # Physics regularizers
│   └── trainer.py               # Training utilities
├── train.py                     # CLI training script
├── LiquidFlow_Colab.ipynb       # Colab notebook
└── README.md                    # This file
```
## Citations
```bibtex
@article{hasani2022cfc,
title={Closed-form continuous-time neural networks},
author={Hasani, Ramin and Lechner, Mathias and Amini, Alexander and others},
journal={Nature Machine Intelligence},
year={2022}
}
@article{dao2024mamba2,
title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
author={Dao, Tri and Gu, Albert},
journal={arXiv:2405.21060},
year={2024}
}
@inproceedings{bastek2025physics,
title={Physics-Informed Diffusion Models},
author={Bastek, Jan-Hendrik and Sun, WaiChing},
booktitle={ICLR},
year={2025}
}
@article{phung2024dimsum,
title={DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation},
author={Phung, Hao and others},
journal={NeurIPS},
year={2024}
}
```
## License
MIT