	# LiquidFlow: Liquid Neural Network × Mamba-2 SSD Image Generator

	A lightweight, physics-informed image generator combining Liquid Neural Networks (CfC) with Mamba-2 State Space Duality — trainable on Google Colab free tier, deployable on mobile devices.

	[![Model on HF](https://img.shields.io/badge/🤗-LiquidFlow--Gen-blue)](https://huggingface.co/krystv/LiquidFlow-Gen)
	[![Paper: CfC](https://img.shields.io/badge/📄-CfC_Nature_MI_2022-green)](https://arxiv.org/abs/2106.13898)
	[![Paper: Mamba-2](https://img.shields.io/badge/📄-Mamba2_2024-orange)](https://arxiv.org/abs/2405.21060)
	[![Paper: PINN Diffusion](https://img.shields.io/badge/📄-PINN_Diff_ICLR_2025-red)](https://arxiv.org/abs/2403.14404)

	## Architecture

	```
	                  ┌────────────────────────────────┐
	Image [128×128] → │           TAESD VAE            │ → Latent [16×16×4]
	                  │         (< 1M params)          │
	                  └────────────────────────────────┘
	                                  ↓
	                  ┌────────────────────────────────┐
	                  │      LiquidFlow Backbone       │
	                  │                                │
	                  │  ┌──────────────────────────┐  │
	                  │  │  LiquidMamba Block (×N)  │  │
	                  │  │                          │  │
	                  │  │  Input → CfC Gate        │  │
	                  │  │            ↓             │  │
	                  │  │        Mamba-2 SSD       │  │
	                  │  │     (multi-dir scan)     │  │
	                  │  │            ↓             │  │
	                  │  │  CfC Gate → Output       │  │
	                  │  └──────────────────────────┘  │
	                  │                                │
	                  │  + Physics-Informed Loss       │
	                  │    (TV + Spectral + Gradient)  │
	                  └────────────────────────────────┘
	                                  ↓
	                           Predicted Noise
	```
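
The "multi-dir scan" step in the diagram implies that the 16×16 latent grid is flattened into 1-D token sequences in more than one order. This README does not say which directions are used, so the sketch below assumes a common choice of four orderings (row-major, column-major, and their reverses). `scan_orders` is a hypothetical helper for illustration, not part of the `liquid_flow` package.

```python
def scan_orders(height, width):
    """Return four flattening orders for a height x width grid.

    Each order is a list of row-major flat indices (row * width + col).
    The choice of these four directions is an assumption for illustration.
    """
    row_major = [r * width + c for r in range(height) for c in range(width)]
    col_major = [r * width + c for c in range(width) for r in range(height)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


orders = scan_orders(16, 16)   # 16x16 latent grid -> 256 tokens per direction
small = scan_orders(2, 3)      # tiny grid that is easy to inspect by hand
print(small["col_forward"])    # [0, 3, 1, 4, 2, 5]
```

Each ordering visits every latent position exactly once, so a sequence model run over several orderings sees the same tokens with different neighbours.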

	### Core Innovations

	1. CfC (Closed-form Continuous-time) Liquid Neural Networks
	- `h(t) = σ(-f(x,I)·t) ⊙ g(x,I) + (1-σ(-f(x,I)·t)) ⊙ h(x,I)`
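	- No numerical ODE solver: the state has a closed form, which the CfC paper reports to be one to five orders of magnitude faster than ODE-based counterparts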
	- Time-continuous adaptive gating mechanism
	- From: Hasani et al., Nature Machine Intelligence (2022)

	2. Mamba-2 SSD (State Space Duality)
	- `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t`
	- O(N) linear complexity (vs O(N²) attention)
	- Fully parallelizable via associative scan
	- Pure PyTorch — no CUDA kernels needed
	- From: Dao & Gu, "Transformers are SSMs" (2024)

	3. Physics-Informed Regularization
	- Total Variation + Spectral + Gradient constraints
	- Training-only regularizer — zero inference cost
	- Pattern from: Bastek & Sun, ICLR 2025

	4. TAESD VAE
	- < 1M parameters — 84× smaller than SD VAE
	- Near-instant encoding/decoding
	- From: madebyollin/taesd
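
To make the CfC formula in item 1 concrete, here is a minimal scalar sketch. It assumes `f`, `g`, and `h` are already-computed head outputs (in a real cell they are small networks of the state `x` and input `I`), and note that the `h(x,I)` head on the right-hand side is distinct from the state `h(t)` on the left. `cfc_update` is a hypothetical name, not the API of `cfc_cell.py`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cfc_update(f, g, h, t):
    """Closed-form CfC state for scalar head outputs f, g, h at time t.

    In a real cell f, g, and h are outputs of small neural networks that
    take the state x and input I; here they are plain numbers.
    """
    gate = sigmoid(-f * t)             # time-dependent mixing weight
    return gate * g + (1.0 - gate) * h


f, g, h = 2.0, 1.0, -1.0               # arbitrary illustrative values
at_zero = cfc_update(f, g, h, t=0.0)   # gate = 0.5, equal mix of g and h
late = cfc_update(f, g, h, t=10.0)     # gate ~ 0, output approaches h
print(at_zero, late)
```

The gate is evaluated directly at any `t`, which is why no iterative solver is involved: with positive `f`, the output moves smoothly from an even mix of `g` and `h` toward `h` as `t` grows.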

	## Model Variants

	| Variant | Parameters | Hidden Dim | Stages | Blocks/Stage | T4 VRAM |
	|---------|------------|------------|--------|--------------|---------|
	| Tiny    | ~2M        | 128        | 2      | 2            | < 2 GB  |
	| Small   | ~8M        | 256        | 4      | 4            | ~4 GB   |
	| Base    | ~30M       | 384        | 6      | 6            | ~8 GB   |

	## Quick Start

	### Google Colab

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/krystv/LiquidFlow-Gen/blob/main/LiquidFlow_Colab.ipynb)

	1. Open the notebook
	2. Runtime → Change runtime type → GPU (T4)
	3. Run all cells

	### Local Training

	```bash
	# Clone
	git clone https://huggingface.co/krystv/LiquidFlow-Gen
	cd LiquidFlow-Gen

	# Install
	pip install torch torchvision diffusers tqdm pillow numpy

	# Train (small model, 128px, CIFAR-10)
	python train.py \
	    --dataset cifar10 \
	    --image_size 128 \
	    --variant small \
	    --batch_size 32 \
	    --epochs 100 \
	    --lr 2e-4

	# Train (base model, 512px)
	python train.py \
	    --dataset cifar10 \
	    --image_size 512 \
	    --variant base \
	    --batch_size 8 \
	    --epochs 200 \
	    --lr 1e-4
	```
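
The two commands above differ mainly in `--image_size`, which sets how many latent positions the backbone has to scan. The sketch below estimates that count, assuming the 8× spatial downsampling implied by the 128×128 → 16×16 mapping in the architecture diagram also holds at 512px. `latent_tokens` is a hypothetical helper.

```python
def latent_tokens(image_size, downsample=8):
    """Number of latent positions for a square image.

    downsample=8 is inferred from the 128x128 -> 16x16 mapping in the
    architecture diagram; it is an assumption for other resolutions.
    """
    side = image_size // downsample
    return side * side


for size in (128, 512):
    n = latent_tokens(size)
    # Linear-time scan cost scales with n; pairwise attention with n * n.
    print(size, n, n * n)
```

Under that assumption, going from 128px to 512px multiplies the sequence length by 16 (256 → 4096 tokens), while a pairwise-attention cost would grow by 256×. This is one reason the smaller batch size is paired with the larger resolution.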

	### Generate Samples

	```python
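	import torch

	from liquid_flow.generator import create_liquidflow
	from liquid_flow.vae_wrapper import TAESDWrapper

	# Pick a device; fall back to CPU when CUDA is unavailable
	device = 'cuda' if torch.cuda.is_available() else 'cpu'

	# Load model
	model = create_liquidflow(variant='small', image_size=128)
	model.load_state_dict(torch.load('best_model.pt', map_location=device))
	model = model.to(device).eval()

	# Load VAE
	vae = TAESDWrapper.load(device)

	# Generate latents, then decode them to images (no gradients needed)
	with torch.no_grad():
	    latents = model.sample(batch_size=16, steps=50, ddim=True)
	    images = TAESDWrapper.decode(vae, latents)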
	```
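
The internals of `model.sample(..., steps=50, ddim=True)` are not documented in this README. As a rough guide to what a deterministic DDIM sampler generally does, the sketch below shows one update step on a scalar value and an evenly spaced 50-step timestep subsequence. The 1000-step training schedule, the `alpha_bar` values, and the names `ddim_step` and `timestep_subsequence` are all assumptions for illustration.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) for a scalar latent value."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    # Re-noise that estimate to the (lower) noise level of the previous step.
    return math.sqrt(alpha_bar_prev) * x0_pred + math.sqrt(1.0 - alpha_bar_prev) * eps_pred


def timestep_subsequence(train_steps, sample_steps):
    """Evenly spaced timesteps, ordered from high noise to low noise."""
    stride = train_steps // sample_steps
    return list(range(train_steps - stride, -1, -stride))


# Sanity check with a known clean value and known noise.
x0, eps, alpha_bar_t = 0.7, -0.3, 0.25
x_t = math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps
recovered = ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev=1.0)
steps = timestep_subsequence(train_steps=1000, sample_steps=50)
print(recovered, steps[:3], len(steps))
```

When the predicted noise equals the true noise, a single step to `alpha_bar_prev=1.0` recovers the clean value exactly. In practice the prediction is imperfect, which is why sampling is spread over many smaller steps.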

	## Training Details

	### Default Hyperparameters
	- Optimizer: AdamW (β₁=0.9, β₂=0.999)
	- LR: 2×10⁻⁴ (tiny/small), 1×10⁻⁴ (base)
	- Weight Decay: 10⁻⁴
	- LR Schedule: Cosine annealing
	- Gradient Clipping: 1.0
	- AMP: Enabled (when CUDA available)
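
For reference, the cosine schedule and gradient clipping listed above can be written out in a few lines of framework-free Python. This is a sketch of the standard formulas (the same ones behind `torch.optim.lr_scheduler.CosineAnnealingLR` and `torch.nn.utils.clip_grad_norm_`), not the code in `trainer.py`. The minimum learning rate of 0 and the per-step granularity are assumptions.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine-annealed learning rate at a given step."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_by_global_norm(grads, max_norm=1.0):
    """Scale a flat list of gradient values so their L2 norm is <= max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (norm + 1e-6))
    return [g * scale for g in grads]


print(cosine_lr(0, 100, 2e-4))            # start of training: base LR
print(cosine_lr(50, 100, 2e-4))           # halfway: about half the base LR
print(clip_by_global_norm([3.0, 4.0]))    # norm 5 is scaled down to about 1
```

Gradients whose norm is already below the threshold pass through unchanged, so clipping at 1.0 only affects occasional large updates.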

	### Physics Regularization Weights
	- TV (Total Variation): 0.01
	- Conservation of Intensity: 0.001
	- Spectral Regularizer: 0.01
	- Gradient Penalty: 0.001
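
The README does not define the individual regularizers, so the following is only a sketch of how such weighted terms are typically combined with the main diffusion loss. It implements an anisotropic total-variation term and an assumed mean-intensity term on small nested lists, and omits the spectral and gradient terms. All function names are hypothetical and may not match `physics_loss.py`.

```python
def total_variation(img):
    """Anisotropic TV: sum of absolute differences between neighbours."""
    rows, cols = len(img), len(img[0])
    vertical = sum(abs(img[r + 1][c] - img[r][c])
                   for r in range(rows - 1) for c in range(cols))
    horizontal = sum(abs(img[r][c + 1] - img[r][c])
                     for r in range(rows) for c in range(cols - 1))
    return vertical + horizontal


def intensity_conservation(pred, target):
    """Squared difference between mean intensities (assumed definition)."""
    def mean(img):
        return sum(map(sum, img)) / (len(img) * len(img[0]))
    return (mean(pred) - mean(target)) ** 2


# Weights from the list above; spectral and gradient terms are omitted here.
WEIGHTS = {"tv": 0.01, "intensity": 0.001}

pred = [[0.0, 1.0], [1.0, 0.0]]
target = [[0.5, 0.5], [0.5, 0.5]]
diffusion_loss = 0.2   # placeholder for the noise-prediction loss
terms = {"tv": total_variation(pred),
         "intensity": intensity_conservation(pred, target)}
total = diffusion_loss + sum(WEIGHTS[k] * terms[k] for k in terms)
print(terms, total)
```

Because the weights are small, the regularizers nudge the output toward smoother, intensity-consistent predictions without dominating the noise-prediction objective.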

	### Datasets Supported
	- CIFAR-10, CIFAR-100, STL-10
	- CelebA, LSUN (requires download)
	- ImageNet (provide path)

	## Mobile Deployment

	LiquidFlow uses pure PyTorch — no custom CUDA kernels:

	```python
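	import torch

	# Dummy inputs used only to trace the graph.
	# Shapes are assumptions: one 4-channel 16×16 latent and one integer timestep.
	x = torch.randn(1, 4, 16, 16)
	t = torch.tensor([500], dtype=torch.long)

	# Export to ONNX (model and dummy inputs must be on the same device)
	model = model.cpu().eval()
	torch.onnx.export(model, (x, t), 'liquidflow.onnx',
	                  input_names=['noisy_latent', 'timestep'],
	                  output_names=['predicted_noise'],
	                  opset_version=14)

	# Convert to CoreML (iOS)
	# Note: the ONNX converter was deprecated and later removed from coremltools; with recent releases, convert a traced PyTorch model instead.
	# coremltools.converters.onnx.convert(model='liquidflow.onnx')

	# Convert to TFLite (Android)
	# Note: onnx-tf produces a TensorFlow model; a separate TFLite conversion step is still needed afterwards.
	# onnx-tf convert -i liquidflow.onnx -o liquidflow.pb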
	```
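
When targeting mobile devices, a quick size estimate helps decide which variant and numeric precision to export. The sketch below multiplies the approximate parameter counts from the Model Variants table by bytes per parameter. It covers the backbone weights only (not the TAESD decoder, activations, or runtime overhead), and the fp16 and int8 rows are hypothetical since this README does not describe quantized exports.

```python
# Approximate parameter counts from the Model Variants table.
PARAMS = {"tiny": 2_000_000, "small": 8_000_000, "base": 30_000_000}
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_megabytes(variant, dtype):
    """Rough size of the weights alone, in MB (10**6 bytes)."""
    return PARAMS[variant] * BYTES_PER_PARAM[dtype] / 1e6


for variant in PARAMS:
    sizes = {d: weight_megabytes(variant, d) for d in BYTES_PER_PARAM}
    print(variant, sizes)
```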

	## Why This Works (Research Validation)

	### DiMSUM (NeurIPS 2024)
	Mamba-based diffusion beats DiT transformers on ImageNet generation (FID 2.11 vs 2.27). Mamba's O(N) complexity enables 3× faster convergence than attention-based models.
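
The O(N) cost comes from the linear recurrence `h_t = A_t·h_{t-1} + B_t·x_t`: each step does constant work, and because composing two steps is associative, segments can also be merged in any grouping. The scalar sketch below checks that a left-to-right loop and a pairwise tree reduction reach the same final state. It illustrates the associativity only. It is not the chunked SSD algorithm from the Mamba-2 paper and not necessarily what `mamba2_ssd.py` does, and all names are hypothetical.

```python
def sequential_scan(a, b):
    """h_t = a_t * h_{t-1} + b_t with h_{-1} = 0, computed left to right."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def combine(left, right):
    """Associative operator: apply segment `left`, then segment `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def tree_reduce(pairs):
    """Combine adjacent segments pairwise until one segment remains."""
    while len(pairs) > 1:
        merged = [combine(pairs[i], pairs[i + 1])
                  for i in range(0, len(pairs) - 1, 2)]
        if len(pairs) % 2:
            merged.append(pairs[-1])   # carry an unpaired last segment
        pairs = merged
    return pairs[0]


a = [0.5, 0.9, 0.8, 0.7]     # per-step decay A_t (scalar state)
b = [1.0, 2.0, -1.0, 0.5]    # per-step input term B_t * x_t
final_sequential = sequential_scan(a, b)[-1]
final_tree = tree_reduce(list(zip(a, b)))[1]
print(final_sequential, final_tree)
```

The tree reduction here returns only the final state. A full parallel scan would keep the intermediate prefixes as well, but it relies on the same `combine` operator.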

	### PINNMamba (ICML 2025)
	SSM + Physics constraints are compatible and synergistic. Mamba's selective scan naturally handles the spatio-temporal nature of PDE residuals.

	### LiteVAE / TAESD
	Wavelet-based and tiny VAEs provide sufficient latent quality for diffusion at roughly 1% of the parameter count of standard VAEs (84× smaller in the case of TAESD versus the SD VAE). TAESD is used by 100+ real-time diffusion demos on HF Spaces.

	### DeepSeek V3 Insights
	- Auxiliary-loss-free training (apply to physics weights)
	- Multi-head architecture for efficiency
	- DualPipe for overlapping computation

	## Repository Structure

	```
	LiquidFlow-Gen/
	├── liquid_flow/
	│ ├── __init__.py # Package init
	│ ├── cfc_cell.py # CfC Liquid NN implementation
	│ ├── mamba2_ssd.py # Mamba-2 SSD implementation
	│ ├── liquid_flow_block.py # Hybrid CfC+Mamba block
	│ ├── generator.py # Full diffusion generator
	│ ├── vae_wrapper.py # VAE interfaces
	│ ├── physics_loss.py # Physics regularizers
	│ └── trainer.py # Training utilities
	├── train.py # CLI training script
	├── LiquidFlow_Colab.ipynb # Colab notebook
	└── README.md # This file
	```

	## Citations

	```bibtex
	@article{hasani2022cfc,
	title={Closed-form continuous-time neural networks},
	author={Hasani, Ramin and Lechner, Mathias and Amini, Alexander and others},
	journal={Nature Machine Intelligence},
	year={2022}
	}

	@article{dao2024mamba2,
	title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
	author={Dao, Tri and Gu, Albert},
	journal={arXiv:2405.21060},
	year={2024}
	}

	@inproceedings{bastek2025physics,
	title={Physics-Informed Diffusion Models},
	author={Bastek, Jan-Hendrik and Sun, WaiChing},
	booktitle={ICLR},
	year={2025}
	}

	@article{phung2024dimsum,
	title={DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation},
	author={Phung, Hao and others},
	journal={NeurIPS},
	year={2024}
	}
	```

	## License
	MIT