# LiquidFlow: Liquid Neural Network × Mamba-2 SSD Image Generator

**A lightweight, physics-informed image generator combining Liquid Neural Networks (CfC) with Mamba-2 State Space Duality, trainable on the Google Colab free tier and deployable on mobile devices.**

[Model on Hugging Face](https://huggingface.co/krystv/LiquidFlow-Gen)
[CfC paper (arXiv:2106.13898)](https://arxiv.org/abs/2106.13898)
[Mamba-2 paper (arXiv:2405.21060)](https://arxiv.org/abs/2405.21060)
[Physics-informed diffusion paper (arXiv:2403.14404)](https://arxiv.org/abs/2403.14404)

## Architecture

```
                   ┌───────────────┐
 Image [128×128] ──►   TAESD VAE   ├──► Latent [16×16×4]
                   │ (< 1M params) │
                   └───────┬───────┘
                           │
                           ▼
          ┌────────────────────────────────┐
          │      LiquidFlow Backbone       │
          │                                │
          │  ┌──────────────────────────┐  │
          │  │  LiquidMamba Block (×N)  │  │
          │  │                          │  │
          │  │   Input → CfC Gate       │  │
          │  │           ↓              │  │
          │  │      Mamba-2 SSD         │  │
          │  │    (multi-dir scan)      │  │
          │  │           ↓              │  │
          │  │   CfC Gate → Output      │  │
          │  └──────────────────────────┘  │
          │                                │
          │  + Physics-Informed Loss       │
          │    (TV + Spectral + Gradient)  │
          └────────────────┬───────────────┘
                           │
                           ▼
                    Predicted Noise
```

### Core Innovations

1. **CfC (Closed-form Continuous-time) Liquid Neural Networks**
   - `h(t) = σ(-f(x,I)·t) ⊙ g(x,I) + (1 - σ(-f(x,I)·t)) ⊙ h(x,I)` (see the code sketch after this list)
   - No ODE solving; 100× faster than Neural ODEs
   - Time-continuous adaptive gating mechanism
   - From: Hasani et al., Nature Machine Intelligence (2022)

2. **Mamba-2 SSD (State Space Duality)**
   - `h_t = A_t·h_{t-1} + B_t·x_t`, `y_t = C_t^T·h_t` (also sketched below)
   - O(N) linear complexity (vs. O(N²) attention)
   - Fully parallelizable via associative scan
   - Pure PyTorch; no CUDA kernels needed
   - From: Dao & Gu, "Transformers are SSMs" (2024)

3. **Physics-Informed Regularization**
   - Total variation + spectral + gradient constraints
   - Training-only regularizer; zero inference cost
   - Pattern from: Bastek & Sun, ICLR 2025

4. **TAESD VAE**
   - < 1M parameters; 84× smaller than the SD VAE
   - Near-instant encoding/decoding
   - From: madebyollin/taesd

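The two update rules above are simple enough to write down directly. Below is a minimal, self-contained sketch of the math in plain PyTorch; it is not the repository's `cfc_cell.py` or `mamba2_ssd.py`, and the tensor shapes are illustrative assumptions only.

```python
import torch

def cfc_gate(f, g, h, t):
    """Closed-form CfC update: a sigmoid gate blends two learned branches.

    f, g, h: [B, D] tensors produced by small networks from (x, I);
    t: time constant(s), broadcastable to [B, D]. No ODE solver is needed.
    """
    gate = torch.sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h

def ssd_scan(A, B, C, x):
    """Sequential form of the Mamba-2 SSD recurrence (the associative scan
    used in practice computes the same result in parallel).

    A: [B, L]     per-step state decay
    B: [B, L, N]  input projection
    C: [B, L, N]  output projection
    x: [B, L]     one input channel, kept scalar for clarity
    """
    batch, length = x.shape
    state = torch.zeros(batch, B.shape[-1])
    ys = []
    for i in range(length):
        # h_t = A_t * h_{t-1} + B_t * x_t
        state = A[:, i:i + 1] * state + B[:, i] * x[:, i:i + 1]
        # y_t = C_t^T * h_t
        ys.append((C[:, i] * state).sum(-1))
    return torch.stack(ys, dim=1)
```

The closed-form gate is what removes the ODE solve from the loop, and the linear state recurrence is what keeps the sequence cost at O(N).
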
## Model Variants

| Variant | Parameters | Hidden Dim | Stages | Blocks/Stage | T4 VRAM |
|---------|------------|------------|--------|--------------|---------|
| **Tiny** | ~2M | 128 | 2 | 2 | < 2 GB |
| **Small** | ~8M | 256 | 4 | 4 | ~4 GB |
| **Base** | ~30M | 384 | 6 | 6 | ~8 GB |

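For orientation, the table corresponds to a configuration mapping roughly like the one below. The dictionary and its field names are hypothetical; the authoritative values live in `create_liquidflow` in `generator.py`.

```python
# Hypothetical variant table mirroring the "Model Variants" section above.
VARIANTS = {
    "tiny":  {"hidden_dim": 128, "stages": 2, "blocks_per_stage": 2},  # ~2M params
    "small": {"hidden_dim": 256, "stages": 4, "blocks_per_stage": 4},  # ~8M params
    "base":  {"hidden_dim": 384, "stages": 6, "blocks_per_stage": 6},  # ~30M params
}
```
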
## Quick Start

### Google Colab

[Open in Colab](https://colab.research.google.com/github/krystv/LiquidFlow-Gen/blob/main/LiquidFlow_Colab.ipynb)

1. Open the notebook
2. Runtime → Change runtime type → **GPU (T4)**
3. Run all cells

### Local Training

```bash
# Clone
git clone https://huggingface.co/krystv/LiquidFlow-Gen
cd LiquidFlow-Gen

# Install dependencies
pip install torch torchvision diffusers tqdm pillow numpy

# Train (small model, 128px, CIFAR-10)
python train.py \
    --dataset cifar10 \
    --image_size 128 \
    --variant small \
    --batch_size 32 \
    --epochs 100 \
    --lr 2e-4

# Train (base model, 512px)
python train.py \
    --dataset cifar10 \
    --image_size 512 \
    --variant base \
    --batch_size 8 \
    --epochs 200 \
    --lr 1e-4
```

### Generate Samples

```python
import torch

from liquid_flow.generator import create_liquidflow
from liquid_flow.vae_wrapper import TAESDWrapper

# Load the trained denoiser
model = create_liquidflow(variant='small', image_size=128)
model.load_state_dict(torch.load('best_model.pt'))
model = model.cuda().eval()

# Load the TAESD VAE used to decode latents back to pixels
vae = TAESDWrapper.load('cuda')

# Sample latents (DDIM, 50 steps) and decode to images
latents = model.sample(batch_size=16, steps=50, ddim=True)
images = TAESDWrapper.decode(vae, latents)
```

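To look at the results, the decoded batch can be written out as an image grid with torchvision. This assumes `images` is a `[B, 3, H, W]` float tensor roughly in `[0, 1]`; if the wrapper returns `[-1, 1]`, rescale with `(images + 1) / 2` first.

```python
from torchvision.utils import save_image

# Write a 4x4 grid of samples; clamp guards against slight out-of-range values.
save_image(images.clamp(0, 1), 'samples.png', nrow=4)
```
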
## Training Details

### Default Hyperparameters
- Optimizer: AdamW (β₁ = 0.9, β₂ = 0.999)
- LR: 2×10⁻⁴ (tiny/small), 1×10⁻⁴ (base)
- Weight decay: 10⁻⁴
- LR schedule: cosine annealing
- Gradient clipping: 1.0
- AMP: enabled (when CUDA is available)

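These defaults map directly onto standard PyTorch training machinery. A minimal sketch follows, assuming `model` and `train_loader` already exist and that the generator exposes some loss helper (here called `model.training_loss`, a hypothetical name); the real loop lives in `trainer.py`.

```python
import torch

epochs = 100
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4,
                              betas=(0.9, 0.999), weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
use_amp = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in range(epochs):
    for images, _ in train_loader:
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast(enabled=use_amp):
            loss = model.training_loss(images.cuda())   # hypothetical helper
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)                      # expose true grads to clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()                                    # cosine annealing per epoch
```
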
### Physics Regularization Weights
- TV (Total Variation): 0.01
- Conservation of Intensity: 0.001
- Spectral Regularizer: 0.01
- Gradient Penalty: 0.001

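A sketch of what these four terms can look like on a predicted batch versus its target (both `[B, C, H, W]`). The repository's `physics_loss.py` is the authoritative implementation; this only illustrates the terms and the weights listed above.

```python
import torch

def total_variation(x):
    # Mean absolute difference between neighbouring pixels (smoothness prior).
    dh = (x[..., 1:, :] - x[..., :-1, :]).abs().mean()
    dw = (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
    return dh + dw

def intensity_conservation(x, ref):
    # Penalise drift in total intensity per sample.
    return (x.mean(dim=(1, 2, 3)) - ref.mean(dim=(1, 2, 3))).pow(2).mean()

def spectral_penalty(x, ref):
    # Match the magnitude spectrum of the prediction to the reference.
    return (torch.fft.rfft2(x).abs() - torch.fft.rfft2(ref).abs()).pow(2).mean()

def gradient_penalty(x, ref):
    # Keep finite-difference image gradients close to the reference.
    gx = x[..., :, 1:] - x[..., :, :-1]
    gr = ref[..., :, 1:] - ref[..., :, :-1]
    return (gx - gr).pow(2).mean()

def physics_loss(pred, target):
    return (0.01  * total_variation(pred)
            + 0.001 * intensity_conservation(pred, target)
            + 0.01  * spectral_penalty(pred, target)
            + 0.001 * gradient_penalty(pred, target))
```
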
### Datasets Supported
- CIFAR-10, CIFAR-100, STL-10
- CelebA, LSUN (require a separate download)
- ImageNet (provide a local path)

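`train.py` builds its own input pipeline from the `--dataset` and `--image_size` flags; the sketch below only shows the kind of torchvision pipeline that is assumed when, for example, 32×32 CIFAR-10 images are trained at 128px.

```python
from torchvision import datasets, transforms

# Upsample CIFAR-10 to the training resolution and scale pixels to [-1, 1].
transform = transforms.Compose([
    transforms.Resize(128),
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),
])
train_set = datasets.CIFAR10(root='./data', train=True, download=True,
                             transform=transform)
```
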
## Mobile Deployment

LiquidFlow uses pure PyTorch with **no custom CUDA kernels**, so the denoiser exports with standard tooling:

```python
import torch

# Dummy inputs matching the 16×16×4 latent and a scalar timestep,
# placed on the same device as the loaded model.
device = next(model.parameters()).device
x = torch.randn(1, 4, 16, 16, device=device)
t = torch.tensor([500], device=device)

# Export to ONNX
torch.onnx.export(model, (x, t), 'liquidflow.onnx',
                  input_names=['noisy_latent', 'timestep'],
                  output_names=['predicted_noise'],
                  opset_version=14)

# Convert to CoreML (iOS; requires a coremltools release that still ships the ONNX frontend)
# coremltools.converters.onnx.convert(model='liquidflow.onnx')

# Convert to TFLite (Android)
# onnx-tf convert -i liquidflow.onnx -o liquidflow.pb
```

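As a quick sanity check of the exported graph, the ONNX file can be run with onnxruntime (an extra `pip install onnxruntime`). The input names and dtypes below follow the export call above; adjust the timestep dtype if your model takes a float timestep.

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession('liquidflow.onnx')
noise = session.run(None, {
    'noisy_latent': np.random.randn(1, 4, 16, 16).astype(np.float32),
    'timestep': np.array([500], dtype=np.int64),
})[0]
print(noise.shape)  # expected: (1, 4, 16, 16)
```
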
## Why This Works (Research Validation)

### DiMSUM (NeurIPS 2024)
Mamba-based diffusion beats DiT transformers on ImageNet generation (FID 2.11 vs 2.27). Mamba's O(N) complexity enables roughly 3× faster convergence than attention-based models.

### PINNMamba (ICML 2025)
SSMs and physics constraints are compatible and synergistic: Mamba's selective scan naturally handles the spatio-temporal structure of PDE residuals.

### LiteVAE / TAESD
Wavelet-based and tiny VAEs provide sufficient latent quality for diffusion at under 1% of the parameter count of standard VAEs. TAESD is used by 100+ real-time diffusion demos on HF Spaces.

### DeepSeek V3 Insights
- Auxiliary-loss-free training (applied here to the physics-loss weights)
- Multi-head architecture for efficiency
- DualPipe for overlapping computation

## Repository Structure

```
LiquidFlow-Gen/
├── liquid_flow/
│   ├── __init__.py            # Package init
│   ├── cfc_cell.py            # CfC Liquid NN implementation
│   ├── mamba2_ssd.py          # Mamba-2 SSD implementation
│   ├── liquid_flow_block.py   # Hybrid CfC+Mamba block
│   ├── generator.py           # Full diffusion generator
│   ├── vae_wrapper.py         # VAE interfaces
│   ├── physics_loss.py        # Physics regularizers
│   └── trainer.py             # Training utilities
├── train.py                   # CLI training script
├── LiquidFlow_Colab.ipynb     # Colab notebook
└── README.md                  # This file
```

## Citations

```bibtex
@article{hasani2022cfc,
  title={Closed-form continuous-time neural networks},
  author={Hasani, Ramin and Lechner, Mathias and Amini, Alexander and others},
  journal={Nature Machine Intelligence},
  year={2022}
}

@article{dao2024mamba2,
  title={Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
  author={Dao, Tri and Gu, Albert},
  journal={arXiv preprint arXiv:2405.21060},
  year={2024}
}

@inproceedings{bastek2025physics,
  title={Physics-Informed Diffusion Models},
  author={Bastek, Jan-Hendrik and Sun, WaiChing},
  booktitle={ICLR},
  year={2025}
}

@inproceedings{phung2024dimsum,
  title={DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation},
  author={Phung, Hao and others},
  booktitle={NeurIPS},
  year={2024}
}
```

## License

MIT