{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 🌊 LiquidDiffusion: Attention-Free Image Generation with Liquid Neural Networks\n", "\n", "**A novel image generation architecture** that replaces attention with Parallel CfC (Closed-form Continuous-depth) blocks from Liquid Neural Networks.\n", "\n", "## Key Innovations\n", "- **No attention mechanism** — all spatial mixing via multi-scale depthwise convolutions\n", "- **Fully parallelizable** — no sequential ODE solving loops (unlike original LTC/Neural ODE)\n", "- **Diffusion timestep IS the liquid time constant** — natural CfC-diffusion bridge\n", "- **Liquid relaxation residuals** — time-aware skip connections that adapt to noise level\n", "- **Fits in 16GB VRAM** — designed for Colab free tier (T4 GPU)\n", "\n", "## Architecture Based On\n", "- [CfC Networks](https://arxiv.org/abs/2106.13898) (Hasani et al., Nature Machine Intelligence 2022)\n", "- [LiquidTAD](https://arxiv.org/abs/2604.18274) — parallel liquid relaxation\n", "- [USM](https://arxiv.org/abs/2504.13499) — U-Shape architecture for diffusion\n", "- [Rectified Flow](https://arxiv.org/abs/2209.03003) — simplest flow matching objective\n", "\n", "## Training: Rectified Flow\n", "```\n", "x_t = (1-t)*x0 + t*noise, t ~ U[0,1]\n", "Loss = MSE(model(x_t, t), noise - x0) # velocity prediction\n", "```\n", "That's it — no noise schedule, no variance, just MSE on a straight-line velocity." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🔧 Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Install dependencies\n", "!pip install -q torch torchvision datasets Pillow matplotlib tqdm accelerate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Clone the repo\n", "!git clone https://huggingface.co/krystv/liquid-diffusion\n", "%cd liquid-diffusion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "print(f'PyTorch: {torch.__version__}')\n", "print(f'CUDA available: {torch.cuda.is_available()}')\n", "if torch.cuda.is_available():\n", " print(f'GPU: {torch.cuda.get_device_name(0)}')\n", " print(f'VRAM: {torch.cuda.get_device_properties(0).total_mem / 1e9:.1f} GB')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 📐 Architecture Overview\n", "\n", "The core innovation is the **ParallelCfCBlock** — a parallelized version of CfC (Closed-form Continuous-depth) networks adapted for 2D image features:\n", "\n", "```\n", "CfC Equation (Hasani et al. 2022, Eq. 10):\n", " x(t) = σ(-f·t) ⊙ g + (1 - σ(-f·t)) ⊙ h\n", "\n", "Our adaptation for image generation:\n", " backbone = SiLU(PointwiseConv(DepthwiseConv(features))) # shared spatial context\n", " f = Conv1x1(backbone) # time-constant gate\n", " g = DWConv→SiLU→Conv1x1(backbone) # \"from\" state\n", " h = DWConv→SiLU→Conv1x1(backbone) # \"to\" state (attractor)\n", " gate = σ(time_a(t_emb) · f - time_b(t_emb)) # liquid time gate\n", " cfc_out = gate · g + (1 - gate) · h # CfC interpolation\n", " \n", " # Liquid relaxation (from LiquidTAD):\n", " α = exp(-softplus(ρ) · |t|) # time-aware residual weight\n", " output = α · input + (1 - α) · cfc_out # adapts to noise level\n", "```\n", "\n", "The **diffusion timestep t** serves double duty:\n", "1. Standard: conditions the denoiser via AdaLN scale/shift\n", "2. Novel: acts as the CfC time parameter — controls interpolation between g and h\n", "\n", "This means: at low noise (t≈0), the gate is balanced → flexible processing.\n", "At high noise (t≈1), the gate saturates → specialized denoising." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🧪 Quick Test (verify model works)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Run the test suite\n", "!python test_model.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ⚙️ Training Configuration\n", "\n", "Choose your config based on GPU and target resolution:\n", "\n", "| Config | Params | Resolution | Batch Size | VRAM | Training Time |\n", "|--------|--------|-----------|------------|------|---------------|\n", "| tiny | ~8M | 256×256 | 8 | ~6GB | ~3h (100K steps) |\n", "| small | ~25M | 256×256 | 4 | ~10GB | ~6h (100K steps) |\n", "| base | ~65M | 512×512 | 2 | ~14GB | ~12h (100K steps) |\n", "\n", "Recommended datasets:\n", "- `huggan/CelebA-HQ` — 30K high-quality face images (256px)\n", "- `huggan/flowers-102-categories` — flowers (various)\n", "- `lambdalabs/naruto-blip-captions` — anime style (~1K)\n", "- `Norod78/simpsons-blip-captions` — cartoon style\n", "- Any folder of images" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#@title Training Configuration {display-mode: \"form\"}\n", "\n", "#@markdown ### Model\n", "model_size = \"tiny\" #@param [\"tiny\", \"small\", \"base\"]\n", "\n", "#@markdown ### Data\n", "dataset_name = \"huggan/CelebA-HQ\" #@param {type:\"string\"}\n", "image_column = \"image\" #@param {type:\"string\"}\n", "image_size = 256 #@param [64, 128, 256, 512] {type:\"integer\"}\n", "max_samples = 0 #@param {type:\"integer\"}\n", "\n", "#@markdown ### Training\n", "batch_size = 8 #@param {type:\"integer\"}\n", "learning_rate = 1e-4 #@param {type:\"number\"}\n", "weight_decay = 0.01 #@param {type:\"number\"}\n", "total_steps = 100000 #@param {type:\"integer\"}\n", "warmup_steps = 1000 #@param {type:\"integer\"}\n", "grad_clip = 1.0 #@param {type:\"number\"}\n", "ema_decay = 0.9999 #@param {type:\"number\"}\n", "time_sampling = \"logit_normal\" #@param [\"uniform\", \"logit_normal\"]\n", "\n", "#@markdown ### Sampling & Logging\n", "sample_every = 2000 #@param {type:\"integer\"}\n", "save_every = 5000 #@param {type:\"integer\"}\n", "num_sample_steps = 50 #@param {type:\"integer\"}\n", "num_sample_images = 4 #@param {type:\"integer\"}\n", "\n", "#@markdown ### Hardware\n", "use_amp = True #@param {type:\"boolean\"}\n", "amp_dtype = \"float16\" #@param [\"float16\", \"bfloat16\"]\n", "num_workers = 2 #@param {type:\"integer\"}\n", "\n", "# Auto-adjust batch size for resolution\n", "if image_size >= 512 and batch_size > 4:\n", " batch_size = min(batch_size, 2)\n", " print(f\"Auto-reduced batch_size to {batch_size} for {image_size}px\")\n", "\n", "if max_samples == 0:\n", " max_samples = None\n", "\n", "print(f\"\\nConfig: {model_size} model, {image_size}px, batch={batch_size}, lr={learning_rate}\")\n", "print(f\"Dataset: {dataset_name}, time_sampling={time_sampling}\")\n", "print(f\"Total steps: {total_steps:,}, AMP: {use_amp} ({amp_dtype})\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 📦 Load Dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datasets import load_dataset\n", "from liquid_diffusion.trainer import ImageDataset\n", "from torch.utils.data import DataLoader\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Load dataset\n", "print(f\"Loading {dataset_name}...\")\n", "dataset = ImageDataset(\n", " source=dataset_name,\n", " image_size=image_size,\n", " image_column=image_column,\n", " max_samples=max_samples,\n", ")\n", "print(f\"Dataset size: {len(dataset)} images\")\n", "\n", "dataloader = DataLoader(\n", " dataset, batch_size=batch_size, shuffle=True,\n", " num_workers=num_workers, pin_memory=True, drop_last=True,\n", ")\n", "\n", "# Show some samples\n", "sample_batch = next(iter(dataloader))\n", "fig, axes = plt.subplots(1, min(4, batch_size), figsize=(16, 4))\n", "for i, ax in enumerate(axes):\n", " img = sample_batch[i].permute(1, 2, 0).numpy() * 0.5 + 0.5 # [-1,1] -> [0,1]\n", " ax.imshow(np.clip(img, 0, 1))\n", " ax.axis('off')\n", "plt.suptitle(f'Training samples ({image_size}×{image_size})')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🏗️ Build Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from liquid_diffusion.model import (\n", " liquid_diffusion_tiny, liquid_diffusion_small, liquid_diffusion_base\n", ")\n", "\n", "# Build model\n", "model_factories = {\n", " 'tiny': liquid_diffusion_tiny,\n", " 'small': liquid_diffusion_small,\n", " 'base': liquid_diffusion_base,\n", "}\n", "\n", "model = model_factories[model_size]()\n", "total_params, trainable_params = model.count_params()\n", "print(f\"Model: liquid_diffusion_{model_size}\")\n", "print(f\"Parameters: {total_params:,} ({total_params/1e6:.1f}M)\")\n", "print(f\"Trainable: {trainable_params:,}\")\n", "\n", "# Quick forward pass test\n", "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n", "model = model.to(device)\n", "test_x = torch.randn(1, 3, image_size, image_size, device=device)\n", "test_t = torch.tensor([0.5], device=device)\n", "with torch.no_grad():\n", " test_out = model(test_x, test_t)\n", "print(f\"Forward pass OK: {test_x.shape} → {test_out.shape}\")\n", "del test_x, test_out\n", "if device == 'cuda':\n", " torch.cuda.empty_cache()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🚀 Train!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import time\n", "import math\n", "from tqdm.auto import tqdm\n", "from torchvision.utils import save_image, make_grid\n", "from liquid_diffusion.trainer import RectifiedFlowTrainer, get_cosine_schedule_with_warmup\n", "\n", "# Create output directories\n", "os.makedirs('checkpoints', exist_ok=True)\n", "os.makedirs('samples', exist_ok=True)\n", "\n", "# Build trainer\n", "trainer = RectifiedFlowTrainer(\n", " model=model,\n", " lr=learning_rate,\n", " weight_decay=weight_decay,\n", " ema_decay=ema_decay,\n", " grad_clip=grad_clip,\n", " time_sampling=time_sampling,\n", " device=device,\n", " use_amp=use_amp,\n", " amp_dtype=amp_dtype,\n", ")\n", "\n", "# Learning rate scheduler\n", "scheduler = get_cosine_schedule_with_warmup(\n", " trainer.optimizer, warmup_steps, total_steps\n", ")\n", "\n", "# Optional: resume from checkpoint\n", "resume_path = 'checkpoints/latest.pt'\n", "if os.path.exists(resume_path):\n", " trainer.load_checkpoint(resume_path)\n", " print(f\"Resumed from step {trainer.step}\")\n", "\n", "print(f\"\\n{'='*60}\")\n", "print(f\"Starting training: {total_steps:,} steps\")\n", "print(f\"Model: liquid_diffusion_{model_size} ({total_params/1e6:.1f}M params)\")\n", "print(f\"Resolution: {image_size}×{image_size}, Batch: {batch_size}\")\n", "print(f\"LR: {learning_rate}, Warmup: {warmup_steps}, AMP: {use_amp}\")\n", "print(f\"{'='*60}\\n\")\n", "\n", "# Training loop\n", "start_time = time.time()\n", "data_iter = iter(dataloader)\n", "pbar = tqdm(range(trainer.step, total_steps), desc='Training', dynamic_ncols=True)\n", "loss_history = []\n", "\n", "for step in pbar:\n", " # Get batch (cycle through dataset)\n", " try:\n", " batch = next(data_iter)\n", " except StopIteration:\n", " data_iter = iter(dataloader)\n", " batch = next(data_iter)\n", " \n", " x0 = batch.to(device)\n", " \n", " # Train step\n", " metrics = trainer.train_step(x0)\n", " scheduler.step()\n", " \n", " # Logging\n", " loss_history.append(metrics['loss'])\n", " avg_loss = sum(loss_history[-100:]) / len(loss_history[-100:])\n", " lr_current = scheduler.get_last_lr()[0]\n", " \n", " pbar.set_postfix({\n", " 'loss': f\"{metrics['loss']:.4f}\",\n", " 'avg': f\"{avg_loss:.4f}\",\n", " 'lr': f\"{lr_current:.6f}\",\n", " 'gn': f\"{metrics['grad_norm']:.2f}\",\n", " })\n", " \n", " # Generate samples\n", " if (step + 1) % sample_every == 0 or step == 0:\n", " print(f\"\\nGenerating samples at step {step+1}...\")\n", " samples = trainer.sample(\n", " batch_size=num_sample_images, image_size=image_size,\n", " num_steps=num_sample_steps, use_ema=True\n", " )\n", " # Save grid\n", " grid = make_grid(samples * 0.5 + 0.5, nrow=int(math.sqrt(num_sample_images)), padding=2)\n", " save_image(grid, f'samples/step_{step+1:06d}.png')\n", " \n", " # Display\n", " fig, axes = plt.subplots(1, num_sample_images, figsize=(4*num_sample_images, 4))\n", " if num_sample_images == 1:\n", " axes = [axes]\n", " for i, ax in enumerate(axes):\n", " img = samples[i].cpu().permute(1, 2, 0).numpy() * 0.5 + 0.5\n", " ax.imshow(np.clip(img, 0, 1))\n", " ax.axis('off')\n", " plt.suptitle(f'Step {step+1} (EMA samples, {num_sample_steps} Euler steps)')\n", " plt.tight_layout()\n", " plt.show()\n", " \n", " # Save checkpoint\n", " if (step + 1) % save_every == 0:\n", " trainer.save_checkpoint(f'checkpoints/step_{step+1:06d}.pt', extra={'config': {\n", " 'model_size': model_size, 'image_size': image_size,\n", " 'batch_size': batch_size, 'learning_rate': learning_rate,\n", " }})\n", " trainer.save_checkpoint('checkpoints/latest.pt')\n", " print(f\"Saved checkpoint at step {step+1}\")\n", " \n", " # Safety: check for NaN\n", " if math.isnan(metrics['loss']):\n", " print(\"\\n⚠️ NaN loss detected! Stopping training.\")\n", " print(\"Try: reduce learning_rate, increase grad_clip, or use smaller model\")\n", " break\n", "\n", "elapsed = time.time() - start_time\n", "print(f\"\\nTraining complete! {trainer.step:,} steps in {elapsed/3600:.1f}h\")\n", "print(f\"Final avg loss: {sum(loss_history[-100:])/len(loss_history[-100:]):.4f}\")\n", "\n", "# Final save\n", "trainer.save_checkpoint('checkpoints/final.pt')\n", "print(\"Saved final checkpoint.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 📊 Training Loss Curve" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "if loss_history:\n", " # Smooth the loss\n", " window = min(100, len(loss_history) // 5 + 1)\n", " smoothed = np.convolve(loss_history, np.ones(window)/window, mode='valid')\n", " \n", " fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))\n", " \n", " ax1.plot(loss_history, alpha=0.3, label='Raw')\n", " ax1.plot(range(window-1, len(loss_history)), smoothed, label=f'Smoothed (w={window})')\n", " ax1.set_xlabel('Step')\n", " ax1.set_ylabel('Loss')\n", " ax1.set_title('Training Loss')\n", " ax1.legend()\n", " ax1.grid(True, alpha=0.3)\n", " \n", " ax2.plot(loss_history[-min(1000, len(loss_history)):], alpha=0.5)\n", " ax2.set_xlabel('Recent Steps')\n", " ax2.set_ylabel('Loss')\n", " ax2.set_title('Recent Loss (last 1000 steps)')\n", " ax2.grid(True, alpha=0.3)\n", " \n", " plt.tight_layout()\n", " plt.show()\n", "else:\n", " print(\"No training history yet.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🎨 Generate Images" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#@title Generation Settings {display-mode: \"form\"}\n", "num_images = 8 #@param {type:\"integer\"}\n", "sampling_steps = 50 #@param [25, 50, 100, 200] {type:\"integer\"}\n", "use_ema_model = True #@param {type:\"boolean\"}\n", "\n", "print(f\"Generating {num_images} images with {sampling_steps} Euler steps...\")\n", "samples = trainer.sample(\n", " batch_size=num_images, image_size=image_size,\n", " num_steps=sampling_steps, use_ema=use_ema_model,\n", ")\n", "\n", "# Display\n", "ncols = min(4, num_images)\n", "nrows = (num_images + ncols - 1) // ncols\n", "fig, axes = plt.subplots(nrows, ncols, figsize=(4*ncols, 4*nrows))\n", "if nrows == 1 and ncols == 1:\n", " axes = [[axes]]\n", "elif nrows == 1:\n", " axes = [axes]\n", "for i in range(num_images):\n", " r, c = i // ncols, i % ncols\n", " img = samples[i].cpu().permute(1, 2, 0).numpy() * 0.5 + 0.5\n", " axes[r][c].imshow(np.clip(img, 0, 1))\n", " axes[r][c].axis('off')\n", "# Hide unused axes\n", "for i in range(num_images, nrows * ncols):\n", " r, c = i // ncols, i % ncols\n", " axes[r][c].axis('off')\n", "plt.suptitle(f'LiquidDiffusion Samples ({sampling_steps} steps, {\"EMA\" if use_ema_model else \"online\"})')\n", "plt.tight_layout()\n", "plt.show()\n", "\n", "# Save\n", "grid = make_grid(samples * 0.5 + 0.5, nrow=ncols, padding=2)\n", "save_image(grid, 'samples/generated.png')\n", "print(\"Saved to samples/generated.png\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🔬 Visualize the Denoising Process" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Show step-by-step denoising\n", "num_vis_steps = 10\n", "total_euler_steps = 50\n", "vis_interval = total_euler_steps // num_vis_steps\n", "\n", "model_vis = trainer.ema_model\n", "model_vis.eval()\n", "\n", "z = torch.randn(1, 3, image_size, image_size, device=device)\n", "dt = 1.0 / total_euler_steps\n", "intermediates = [z.clone()]\n", "\n", "with torch.no_grad():\n", " for i in range(total_euler_steps, 0, -1):\n", " t = torch.full((1,), i / total_euler_steps, device=device)\n", " v = model_vis(z, t)\n", " z = z - v * dt\n", " if (total_euler_steps - i + 1) % vis_interval == 0:\n", " intermediates.append(z.clone())\n", "\n", "intermediates.append(z.clamp(-1, 1))\n", "\n", "fig, axes = plt.subplots(1, len(intermediates), figsize=(3*len(intermediates), 3))\n", "for idx, (ax, img_t) in enumerate(zip(axes, intermediates)):\n", " img = img_t[0].cpu().permute(1, 2, 0).numpy() * 0.5 + 0.5\n", " ax.imshow(np.clip(img, 0, 1))\n", " ax.axis('off')\n", " if idx == 0:\n", " ax.set_title('Noise (t=1)')\n", " elif idx == len(intermediates) - 1:\n", " ax.set_title('Output (t=0)')\n", " else:\n", " ax.set_title(f't={1-idx*vis_interval/total_euler_steps:.1f}')\n", "plt.suptitle('LiquidDiffusion Denoising Process')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 💾 Save & Export Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save final checkpoint\n", "trainer.save_checkpoint('checkpoints/final.pt', extra={\n", " 'config': {\n", " 'model_size': model_size,\n", " 'image_size': image_size,\n", " 'total_params': total_params,\n", " 'training_steps': trainer.step,\n", " 'dataset': dataset_name,\n", " }\n", "})\n", "print(f\"Saved checkpoint: checkpoints/final.pt\")\n", "print(f\"Model: liquid_diffusion_{model_size} ({total_params/1e6:.1f}M params)\")\n", "print(f\"Trained for {trainer.step:,} steps on {dataset_name}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: Push to Hugging Face Hub\n", "# Uncomment and fill in your details:\n", "\n", "# from huggingface_hub import HfApi, login\n", "# login() # or use token\n", "# api = HfApi()\n", "# repo_id = \"your-username/liquid-diffusion-celebahq-256\" # change this\n", "# api.create_repo(repo_id, exist_ok=True)\n", "# api.upload_file('checkpoints/final.pt', 'model.pt', repo_id)\n", "# api.upload_folder('liquid_diffusion/', 'liquid_diffusion/', repo_id)\n", "# print(f\"Uploaded to https://huggingface.co/{repo_id}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 📚 Architecture Details & Theory\n", "\n", "### Why Liquid Neural Networks for Image Generation?\n", "\n", "**Liquid Time-Constant (LTC) Networks** (Hasani et al., 2020) define neurons with input-dependent time constants:\n", "\n", "```\n", "dx/dt = -[1/τ + f(x,I,θ)] · x + f(x,I,θ) · A\n", "```\n", "\n", "The system time constant `τ_sys = τ/(1 + τ·f)` adapts dynamically based on input — the neuron speeds up or slows down its response depending on what it sees. This is the \"liquid\" property.\n", "\n", "**CfC (Closed-form Continuous-depth)** networks (Hasani et al., 2022) solve this ODE in closed form:\n", "\n", "```\n", "x(t) = σ(-f·t) ⊙ g + (1 - σ(-f·t)) ⊙ h\n", "```\n", "\n", "This eliminates the ODE solver — making CfC **fully parallelizable** while preserving the adaptive time constant behavior.\n", "\n", "### Our Innovation: CfC × Diffusion Timestep\n", "\n", "In diffusion models, the network must process images at different noise levels `t ∈ [0,1]`. We observe that:\n", "\n", "1. CfC's time parameter `t` controls interpolation between two learned states\n", "2. Diffusion's noise level `t` controls how the denoiser should behave\n", "3. **These are the same concept** — the CfC time parameter IS the diffusion timestep\n", "\n", "This gives us:\n", "- At `t≈0` (clean images): σ(-f·t)≈0.5, balanced processing for detail refinement\n", "- At `t≈1` (noisy images): σ(-f·t) saturates, specialized denoising\n", "- The gate `f` is **input-dependent** — different image content gets different time responses\n", "\n", "### References\n", "\n", "1. Hasani et al., \"Liquid Time-constant Networks\" (AAAI 2021) — arxiv:2006.04439\n", "2. Hasani et al., \"Closed-form Continuous-time Neural Networks\" (Nature MI 2022) — arxiv:2106.13898\n", "3. LiquidTAD: Parallel liquid relaxation — arxiv:2604.18274\n", "4. USM: U-Shape Mamba for diffusion — arxiv:2504.13499\n", "5. DiffuSSM: Diffusion without attention — arxiv:2311.18257\n", "6. Liu et al., \"Flow Straight and Fast: Rectified Flow\" (ICLR 2023) — arxiv:2209.03003" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python", "version": "3.10.0" } }, "nbformat": 4, "nbformat_minor": 0 }