Add SAII-CLDM inference pipeline

Add a unified infer.py entry point for SAII-LDDPM and SAII-CLDM, add the differentiable Overthrust forward operator, update Overthrust evaluation to support CLDM, and align CLDM sampling defaults with the published SAII-CLDM resampling setup.

Files changed (5)

README.md +279 -80
inference/eval_overthrust.py +33 -6
inference/{infer_LDDPM.py → infer.py} +55 -18
inference/util.py +106 -0
pipeline.py +331 -1

README.md CHANGED

@@ -1,113 +1,312 @@
----
-library_name: diffusers
-pipeline_tag: image-to-image
-tags:
-- seismic-inversion
-- impedance-inversion
-- diffusion
-- ddpm
-- overthrust
----
-# Seismic-LDDPM
-Seismic-LDDPM is a latent DDPM pipeline for seismic impedance inversion. The
-pipeline takes a low-frequency impedance image (`dipin`) and a synthetic seismic
-record (`record`) and predicts the impedance image.
-This repository includes:
-- Diffusers-format model components: `vq_model`, `unet`, `scheduler`, and
-  `condition_encoder`.
-- `SeismicImpInvLDDPMPipeline` in `pipeline.py`.
-- A complete Overthrust benchmark sample at `data/Overthrust_trueimp.mat`.
-- Inference scripts under `inference/`.
-## Installation
 ```bash
-git clone https://huggingface.co/mally-2000/seismic-lddpm
-cd seismic-lddpm
-pip install -r requirements.txt
 ```
-## Overthrust Evaluation
-The Overthrust evaluation script is intentionally fixed to the bundled
-`data/Overthrust_trueimp.mat`. It cuts the full model into six `256 x 256`
-patches, synthesizes the seismic records and low-frequency impedance inputs,
-runs inference, stitches the six predictions back together, and computes the
-metrics.
 ```bash
-python inference/eval_overthrust.py \
-  --model . \
-  --output outputs/overthrust \
-  --num-inference-steps 1000
 ```
-Outputs:
-- `outputs/overthrust/full_target.npy`
-- `outputs/overthrust/full_prediction.npy`
-- `outputs/overthrust/full_reconstruction.npy`
-- `outputs/overthrust/comparison_impedance.png`
-- `outputs/overthrust/metrics_summary.json`
-## Benchmark Result
-Evaluated locally on the bundled Overthrust benchmark with 1000 DDPM steps,
-`noise_snr=15`, `dipin_v=0.012`, `f0=30`, `phase=0`, `seed=1234`, and patch
-indices `[0, 1, 2, 3, 4, 5]`.
-| Space | PSNR | SSIM | PCC | RRE | NMSE |
-|---|---:|---:|---:|---:|---:|
-| Normalized | 30.7698 | 0.9339 | 0.9963 | 0.0435 | 0.001894 |
-| Impedance | 33.4413 | 0.9554 | 0.9957 | 0.0324 | 0.001050 |
-| VQ reconstruction | 37.7954 | 0.9677 | 0.9983 | 0.0209 | 0.000435 |
-![Overthrust evaluation](assets/demo.png)
-## Single-Sample Inference
-For a single default Overthrust patch:
 ```bash
-python inference/infer_LDDPM.py
 ```
-The script builds one Overthrust test sample internally, synthesizes the
-low-frequency impedance and seismic record, and saves `prediction.npy`,
-`target.npy`, and `comparison.png` under `outputs/infer_LDDPM`.
-## Python Usage
-```python
-import torch
-from pipeline import SeismicImpInvLDDPMPipeline
-pipe = SeismicImpInvLDDPMPipeline.from_pretrained(
-    "mally-2000/seismic-lddpm",
-    torch_dtype=torch.float32,
-    trust_remote_code=True,
-).to("cuda")
-result = pipe(
-    dipin=dipin,      # torch.Tensor, BCHW
-    record=record,    # torch.Tensor, BCHW
-    num_inference_steps=1000,
-    seed=1234,
-)
-prediction = result.impedance_samples
 ```
-## Notes
-- `inference/dataset.py` contains a lightweight `SeismicBase` and
-  `OverthrustTrueimpDataset`; it does not depend on the original training
-  repository's `ldm.data.seisimic`.
-- Synthetic record generation is seeded through the benchmark configuration so
-  the published Overthrust evaluation is reproducible.
-- The bundled Overthrust file is used only as a compact benchmark input for
-  reproducing this model's inference pipeline.

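+## Quick Start
+## Seismic-LDDPM Open-Source Inference
+The minimal inference entry points for the Hugging Face model repository are:
+- `inference/infer.py`: unified single-sample inference entry point; uses SAII-LDDPM by default, and SAII-CLDM when `CLDM` is passed.
+- `inference/eval_overthrust.py`: fixed Overthrust benchmark; it accepts no external data path, uses the bundled `data/Overthrust_trueimp.mat`, and runs inference on 6 patches, stitches them, computes metrics, and saves a comparison figure.
+Overthrust evaluation example: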
+```bash
+python inference/eval_overthrust.py \
+  --model mally-2000/seismic-lddpm \
+  --output outputs/overthrust \
+  --num-inference-steps 1000
+```
+Single-sample inference example:
 ```bash
+python inference/infer.py
 ```
+SAII-CLDM single-sample inference example:
+```bash
+python inference/infer.py CLDM
+```
+### Environment
 ```bash
+uv sync
+source .venv/bin/activate
 ```
+### Training
+The official training entry point is now unified as `scripts/train.py`. It remains driven by YAML configs but no longer relies on `pytorch_lightning.Trainer` as the main path:
+```bash
+uv run scripts/train.py \
+  --config-path configs/task/F02_diffusers.yaml \
+  --output-dir tmp/train_f02 \
+  --max-train-steps 10 \
+  data.params.batch_size=1 data.params.train.params.number=1
+```
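+Common arguments:
+- `--config-path`: the training config to use.
+- `--output-dir`: output directory for the training summary and `diffusers-export/`.
+- `--max-train-steps`: number of steps to run; `0` performs only a dry run / assembly check.
+- `--train-batch-size` / `--learning-rate`: override the YAML defaults.
+- Additional `key=value` arguments override the YAML as an OmegaConf dotlist.
+The `F02` compatibility wrapper script is still available: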
+```bash
+uv run scripts/train_diffusers_f02.py --max-train-steps 10 data.params.batch_size=1
+```
+Two-stage training can now be chained directly:
+```bash
+uv run scripts/train_two_stage_latent_diffusion.py \
+  --output-dir tmp/two_stage_train \
+  --stage1-max-train-steps 10 \
+  --stage2-max-train-steps 10 \
+  --stage1-set data.params.batch_size=1 \
+  --stage1-set data.params.train.params.number=1 \
+  --stage2-set data.params.batch_size=1 \
+  --stage2-set data.params.train.params.number=1
+```
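+This script:
+- first runs `configs/task/F01_diffusers.yaml`
+- then automatically injects `stage1_vq/diffusers-export/vq_model` into `model.params.official_vq_pretrained_dir` of `F02`
+- `configs/task/F02_diffusers.yaml` itself no longer contains hard-coded VQ / UNet / condition encoder asset paths
+- finally writes `two_stage_summary.json` to the output root directory
+`main.py` can still be used as a legacy entry point, but it is no longer the official training path.
+## Migration Status
+The authoritative record of the `2D -> diffusers` migration status in this repository is [docs/diffusers_status.md](/root/test/cldm2/docs/diffusers_status.md).
+This round also adds two practical usage guides: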
+- [docs/colab_quickstart.md](/root/test/cldm2/docs/colab_quickstart.md)
+- [docs/diffusers_cleanup_plan.md](/root/test/cldm2/docs/diffusers_cleanup_plan.md)
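+Summary:
+- The `2D` inference path has been migrated to `diffusers`
+- The `2D` training path has not been fully migrated yet
+- The `3D` entry points and related Python code have been cleaned up and are outside the current diffusers path
+- The old path is kept as a regression oracle, not as an official `2D` user entry point
+The current training-migration experiment configs are: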
+- [docs/diffusers_f02_training_mvp.md](/root/test/cldm2/docs/diffusers_f02_training_mvp.md)
+- `configs/task/F02_diffusers.yaml`
+- `configs/task/F01_diffusers.yaml`
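+Confirmed so far:
+- `scripts/train.py --config-path configs/task/F02_diffusers.yaml` works as the new unified training entry point
+- `scripts/train.py` automatically detects `F02 diffusers`, legacy `F01/VQ`, and the `F01_diffusers` variant based on the official `diffusers.VQModel`
+- Training ends by exporting `diffusers-export/`, which contains `vq_model/`, `unet/`, `scheduler/`, and `condition_encoder.pt`
+- `sample.py --export-dir ...` no longer requires `--project-config` in explicit dataset preset mode, which suits Colab better
+- `main.py` is kept as a legacy compatibility path and is no longer recommended for extension as the main training entry point
+## 2D Inference Path
+The default 2D inference entry point has moved to the diffusers path:
+- `sample.py`: unified 2D inference entry point; single-sample by default, with the `batch` subcommand for batch inference
+- `ldm/pipelines/seismic_inversion_pipeline.py`: the shared `SeismicInversionPipeline`
+This path uses:
+- the official `diffusers.VQModel`
+- the official `diffusers.UNet2DModel`
+- the official scheduler buffers / timesteps
+- a custom inversion loop that preserves `ddim_resample + DPS`
+The historical `3D` entry points have been deleted; the repository now keeps only the `2D` inference and training-migration paths.
+### Single-Sample Inference
+Uses `field_testdata` by default: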
 ```bash
+uv run sample.py \
+  --project-config /root/use_model_param/2025-04-18T15-59-17_A101/configs/2025-04-18T15-59-17-project.yaml \
+  --legacy-checkpoint /root/use_model_param/2025-04-18T15-59-17_A101/checkpoints/epoch=000211-step=000013991.ckpt \
+  --vq-dir /root/use_model_param/old_vqgan_diffusers_vqmodel \
+  --unet-dir tmp/a101_diffusers_unet \
+  --sample-index 0 \
+  --sampler-type ddim_resample \
+  --num-inference-steps 30
 ```
+Default outputs:
+- `output-dir/summary.json`
+- `output-dir/sample_000_overview.png`
+- `output-dir/sample_000_*.png`
+To disable image saving:
+```bash
+uv run sample.py --no-save-images
+```
+To display the saved overview interactively:
+```bash
+uv run sample.py --show
+```
+To prefer the self-contained model components exported by training:
+```bash
+uv run sample.py \
+  --export-dir debuglogs/<run>/diffusers-export \
+  --sample-index 0 \
+  --dataset-name field_testdata \
+  --sampler-type ddim \
+  --num-inference-steps 30
+```
+Notes:
+- `vq_model` / `unet` / `scheduler` / `condition_encoder` are loaded from `--export-dir`
+- For explicit dataset presets such as `field_testdata`, `feild_traindata`, `Marmousi*`, and `Overthrust`, `project-config` is no longer required
+- `--dataset-name config_train` still requires `project-config`
+### Batch Inference
+Uses the `feild_traindata` compatibility preset by default:
+```bash
+uv run sample.py batch \
+  --project-config /root/use_model_param/2025-04-18T15-59-17_A101/configs/2025-04-18T15-59-17-project.yaml \
+  --legacy-checkpoint /root/use_model_param/2025-04-18T15-59-17_A101/checkpoints/epoch=000211-step=000013991.ckpt \
+  --vq-dir /root/use_model_param/old_vqgan_diffusers_vqmodel \
+  --unet-dir tmp/a101_diffusers_unet \
+  --max-samples 8
+```
+Default outputs:
+- `output-dir/x_samples_2d.npy`
+- `output-dir/x_true_2d.npy`
+- `output-dir/summary.json`
+To save per-sample images on demand:
+```bash
+uv run sample.py batch --save-images
+```
+## Colab
+The most reliable Colab route at present is:
+1. Clone the repository
+2. `pip install -r requirements-colab.txt`
+3. `pip install -e ./src/taming-transformers`
+4. Run `sample.py` directly against `diffusers-export/`
+Minimal single-sample command:
+```bash
+python sample.py \
+  --export-dir /content/drive/MyDrive/SAII-CLDM/two_stage_run/stage2_f02/diffusers-export \
+  --dataset-name Marmousi3 \
+  --dataset-dt-path /content/drive/MyDrive/SAII-CLDM/data/dtA89-1.npz \
+  --sample-index 0 \
+  --sampler-type ddim \
+  --num-inference-steps 30 \
+  --output-dir /content/drive/MyDrive/SAII-CLDM/colab_outputs/sample_single \
+  --device cuda
+```
+See [docs/colab_quickstart.md](/root/test/cldm2/docs/colab_quickstart.md) for the full steps.
+### Common Inference Arguments
+- `--dataset-name`: `config_train`, `feild_traindata`, `field_testdata`, `Marmousi*`, `Overthrust`
+- `--dataset-interval`
+- `--img-size`
+- `--f0`
+- `--f0-phase`
+- `--dipin-v`
+- `--noise-snr`
+- `--zhengyan-type`
+- `--noise-type`
+- `--sampler-type`: `ddim_resample`, `ddim`, `ddpm`
+- `--num-inference-steps`
+- `--eta`
+- `--use-dps` / `--no-use-dps`
+- `--dps-scale`
+- `--resample-interval`
+- `--sigma-a`
+- `--pixel-max-iters`
+- `--last-pixel-max-iters`
+- `--device`
+- `--seed`
+- `--export-dir`
+- `batch --start-index`
+- `batch --max-samples`
+## Inference Flow
+The current 2D inference flow of `sample.py` is:
+```mermaid
+sequenceDiagram
+    participant Entry as sample.py
+    participant Runner as diffusers_inference_runner
+    participant Dataset as Dataset preset
+    participant Operator as forward (zhengyan) operator
+    participant Pipe as SeismicInversionPipeline
+    participant UNet as diffusers.UNet2DModel
+    participant VQ as diffusers.VQModel
+    Entry->>Runner: Parse CLI
+    Runner->>Runner: Load project config / cond encoder / scheduler
+    Runner->>Dataset: Build 2D dataset and draw a sample
+    Runner->>Operator: Build forward operator
+    Runner->>Pipe: Call pipeline(image, dipin, record, measurement)
+    activate Pipe
+    Pipe->>VQ: Encode dipin / image
+    Pipe->>UNet: Predict noise step by step
+    alt sampler_type = ddim_resample
+        Pipe->>Pipe: DPS + resample + optional pixel optimization
+    else sampler_type = ddim / ddpm
+        Pipe->>Pipe: Standard reverse diffusion update
+    end
+    Pipe->>VQ: Decode final latent
+    Pipe-->>Runner: prediction / latents / measurement error
+    deactivate Pipe
+    Runner->>Runner: Write summary / npy / visualizations
 ```
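+## Regression and Comparison
+The old sampling implementation is not kept as a user-facing entry point, but it still serves as a regression oracle: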
+```bash
+uv run scripts/compare_legacy_and_diffusers_inversion.py \
+  --dataset-name config_train \
+  --sample-index 0 \
+  --sampler-type ddim_resample \
+  --num-inference-steps 10
+```
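+This script produces all of the following:
+- `legacy/` visualizations
+- `official/` visualizations
+- `summary.json`
+Current repository conventions:
+- the `legacy-like` / legacy path is for validation only
+- the diffusers pipeline path is the official 2D inference path
+For more background, see: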
+- `docs/diffusers_status.md`
+- `docs/official_sampling_comparison.md`
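+## Code Structure
+- `ldm/pipelines`: diffusers pipelines and visualization utilities
+- `ldm/data`: seismic datasets
+- `ldm/ldm_inverse`: forward modeling and measurement logic
+- `scripts`: conversion, validation, comparison, and inference runners
+- `tests`: pipeline and CLI smoke tests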

inference/eval_overthrust.py CHANGED

@@ -17,7 +17,8 @@ if str(REPO_ROOT) not in sys.path:
     sys.path.insert(0, str(REPO_ROOT))
 from inference.dataset import OverthrustTrueimpDataset
-from pipeline import SeismicImpInvLDDPMPipeline
 OVERTHRUST_CONFIG = {
@@ -89,10 +90,17 @@ def save_comparison(
 def evaluate_overthrust(
     pipe: SeismicImpInvLDDPMPipeline,
     output_dir: str | Path = "outputs/overthrust",
-    num_inference_steps: int = 1000,
     device: str | torch.device | None = None,
 ) -> dict[str, object]:
     output_dir = Path(output_dir)
     output_dir.mkdir(parents=True, exist_ok=True)
     device = torch.device(device or ("cuda" if torch.cuda.is_available() else "cpu"))
@@ -132,12 +140,24 @@ def evaluate_overthrust(
         dipin = batch["dipin"].to(device)
         record = batch["record"].to(device)
         image = batch["image"].to(device)
         output = pipe(
             dipin=dipin,
             record=record,
             image=image,
             num_inference_steps=num_inference_steps,
             seeds=seeds,
         )
         prediction = output.impedance_samples
         reconstruction = output.impedance_reconstructed
@@ -161,7 +181,11 @@ def evaluate_overthrust(
     full_reconstruction_impedance = dataset.fan(full_reconstruction)
     metrics_summary = {
-        "config": {**OVERTHRUST_CONFIG, "num_inference_steps": num_inference_steps},
         "normalized": compute_metrics(full_prediction, full_target),
         "impedance": compute_metrics(full_prediction_impedance, full_target_impedance),
         "encode_impedance": compute_metrics(
@@ -188,23 +212,26 @@ def evaluate_overthrust(
 def parse_args() -> argparse.Namespace:
-    parser = argparse.ArgumentParser(description="Evaluate SAII-LDDPM on Overthrust.")
     parser.add_argument("--model", default="mally-2000/seismic-lddpm")
     parser.add_argument("--output", default="outputs/overthrust")
     parser.add_argument("--device", default=None)
-    parser.add_argument("--num-inference-steps", type=int, default=1000)
     return parser.parse_args()
 def main() -> None:
     args = parse_args()
-    pipe = SeismicImpInvLDDPMPipeline.from_pretrained(
         args.model,
         torch_dtype=torch.float32,
         trust_remote_code=True,
     )
     result = evaluate_overthrust(
         pipe,
         output_dir=args.output,
         num_inference_steps=args.num_inference_steps,
         device=args.device,

     sys.path.insert(0, str(REPO_ROOT))
 from inference.dataset import OverthrustTrueimpDataset
+from inference.util import OverthrustForwardOperator
+from pipeline import SeismicImpInvCLDMPipeline, SeismicImpInvLDDPMPipeline
 OVERTHRUST_CONFIG = {
 def evaluate_overthrust(
     pipe: SeismicImpInvLDDPMPipeline,
+    method: str = "LDDPM",
     output_dir: str | Path = "outputs/overthrust",
+    num_inference_steps: int | None = None,
     device: str | torch.device | None = None,
 ) -> dict[str, object]:
+    method = method.upper()
+    if method not in {"LDDPM", "CLDM"}:
+        raise ValueError("method must be LDDPM or CLDM")
+    if num_inference_steps is None:
+        num_inference_steps = 30 if method == "CLDM" else 1000
     output_dir = Path(output_dir)
     output_dir.mkdir(parents=True, exist_ok=True)
     device = torch.device(device or ("cuda" if torch.cuda.is_available() else "cpu"))
         dipin = batch["dipin"].to(device)
         record = batch["record"].to(device)
         image = batch["image"].to(device)
+        extra_kwargs = {}
+        if method == "CLDM":
+            f0 = int(batch["rick_v"][0].item())
+            f0_phase = int(batch["rick_phase"][0].item())
+            extra_kwargs = {
+                "measurement": record,
+                "operator": OverthrustForwardOperator(
+                    wavelet=dataset.wavelets[f0][f0_phase],
+                    device=device,
+                ),
+            }
         output = pipe(
             dipin=dipin,
             record=record,
             image=image,
             num_inference_steps=num_inference_steps,
             seeds=seeds,
+            **extra_kwargs,
         )
         prediction = output.impedance_samples
         reconstruction = output.impedance_reconstructed
     full_reconstruction_impedance = dataset.fan(full_reconstruction)
     metrics_summary = {
+        "config": {
+            **OVERTHRUST_CONFIG,
+            "method": method,
+            "num_inference_steps": num_inference_steps,
+        },
         "normalized": compute_metrics(full_prediction, full_target),
         "impedance": compute_metrics(full_prediction_impedance, full_target_impedance),
         "encode_impedance": compute_metrics(
 def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description="Evaluate SAII-LDDPM/CLDM on Overthrust.")
+    parser.add_argument("method", nargs="?", choices=["LDDPM", "CLDM"], default="LDDPM")
     parser.add_argument("--model", default="mally-2000/seismic-lddpm")
     parser.add_argument("--output", default="outputs/overthrust")
     parser.add_argument("--device", default=None)
+    parser.add_argument("--num-inference-steps", type=int, default=None)
     return parser.parse_args()
 def main() -> None:
     args = parse_args()
+    pipe_cls = SeismicImpInvCLDMPipeline if args.method == "CLDM" else SeismicImpInvLDDPMPipeline
+    pipe = pipe_cls.from_pretrained(
         args.model,
         torch_dtype=torch.float32,
         trust_remote_code=True,
     )
     result = evaluate_overthrust(
         pipe,
+        method=args.method,
         output_dir=args.output,
         num_inference_steps=args.num_inference_steps,
         device=args.device,

inference/{infer_LDDPM.py → infer.py} RENAMED

@@ -11,15 +11,15 @@ REPO_ROOT = Path(__file__).resolve().parents[1]
 if str(REPO_ROOT) not in sys.path:
     sys.path.insert(0, str(REPO_ROOT))
-from inference.dataset import OverthrustTrueimpDataset
-from pipeline import SeismicImpInvLDDPMPipeline
-MODEL_ID = "mally-2000/seismic-lddpm"
-OUT_DIR = REPO_ROOT / "outputs" / "infer_LDDPM"
-NUM_INFERENCE_STEPS = 1000
 PATCH_INDEX = 0
 def save_comparison(dipin, record, target, prediction, output_path):
     fig, axes = plt.subplots(1, 4, figsize=(16, 4))
@@ -37,21 +37,16 @@ def save_comparison(dipin, record, target, prediction, output_path):
     fig.savefig(output_path, dpi=150)
     plt.close(fig)
 if __name__ == "__main__":
     OUT_DIR.mkdir(parents=True, exist_ok=True)
     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
     print(f"Using device: {device}")
-    pipe = SeismicImpInvLDDPMPipeline.from_pretrained(
-        MODEL_ID,
-        torch_dtype=torch.float32,
-        trust_remote_code=True,
-    ).to(device)
-    print(f"UNet device: {pipe.unet.device}")
-    # One default Overthrust patch. Dataset defaults define the LDDPM test setup:
-    # nonlinear forward model, 30 Hz Ricker wavelet, 15 dB noise, and dipin=0.012.
     dataset = OverthrustTrueimpDataset(
         patch_indices=[PATCH_INDEX],
         data_dir=REPO_ROOT / "data",
@@ -61,13 +56,51 @@ if __name__ == "__main__":
     dipin = sample["dipin"].unsqueeze(0).to(device)
     record = sample["record"].unsqueeze(0).to(device)
     image = sample["image"].unsqueeze(0).to(device)
     output = pipe(
         dipin=dipin,
         record=record,
         image=image,
-        num_inference_steps=NUM_INFERENCE_STEPS,
-        seeds=[int(sample["seed"])],
     )
     prediction = output.impedance_samples[0, 0].detach().cpu().numpy()
@@ -83,3 +116,7 @@ if __name__ == "__main__":
     print(f"Saved: {OUT_DIR / 'target.npy'}")
     print(f"Saved: {OUT_DIR / 'comparison.png'}")

 if str(REPO_ROOT) not in sys.path:
     sys.path.insert(0, str(REPO_ROOT))
+from inference.dataset import OverthrustTrueimpDataset, SeismicBase
+from inference.util import OverthrustForwardOperator, ricker_wavelet
+from pipeline import SeismicImpInvCLDMPipeline, SeismicImpInvLDDPMPipeline
+METHOD = sys.argv[1].upper() if len(sys.argv) > 1 else "LDDPM"
+OUT_DIR = REPO_ROOT / "outputs" / f"infer_{METHOD}"
 PATCH_INDEX = 0
+RUN_EVAL = True
 def save_comparison(dipin, record, target, prediction, output_path):
     fig, axes = plt.subplots(1, 4, figsize=(16, 4))
     fig.savefig(output_path, dpi=150)
     plt.close(fig)
 if __name__ == "__main__":
+    if METHOD not in {"LDDPM", "CLDM"}:
+        raise ValueError("METHOD must be LDDPM or CLDM. Example: python inference/infer.py CLDM")
     OUT_DIR.mkdir(parents=True, exist_ok=True)
     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
     print(f"Using device: {device}")
+    print(f"Method: {METHOD}")
     dataset = OverthrustTrueimpDataset(
         patch_indices=[PATCH_INDEX],
         data_dir=REPO_ROOT / "data",
     dipin = sample["dipin"].unsqueeze(0).to(device)
     record = sample["record"].unsqueeze(0).to(device)
     image = sample["image"].unsqueeze(0).to(device)
+    seed = int(sample["seed"])
+    if METHOD == "LDDPM":
+        num_inference_steps = 1000
+        extra_kwargs = {}
+        pipe = SeismicImpInvLDDPMPipeline.from_pretrained(
+            "mally-2000/seismic-lddpm",
+            torch_dtype=torch.float32,
+            trust_remote_code=True,
+        ).to(device)
+    else:
+        pipe = SeismicImpInvCLDMPipeline.from_pretrained(
+            "mally-2000/seismic-lddpm",
+            torch_dtype=torch.float32,
+            trust_remote_code=True,
+        ).to(device)
+        num_inference_steps = 30
+        f0 = int(sample["rick_v"].item())
+        f0_phase = int(sample["rick_phase"].item())
+        # NOTE: The forward operator's wavelet must match the dataset's wavelet
+        # to ensure consistency between simulated measurements and actual data.
+        # The parameters (f0=30Hz, dt=0.002s) must match the values used in
+        # OverthrustTrueimpDataset._build_wavelets() to generate the seismic records.
+        wavelet = ricker_wavelet(f0=f0, nt=256 // 2, dt=0.002)
+        # Apply phase shift to match the dataset's wavelet phase
+        wavelet = SeismicBase.phaseshift(wavelet, f0_phase)
+        operator = OverthrustForwardOperator(
+            wavelet=wavelet,
+            device=device,
+        )
+        extra_kwargs = dict(
+            measurement=record,
+            operator=operator,
+        )
     output = pipe(
         dipin=dipin,
         record=record,
         image=image,
+        num_inference_steps=num_inference_steps,
+        seeds=[seed],
+        **extra_kwargs,
     )
     prediction = output.impedance_samples[0, 0].detach().cpu().numpy()
     print(f"Saved: {OUT_DIR / 'target.npy'}")
     print(f"Saved: {OUT_DIR / 'comparison.png'}")
+    if RUN_EVAL:
+        from inference.eval_overthrust import evaluate_overthrust
+        evaluate_overthrust(pipe, method=METHOD, output_dir=OUT_DIR / "eval")

inference/util.py ADDED

@@ -0,0 +1,106 @@

+from __future__ import annotations
+import numpy as np
+import torch
+def ricker_wavelet(f0: float, nt: int, dt: float) -> np.ndarray:
+    """Ricker (Mexican hat) wavelet - pure NumPy implementation.
+    Replaces pylops.utils.wavelets.ricker with identical output.
+    Creates a Ricker wavelet given time axis parameters and central frequency.
+    Args:
+        f0: Central frequency in Hz
+        nt: Number of time samples (positive part including zero)
+        dt: Time sampling interval in seconds
+    Returns:
+        Wavelet array with symmetric time axis
+    """
+    # Construct positive time axis (including zero)
+    t_positive = np.arange(nt) * dt
+    # _tcrop: if even length, remove last sample to ensure odd length
+    if len(t_positive) % 2 == 0:
+        t_positive = t_positive[:-1]
+    # Construct symmetric time axis (negative + positive)
+    t = np.concatenate((np.flipud(-t_positive[1:]), t_positive), axis=0)
+    # Ricker wavelet formula
+    w = (1 - 2 * (np.pi * f0 * t) ** 2) * np.exp(-((np.pi * f0 * t) ** 2))
+    return w
+def build_convmtx(wavelet: np.ndarray, size: int) -> np.ndarray:
+    """Build convolution matrix (Toeplitz matrix) - pure NumPy implementation.
+    Replaces pylops.utils.signalprocessing.convmtx with identical output.
+    Args:
+        wavelet: 1D wavelet array
+        size: Output matrix size (size x size)
+    Returns:
+        Convolution matrix of shape (size, size)
+    """
+    wlen = len(wavelet)
+    offset = wlen // 2
+    matrix = np.zeros((size, size), dtype=wavelet.dtype)
+    for i in range(size):
+        for j, w_val in enumerate(wavelet):
+            col_idx = i - offset + j
+            if 0 <= col_idx < size:
+                matrix[i, col_idx] = w_val
+    return matrix
+class OverthrustForwardOperator:
+    """Differentiable seismic forward model matching OverthrustTrueimpDataset."""
+    def __init__(
+        self,
+        *,
+        wavelet: np.ndarray,
+        size: int = 256,
+        normal_min: float = 5.0931,
+        normal_max: float = 6.501110975896774,
+        record_scale: float = 0.3215932963300079,
+        normalize: str = "minmax",
+        device: torch.device | None = None,
+        dtype: torch.dtype = torch.float32,
+    ):
+        device = device or torch.device("cuda" if torch.cuda.is_available() else "cpu")
+        wavelet_matrix = build_convmtx(wavelet, size)
+        s1 = np.eye(size, k=1) - np.eye(size, k=0)
+        s2 = np.eye(size, k=1) + np.eye(size, k=0)
+        s1[-1] = 0
+        s2[-1] = 0
+        self.wavelet_matrix = torch.as_tensor(wavelet_matrix, device=device, dtype=dtype)
+        self.s1 = torch.as_tensor(s1, device=device, dtype=dtype)
+        self.s2 = torch.as_tensor(s2, device=device, dtype=dtype)
+        self.normal_min = float(normal_min)
+        self.normal_max = float(normal_max)
+        self.record_scale = float(record_scale)
+        self.normalize = normalize
+    def _inv_normal(self, image: torch.Tensor) -> torch.Tensor:
+        if self.normalize == "minmax":
+            return image * (self.normal_max - self.normal_min) + self.normal_min
+        if self.normalize == "max":
+            return image * self.normal_max
+        raise ValueError(f"Unsupported normalize: {self.normalize}")
+    def __call__(self, image: torch.Tensor) -> torch.Tensor:
+        impedance = torch.exp(self._inv_normal(image))
+        numerator = torch.matmul(self.s1.to(dtype=image.dtype), impedance)
+        denominator = torch.matmul(self.s2.to(dtype=image.dtype), impedance)
+        reflectivity = numerator / torch.clamp(denominator, min=1e-6)
+        record = torch.matmul(self.wavelet_matrix.to(dtype=image.dtype), reflectivity)
+        return record / self.record_scale
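
The forward model in `OverthrustForwardOperator.__call__` can be sanity-checked numerically with a NumPy mirror. This is a sketch, not the operator itself: it uses NumPy instead of torch (so it is not differentiable), covers only `normalize="minmax"`, and the names `conv_matrix` and `forward_record` are hypothetical. `conv_matrix` is a vectorized form of the double loop in `build_convmtx`.

```python
import numpy as np


def conv_matrix(wavelet: np.ndarray, size: int) -> np.ndarray:
    """Vectorized form of the build_convmtx loop: matrix[i, i - offset + j] = wavelet[j]."""
    offset = len(wavelet) // 2
    rows = np.arange(size)[:, None]                          # (size, 1)
    cols = rows - offset + np.arange(len(wavelet))[None, :]  # (size, wlen)
    valid = (cols >= 0) & (cols < size)                      # drop taps outside the matrix
    matrix = np.zeros((size, size), dtype=wavelet.dtype)
    row_idx = np.broadcast_to(rows, cols.shape)[valid]
    taps = np.broadcast_to(wavelet, cols.shape)[valid]
    matrix[row_idx, cols[valid]] = taps
    return matrix


def forward_record(image, wavelet, normal_min, normal_max, record_scale):
    """NumPy mirror of the operator: log-impedance -> reflectivity -> record."""
    size = image.shape[-2]
    impedance = np.exp(image * (normal_max - normal_min) + normal_min)
    s1 = np.eye(size, k=1) - np.eye(size)  # z[i+1] - z[i]
    s2 = np.eye(size, k=1) + np.eye(size)  # z[i+1] + z[i]
    s1[-1] = 0
    s2[-1] = 0
    reflectivity = (s1 @ impedance) / np.clip(s2 @ impedance, 1e-6, None)
    return conv_matrix(wavelet, size) @ reflectivity / record_scale


# Two-layer model: normalized log-impedance jumps from 0 to 1 at depth index 4.
image = np.zeros((8, 3))
image[4:] = 1.0
spike = np.array([1.0])  # unit spike, so the record equals the reflectivity
record = forward_record(
    image, spike, normal_min=5.0931, normal_max=6.501110975896774, record_scale=1.0
)
print(record[:, 0])
```

For this two-layer input the only non-zero sample sits at depth index 3, with value `tanh((normal_max - normal_min) / 2)`, because the reflection coefficient of two log-impedances `a` and `b` is `(e^b - e^a) / (e^b + e^a)`. A constant image yields an all-zero record.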

pipeline.py CHANGED

@@ -1,10 +1,11 @@
 from __future__ import annotations
 from dataclasses import dataclass
 import numpy as np
 import torch
-from diffusers import DDPMScheduler, DiffusionPipeline, UNet2DModel, VQModel
 from diffusers.utils import BaseOutput
@@ -232,3 +233,332 @@ class SeismicImpInvLDDPMPipeline(DiffusionPipeline):
         if output_type == "np":
             return reconstruction.detach().cpu().numpy()
         return reconstruction

 from __future__ import annotations
 from dataclasses import dataclass
+from typing import Any, Callable
 import numpy as np
 import torch
+from diffusers import DDIMScheduler, DDPMScheduler, DiffusionPipeline, UNet2DModel, VQModel
 from diffusers.utils import BaseOutput
         if output_type == "np":
             return reconstruction.detach().cpu().numpy()
         return reconstruction
+class SeismicImpInvCLDMPipeline(SeismicImpInvLDDPMPipeline):
+    """SAII-CLDM inference pipeline.
+    This reuses the same trained components as SAII-LDDPM and replaces only the
+    reverse sampling procedure with DDIM plus model-driven resampling.
+    """
+    @staticmethod
+    def _get_operator_fn(operator: Any) -> Callable[[torch.Tensor], torch.Tensor]:
+        if callable(operator):
+            return operator
+        if hasattr(operator, "forward") and callable(operator.forward):
+            return operator.forward
+        raise TypeError("`operator` must be callable or expose a callable `forward` method.")
+    @staticmethod
+    def _build_ddim_scheduler(
+        scheduler: DDPMScheduler,
+        num_inference_steps: int,
+        device: torch.device,
+    ) -> DDIMScheduler:
+        ddim_scheduler = DDIMScheduler.from_config(
+            scheduler.config,
+            clip_sample=False,
+            set_alpha_to_one=False,
+            steps_offset=1,
+            timestep_spacing="leading",
+        )
+        ddim_scheduler.set_timesteps(num_inference_steps, device=device)
+        return ddim_scheduler
+    @staticmethod
+    def _default_pixel_optimization_param() -> dict[str, float | int]:
+        return {
+            "eps": 1e-4,
+            "max_iters": 100,
+            "lr": 1e-5,
+            "y_coef": 1.0,
+            "x_coef": 0.0,
+            "tv_coef": 0.0,
+            "dh_coef": 1.0,
+            "dw_coef": 1.5,
+        }
+    @staticmethod
+    def _default_last_pixel_optimization_param() -> dict[str, float | int]:
+        return {
+            "eps": 1e-4,
+            "max_iters": 1,
+            "lr": 1e-4,
+            "y_coef": 1.0,
+            "x_coef": 0.1,
+            "tv_coef": 0.0,
+            "dh_coef": 1.0,
+            "dw_coef": 1.5,
+        }
+    @staticmethod
+    def _tv_loss(x: torch.Tensor, *, dh_coef: float, dw_coef: float) -> torch.Tensor:
+        dh = dh_coef * torch.abs(x[..., :, 1:] - x[..., :, :-1])
+        dw = dw_coef * torch.abs(x[..., 1:, :] - x[..., :-1, :])
+        return torch.mean(dh[..., :-1, :] + dw[..., :, :-1])
+    def _ddim_step(
+        self,
+        latents: torch.Tensor,
+        conditioning: torch.Tensor,
+        timestep: int,
+        scheduler: DDIMScheduler,
+        eta: float,
+        generator: torch.Generator | list[torch.Generator] | None,
+        quantize_denoised: bool,
+    ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, dict[str, torch.Tensor]]:
+        model_input = torch.cat(
+            [
+                scheduler.scale_model_input(latents, timestep),
+                conditioning.to(dtype=latents.dtype),
+            ],
+            dim=1,
+        )
+        timestep_tensor = torch.full(
+            (latents.shape[0],), timestep, device=latents.device, dtype=torch.long
+        )
+        noise_pred = self.unet(model_input, timestep_tensor).sample
+        alpha_t = scheduler.alphas_cumprod[timestep].to(
+            device=latents.device, dtype=latents.dtype
+        )
+        prev_timestep = timestep - (
+            scheduler.config.num_train_timesteps // scheduler.num_inference_steps
+        )
+        if prev_timestep >= 0:
+            alpha_prev = scheduler.alphas_cumprod[prev_timestep].to(
+                device=latents.device, dtype=latents.dtype
+            )
+        else:
+            alpha_prev = scheduler.final_alpha_cumprod.to(
+                device=latents.device, dtype=latents.dtype
+            )
+        beta_t = 1.0 - alpha_t
+        pred_x0 = (latents - beta_t.sqrt() * noise_pred) / alpha_t.sqrt()
+        pseudo_x0 = (latents - beta_t * noise_pred) / alpha_t.sqrt()
+        if quantize_denoised:
+            pred_x0 = self.vq_model.quantize(pred_x0.to(dtype=self.vq_model.dtype))[0].to(
+                dtype=latents.dtype
+            )
+            noise_pred = (latents - alpha_t.sqrt() * pred_x0) / beta_t.sqrt()
+        variance = scheduler._get_variance(timestep, prev_timestep).to(
+            device=latents.device, dtype=latents.dtype
+        )
+        sigma_t = eta * variance.sqrt()
+        direction = torch.clamp(1.0 - alpha_prev - sigma_t**2, min=0.0).sqrt() * noise_pred
+        noise = torch.zeros_like(latents)
+        if eta > 0:
+            noise = sigma_t * self._randn_like_sample(latents, generator)
+        prev_sample = alpha_prev.sqrt() * pred_x0 + direction + noise
+        batch_shape = (latents.shape[0], 1, 1, 1)
+        return (
+            prev_sample,
+            pred_x0,
+            pseudo_x0,
+            {
+                "a_t": torch.full(
+                    batch_shape,
+                    float(alpha_t.item()),
+                    device=latents.device,
+                    dtype=latents.dtype,
+                ),
+                "a_prev": torch.full(
+                    batch_shape,
+                    float(alpha_prev.item()),
+                    device=latents.device,
+                    dtype=latents.dtype,
+                ),
+            },
+        )
+    def _optimize_pixels(
+        self,
+        x_prime: torch.Tensor,
+        measurement: torch.Tensor,
+        operator_fn: Callable[[torch.Tensor], torch.Tensor],
+        params: dict[str, Any],
+    ) -> torch.Tensor:
+        """Refine a decoded estimate so the forward operator reproduces `measurement`.
+
+        Minimizes `y_coef * MSE(measurement, operator_fn(x)) + x_coef * MSE(x_prime, x)`
+        plus an optional `tv_coef`-weighted TV term with AdamW, stopping after
+        `max_iters` steps or once the total loss drops below `eps`. Keys missing
+        from `params` fall back to `_default_pixel_optimization_param()`.
+        """
+        merged = {**self._default_pixel_optimization_param(), **params}
+        if int(merged["max_iters"]) <= 0:
+            return x_prime.detach()
+        loss_fn = torch.nn.MSELoss(reduction="mean")
+        opt_var = x_prime.detach().clone().requires_grad_(True)
+        opt_init = x_prime.detach().clone()
+        optimizer = torch.optim.AdamW([opt_var], lr=float(merged["lr"]))
+        for _ in range(int(merged["max_iters"])):
+            optimizer.zero_grad(set_to_none=True)
+            # Data fidelity plus a proximity term that keeps the solution near x_prime.
+            loss = (
+                loss_fn(measurement, operator_fn(opt_var)) * float(merged["y_coef"])
+                + loss_fn(opt_init, opt_var) * float(merged["x_coef"])
+            )
+            if float(merged["tv_coef"]) != 0.0:
+                loss = loss + float(merged["tv_coef"]) * self._tv_loss(
+                    opt_var,
+                    dh_coef=float(merged["dh_coef"]),
+                    dw_coef=float(merged["dw_coef"]),
+                )
+            loss.backward()
+            optimizer.step()
+            # Early exit once the total objective (not only the data term) is below eps.
+            if loss.item() < float(merged["eps"]):
+                break
+        return opt_var.detach()
+    def _stochastic_resample(
+        self,
+        pseudo_x0: torch.Tensor,
+        x_t: torch.Tensor,
+        a_t: torch.Tensor,
+        sigma: torch.Tensor,
+        generator: torch.Generator | list[torch.Generator] | None,
+    ) -> torch.Tensor:
+        """Sample latents that blend the optimized estimate with the DDIM sample.
+
+        Draws from a Gaussian with mean
+        `(sigma * sqrt(a_t) * pseudo_x0 + (1 - a_t) * x_t) / (sigma + 1 - a_t)`
+        and variance `sigma * (1 - a_t) / (sigma + 1 - a_t)`, where `a_t` is the
+        cumulative alpha at the timestep of `x_t`.
+        """
+        sigma = torch.clamp(sigma, min=1e-12)
+        one_minus_a_t = torch.clamp(1.0 - a_t, min=1e-12)
+        noise = self._randn_like_sample(pseudo_x0, generator)
+        return (
+            (sigma * a_t.sqrt() * pseudo_x0 + one_minus_a_t * x_t)
+            / (sigma + one_minus_a_t)
+            + noise * torch.sqrt(1.0 / (1.0 / sigma + 1.0 / one_minus_a_t))
+        )
+    def __call__(
+        self,
+        dipin: torch.Tensor,
+        record: torch.Tensor,
+        measurement: torch.Tensor | None = None,
+        operator: Any | None = None,
+        image: torch.Tensor | None = None,
+        num_inference_steps: int = 30,
+        seed: int | None = None,
+        seeds: list[int] | tuple[int, ...] | torch.Tensor | None = None,
+        generator: torch.Generator | list[torch.Generator] | None = None,
+        eta: float = 0.01,
+        interval: int = 6,
+        sigma_a: float = 20.0,
+        pixel_optimization_param: dict[str, Any] | None = None,
+        last_pixel_optimization_param: dict[str, Any] | None = None,
+        quantize_denoised: bool = False,
+        output_type: str = "tensor",
+    ) -> SeismicImpInvLDDPMPipelineOutput:
+        """Run SAII-CLDM sampling: DDIM steps interleaved with data-consistency resampling.
+
+        Args:
+            dipin: Low-frequency impedance input, batched along dim 0.
+            record: Seismic record used as conditioning.
+            measurement: Data matched by the pixel optimization; defaults to `record`.
+            operator: Forward operator (required), resolved via `_get_operator_fn`.
+            image: Optional reference image; if given, its VQ reconstruction is returned.
+            num_inference_steps: Number of DDIM steps.
+            seed, seeds, generator: Randomness sources, checked in the order
+                `seeds` (one per batch item), `seed`, `generator`.
+            eta: DDIM stochasticity factor.
+            interval: Resample when the countdown step index is a multiple of this.
+            sigma_a: Scale of the resampling variance.
+            pixel_optimization_param: Overrides for intermediate pixel optimization.
+            last_pixel_optimization_param: Settings for the final step (index 0).
+            quantize_denoised: If True, vector-quantize the predicted clean latent.
+            output_type: "np" returns NumPy arrays; any other value returns tensors.
+        """
+        if measurement is None:
+            measurement = record
+        if operator is None:
+            raise ValueError("SAII-CLDM requires a forward `operator`.")
+        if interval <= 0:
+            raise ValueError("`interval` must be a positive integer.")
+        device = self.unet.device
+        if seeds is not None:
+            if isinstance(seeds, torch.Tensor):
+                seeds = seeds.detach().cpu().tolist()
+            seeds = [int(value) for value in seeds]
+            if len(seeds) != dipin.shape[0]:
+                raise ValueError(f"Expected {dipin.shape[0]} seeds, got {len(seeds)}")
+            generator = [
+                torch.Generator(device=device).manual_seed(value) for value in seeds
+            ]
+        elif seed is not None:
+            generator = torch.Generator(device=device).manual_seed(seed)
+        elif generator is None:
+            # Unseeded fallback; a new torch.Generator starts from PyTorch's default seed.
+            generator = torch.Generator(device=device)
+        with torch.no_grad():
+            dipin = dipin.to(device=device, dtype=self.vq_model.dtype)
+            record = record.to(device=device, dtype=self.unet.dtype)
+            measurement = measurement.to(device=device, dtype=self.unet.dtype)
+            impedance_dipin, record_features = self._encode_conditioning(dipin, record)
+            conditioning = torch.cat([impedance_dipin, record_features], dim=1)
+            impedance_latents = self._randn_like_sample(
+                torch.empty(
+                    impedance_dipin.shape,
+                    device=device,
+                    dtype=self.unet.dtype,
+                ),
+                generator,
+            )
+        operator_fn = self._get_operator_fn(operator)
+        pixel_params = pixel_optimization_param or {}
+        last_pixel_params = (
+            last_pixel_optimization_param
+            or self._default_last_pixel_optimization_param()
+        )
+        schedule = self._build_ddim_scheduler(self.scheduler, num_inference_steps, device)
+        time_range = [int(timestep) for timestep in schedule.timesteps.tolist()]
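+        # `index` counts down from len(time_range) - 1 to 0, so indices below
+        # `resample_start_index` are the final quarter of the sampling steps.
+        resample_start_index = len(time_range) // 4
+        for step_idx, timestep in enumerate(time_range):
+            index = len(time_range) - step_idx - 1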
+            with torch.no_grad():
+                next_latents, pred_x0, pseudo_x0, step_stats = self._ddim_step(
+                    impedance_latents,
+                    conditioning,
+                    timestep,
+                    schedule,
+                    eta,
+                    generator,
+                    quantize_denoised,
+                )
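+            # Enforce data consistency every `interval` indices outside the final
+            # quarter of the steps, and always on the last step (index 0).
+            if (index >= resample_start_index or index == 0) and (
+                index % interval == 0 or index == 0
+            ):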
+                x_t_reference = next_latents.detach().clone()
+                # Resampling variance:
+                # sigma_a * (1 - a_prev) / (1 - a_t) * (1 - a_t / a_prev).
+                sigma = sigma_a * (1.0 - step_stats["a_prev"]) / (
+                    1.0 - step_stats["a_t"]
+                )
+                sigma = sigma * (1.0 - step_stats["a_t"] / step_stats["a_prev"])
+                sigma = torch.clamp(sigma, min=1e-12)
+                with torch.no_grad():
+                    pseudo_x0_pixel = self.vq_model.decode(
+                        pseudo_x0.detach().to(dtype=self.vq_model.dtype)
+                    ).sample
+                optimized_pixels = self._optimize_pixels(
+                    pseudo_x0_pixel,
+                    measurement,
+                    operator_fn,
+                    last_pixel_params if index == 0 else pixel_params,
+                )
+                with torch.no_grad():
+                    optimized_latents = self.vq_model.encode(
+                        optimized_pixels.to(dtype=self.vq_model.dtype)
+                    ).latents.to(dtype=self.unet.dtype)
+                    # `x_t_reference` is at the previous timestep, hence `a_prev`.
+                    next_latents = self._stochastic_resample(
+                        optimized_latents,
+                        x_t_reference,
+                        step_stats["a_prev"],
+                        sigma.to(dtype=self.unet.dtype),
+                        generator,
+                    )
+            impedance_latents = next_latents.detach()
+        with torch.no_grad():
+            impedance_samples = self.vq_model.decode(
+                impedance_latents.to(dtype=self.vq_model.dtype)
+            ).sample
+            # Optional VQ encode/decode round trip of a reference `image`.
+            impedance_reconstructed = None
+            if image is not None:
+                image = image.to(device=device, dtype=self.vq_model.dtype)
+                image_latents = self.vq_model.encode(image).latents
+                impedance_reconstructed = self.vq_model.decode(image_latents).sample
+        if output_type == "np":
+            impedance_samples = impedance_samples.detach().cpu().numpy()
+            impedance_latents = impedance_latents.detach().cpu().numpy()
+            impedance_dipin = impedance_dipin.detach().cpu().numpy()
+            record_features = record_features.detach().cpu().numpy()
+            if impedance_reconstructed is not None:
+                impedance_reconstructed = impedance_reconstructed.detach().cpu().numpy()
+        return SeismicImpInvLDDPMPipelineOutput(
+            impedance_samples=impedance_samples,
+            impedance_latents=impedance_latents,
+            impedance_dipin=impedance_dipin,
+            impedance_reconstructed=impedance_reconstructed,
+            record_features=record_features,
+        )