# LRGNet: Learnable Region Growing for Point Clouds

A **PyTorch** implementation of **LRGNet** β€” *Learnable Region Growing for Class-Agnostic Point Cloud Segmentation* β€” originally described in:

> **LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation**  
> Jingdao Chen, Zsolt Kira, Yong K. Cho  
> *IEEE Robotics and Automation Letters (RAL), 2021*  
> [arXiv:2103.09160](https://arxiv.org/abs/2103.09160)  
> Original code: [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow)

---

## What this repository does

This repo provides a **clean, modern PyTorch port** of the LRGNet algorithm that can be applied to **any PLY or PCD point cloud** without being tied to specific benchmark datasets (S3DIS, ScanNet, etc.).

- **Load** `.ply` or `.pcd` point clouds.
- **Preprocess** (voxel equalization, normal/curvature estimation).
- **Train** an `LrgNet` model on labeled point clouds.
- **Segment** unlabeled point clouds into class-agnostic object instances.

---

## Quick start

### 1. Install

```bash
pip install -r requirements.txt
```

### 2. Train on a labeled point cloud

You need a PLY/PCD file and a `.npy` file with shape `(N,)` containing integer instance labels per point.

```bash
# Stage training data
python scripts/stage_data.py \
    --input my_scene.ply \
    --labels my_scene_labels.npy \
    --output my_scene_staged.h5 \
    --resolution 0.1

# Train
python scripts/train.py \
    --data_dir . \
    --epochs 50 --batch_size 16 --lr 1e-3 \
    --save_dir checkpoints
```
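
The labels file is just a flat integer array saved with numpy. A minimal sketch of producing one (the `instance_ids` values here are placeholders for whatever annotation source you have):

```python
import numpy as np
import os
import tempfile

# Hypothetical per-point instance IDs for a 5-point cloud:
# points 0-2 belong to object 0, points 3-4 to object 1.
instance_ids = np.array([0, 0, 0, 1, 1], dtype=np.int64)

# stage_data.py expects a flat (N,) integer array in a .npy file.
path = os.path.join(tempfile.gettempdir(), "my_scene_labels.npy")
np.save(path, instance_ids)

loaded = np.load(path)
```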

### 3. Segment a new point cloud

```bash
python scripts/segment.py \
    --input new_scene.ply \
    --ckpt checkpoints/best_model.pt \
    --output segmented.ply \
    --device cuda
```

Open `segmented.ply` in [CloudCompare](https://www.danielgm.net/cc/) or [MeshLab](https://www.meshlab.net/) β€” each object instance is colored differently.

---

## How the algorithm works internally

### Classical region growing vs. LRGNet

| | Classical region growing (PCL) | **LRGNet** |
|---|---|---|
| **Similarity rule** | Hand-crafted threshold on normals / color / curvature | **Learned** by a neural network |
| **Direction** | Can only **add** points to the region | Can **add** AND **remove** points ("morphing") |
| **Class knowledge** | Often class-specific or needs tuning | **Class-agnostic** β€” works on any object |
| **Recovery** | None β€” one bad seed ruins the segment | Network learns to **recover** from mistakes |

### Step-by-step pipeline

```
Raw point cloud (.ply / .pcd)
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Voxel equalization β”‚  ← keep 1 point per voxel (default 0.1 m)
β”‚  (removes density   β”‚     prevents oversampled regions from dominating
β”‚   bias)               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Normal + curvature  β”‚  ← PCA on 3Γ—3Γ—3 voxel neighborhood
β”‚ estimation          β”‚     flat areas β†’ low curvature (good seeds)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Build 13-D feature  β”‚  ← [x,y,z, room_x, room_y, room_z,
β”‚ vector per point    β”‚      r,g,b, nx,ny,nz, curvature]
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Region growing loop β”‚  ← for each seed (sorted by curvature):
β”‚                     β”‚     1. Find boundary neighbors (6-connected voxels)
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚     2. Sample 512 inliers + 512 neighbors
β”‚   β”‚   LrgNet      β”‚ β”‚     3. Center coordinates (translation invariance)
β”‚   β”‚  dual-branch  β”‚ β”‚     4. Forward pass β†’ add logits + remove logits
β”‚   β”‚  1D PointNet  β”‚ β”‚     5. Stochastic decision: accept / reject points
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚     6. Repeat until "stuck" or max steps
β”‚                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
   Colored PLY output
```
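
The first two boxes of the pipeline can be sketched with plain numpy. This is an illustration of the idea, not the repo's exact `preprocess.py` code; in particular, the k-NN PCA here stands in for the 3×3×3 voxel-window PCA:

```python
import numpy as np

def voxel_equalize(points, resolution=0.1):
    """Keep one representative point per occupied voxel to remove density bias."""
    keys = np.floor(points / resolution).astype(np.int64)
    # np.unique on rows gives the first index into each occupied voxel
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

def normal_and_curvature(points, k=8):
    """Per-point normal + curvature via PCA over k nearest neighbors."""
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    nn = np.argsort(dists, axis=1)[:, :k]
    normals = np.zeros_like(points)
    curvature = np.zeros(len(points))
    for i, idx in enumerate(nn):
        nbrs = points[idx] - points[idx].mean(axis=0)
        cov = nbrs.T @ nbrs / len(idx)
        eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
        eigval = np.clip(eigval, 0.0, None)           # guard tiny negatives
        normals[i] = eigvec[:, 0]                     # smallest-variance direction
        # flat areas -> lambda_0 / sum(lambda) near zero (good seeds)
        curvature[i] = eigval[0] / max(eigval.sum(), 1e-12)
    return normals, curvature

rng = np.random.default_rng(0)
pts = rng.random((200, 3))
pts_eq = voxel_equalize(pts, resolution=0.2)
n, c = normal_and_curvature(pts_eq)
```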

### Network architecture (`lrg_net.py`)

`LrgNet` is a **dual-branch 1D PointNet**:

- **Inlier branch** β€” processes the current region points.
- **Neighbor branch** β€” processes candidate boundary points.

Both branches run a stack of 1D convolutions:
```
Conv1D(13 -> 64) β†’ Conv1D(64 -> 64) β†’ Conv1D(64 -> 64)
      β†’ Conv1D(64 -> 128) β†’ Conv1D(128 -> 512)
```

After the deepest layer, a **global max-pool** extracts one feature vector per branch. Each global vector is then tiled back to the per-point resolution and concatenated with the per-point features. Two classification heads then predict:

1. **`add_head`** β€” for each neighbor point: probability it should join the region.
2. **`remove_head`** β€” for each inlier point: probability it should leave the region.

Both heads use binary cross-entropy loss and are trained jointly.

> **Why two heads?**  
> Classical region growing is monotonic β€” it can only expand. LRGNet can **shrink** the region too. This is critical for recovering from an early bad seed or an over-segmentation.
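
A compressed sketch of this architecture in PyTorch (illustrative only; the exact layer names and head layout in `lrg_net.py` may differ):

```python
import torch
import torch.nn as nn

class TinyLrgNet(nn.Module):
    """Minimal dual-branch 1D-PointNet sketch: two independent conv stacks,
    global max-pool per branch, tile-and-concat, then per-point heads."""
    def __init__(self, in_ch=13):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(in_ch, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 128, 1), nn.ReLU(),
                nn.Conv1d(128, 512, 1), nn.ReLU(),
            )
        self.inlier_branch, self.neighbor_branch = branch(), branch()
        # per-point features (512) + both global vectors (512 + 512) -> 1 logit
        self.add_head = nn.Conv1d(512 * 3, 1, 1)      # on neighbor points
        self.remove_head = nn.Conv1d(512 * 3, 1, 1)   # on inlier points

    def forward(self, inliers, neighbors):            # (B, 13, N) each
        fi = self.inlier_branch(inliers)              # (B, 512, Ni)
        fn = self.neighbor_branch(neighbors)          # (B, 512, Nn)
        gi = fi.max(dim=2, keepdim=True).values       # global inlier vector
        gn = fn.max(dim=2, keepdim=True).values       # global neighbor vector
        ctx_n = torch.cat([fn, gi.expand_as(fn), gn.expand_as(fn)], dim=1)
        ctx_i = torch.cat([fi, gi.expand_as(fi), gn.expand_as(fi)], dim=1)
        return self.add_head(ctx_n).squeeze(1), self.remove_head(ctx_i).squeeze(1)

net = TinyLrgNet()
add_logits, remove_logits = net(torch.randn(2, 13, 512), torch.randn(2, 13, 512))
```

Training would then apply `nn.BCEWithLogitsLoss` to both logit tensors against the binary add/remove labels, matching the joint loss described above.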

### Training data generation (`stage_data.py`)

Training is **fully supervised** from ground-truth instance labels:

1. For each object instance, pick `k` random seeds inside it.
2. Simulate region growing using the true labels.
3. At every step, record:
   - the current (possibly noisy) inlier set
   - the boundary neighbor set
   - ground-truth binary labels: *add / don't add*, *remove / don't remove*
4. **Inject controlled noise** (`add_mistake_prob β‰ˆ 0.2`, `remove_mistake_prob β‰ˆ 0.2`)  
   β†’ the network sees "messy" regions and learns to clean them up.
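
Step 4 amounts to flipping a random fraction of the simulated grow decisions. A small numpy sketch of the idea (the helper name is hypothetical, not the repo's exact staging code):

```python
import numpy as np

def perturb_decisions(add_decisions, remove_decisions,
                      add_mistake_prob=0.2, remove_mistake_prob=0.2, rng=None):
    """Flip a random fraction of the simulated add/remove decisions so the
    staged regions are imperfect and the network learns to repair them."""
    rng = rng if rng is not None else np.random.default_rng(0)
    add_flip = rng.random(add_decisions.shape) < add_mistake_prob
    rem_flip = rng.random(remove_decisions.shape) < remove_mistake_prob
    return (np.where(add_flip, 1 - add_decisions, add_decisions),
            np.where(rem_flip, 1 - remove_decisions, remove_decisions))

add_gt = np.ones(1000, dtype=np.int64)    # true decision: add every neighbor
rem_gt = np.zeros(1000, dtype=np.int64)   # true decision: remove nothing
noisy_add, noisy_rem = perturb_decisions(add_gt, rem_gt)
```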

### Centering (translation invariance)

At **both** training and inference time, the XYZ coordinates of inliers and neighbors are **centered** by subtracting the median XY of the inlier set, and feature channels (RGB, normals, curvature) are centered by subtracting their median over the inlier set. This makes the network **translation invariant** β€” it only cares about local geometry and appearance, not absolute room coordinates.
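
Concretely, the centering step looks roughly like this. A simplified numpy sketch assuming the 13-D feature layout above (xyz, room_xyz, rgb, normals, curvature); the repo's `utils.py` may differ in detail:

```python
import numpy as np

def center_features(inliers, neighbors):
    """Subtract inlier-set medians so features are translation invariant.
    Both arrays are (N, 13); the same offset is applied to inliers and
    neighbors so their relative geometry is preserved."""
    offset = np.zeros(13)
    offset[:2] = np.median(inliers[:, :2], axis=0)    # median x, y of inliers
    offset[6:] = np.median(inliers[:, 6:], axis=0)    # rgb, normals, curvature
    return inliers - offset, neighbors - offset

rng = np.random.default_rng(1)
inl = rng.random((512, 13)) + 5.0    # a region far from the origin
nbr = rng.random((256, 13)) + 5.0
inl_c, nbr_c = center_features(inl, nbr)
```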

---

## Repository layout

```
learn_region_grow/
β”œβ”€β”€ __init__.py          # Package metadata
β”œβ”€β”€ io.py                # PLY / PCD loaders and savers + color maps
β”œβ”€β”€ preprocess.py        # Voxel equalization, normal/curvature, feature vector
β”œβ”€β”€ lrg_net.py           # PyTorch LrgNet model
β”œβ”€β”€ growing.py           # Region growing inference engine
β”œβ”€β”€ stage_data.py        # Training data generator from labeled clouds
β”œβ”€β”€ train.py             # PyTorch training loop
└── utils.py             # Sampling, centering, clustering metrics

scripts/
β”œβ”€β”€ segment.py           # CLI: run inference on a new point cloud
β”œβ”€β”€ train.py             # CLI: train from staged H5 files
└── stage_data.py        # CLI: generate staged H5 from labeled cloud
```

---

## Parameters & tuning tips

| Parameter | Default | Effect |
|---|---|---|
| `resolution` | `0.1` m | Voxel grid size. Smaller = more detail, slower. |
| `lite` | `0` | Model size: `0`=full, `1`=half, `2`=quarter channels. |
| `add_threshold` / `remove_threshold` | `0.5` | Confidence cutoffs for deterministic mode. |
| `stochastic` | `True` | Random sampling weighted by confidence (paper default). |
| `cluster_threshold` | `10` | Minimum points to keep a segment. |
| `seeds_per_instance` | `5` | Random seeds per object during training staging. |
| `add_mistake_prob` | `0.2` | Noise injected into "add" labels during staging. |
| `remove_mistake_prob` | `0.2` | Noise injected into "remove" labels during staging. |
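
The difference between `stochastic` mode and the deterministic thresholds can be sketched as follows (illustrative numpy, not the repo's exact `growing.py` code):

```python
import numpy as np

def decide_add(add_logits, stochastic=True, add_threshold=0.5, rng=None):
    """Turn per-neighbor 'add' logits into a boolean accept mask."""
    rng = rng if rng is not None else np.random.default_rng(0)
    probs = 1.0 / (1.0 + np.exp(-add_logits))    # sigmoid -> probabilities
    if stochastic:
        # paper default: sample each point with probability = its confidence
        return rng.random(probs.shape) < probs
    return probs > add_threshold                  # deterministic cutoff

logits = np.array([-4.0, -0.1, 0.1, 4.0])
det = decide_add(logits, stochastic=False)   # only confident points pass
sto = decide_add(logits)                     # low-confidence points may still pass
```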

### When to change `resolution`
- **Indoor scenes (S3DIS, ScanNet)**: `0.1` m works well.
- **Outdoor LiDAR (Semantic KITTI)**: use `0.3` m because objects are larger and sparser.
- **Small objects / high detail**: try `0.05` m.

### When to use `lite`
- `lite=0`: best accuracy, use on GPU.
- `lite=1` or `2`: faster inference, lower memory, slight accuracy drop.

---

## Differences from the original TensorFlow 1.x code

| Aspect | Original | This port |
|---|---|---|
| Framework | TensorFlow 1.x (`tf.compat.v1`) | **PyTorch** 2.x |
| Input format | H5 dumps from S3DIS / ScanNet | **Any PLY or PCD** |
| Model branches | Shared conv weights | Independent branches (clearer, no leakage) |
| Normals | 3Γ—3Γ—3 voxel window | Same, with k-NN fallback for sparse clouds |
| Seed ordering | Curvature ascending | Same |
| Loss | TF cross-entropy | PyTorch `BCEWithLogitsLoss` |
| Output | Colored PLY only | PLY + PCD, with label IDs |

---

## Citation

If you use this code, please cite the original paper:

```bibtex
@article{chen2021lrgnet,
  title={LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation},
  author={Chen, Jingdao and Kira, Zsolt and Cho, Yong K},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={5205--5212},
  year={2021},
  publisher={IEEE}
}
```

---

## License

This port is released under the MIT license. The original repository is available at [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow).