# LRGNet: Learnable Region Growing for Point Clouds
A **PyTorch** implementation of **LRGNet** — *Learnable Region Growing for Class-Agnostic Point Cloud Segmentation* — originally described in:
> **LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation**
> Jingdao Chen, Zsolt Kira, Yong K. Cho
> *IEEE Robotics and Automation Letters (RAL), 2021*
> [arXiv:2103.09160](https://arxiv.org/abs/2103.09160)
> Original code: [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow)
---
## What this repository does
This repo provides a **clean, modern PyTorch port** of the LRGNet algorithm that can be applied to **any PLY or PCD point cloud** without being tied to specific benchmark datasets (S3DIS, ScanNet, etc.).
- **Load** `.ply` or `.pcd` point clouds.
- **Preprocess** (voxel equalization, normal/curvature estimation).
- **Train** an `LrgNet` model on labeled point clouds.
- **Segment** unlabeled point clouds into class-agnostic object instances.
---
## Quick start
### 1. Install
```bash
pip install -r requirements.txt
```
### 2. Train on a labeled point cloud
You need a PLY/PCD file and a `.npy` file with shape `(N,)` containing integer instance labels per point.
```bash
# Stage training data
python scripts/stage_data.py \
--input my_scene.ply \
--labels my_scene_labels.npy \
--output my_scene_staged.h5 \
--resolution 0.1
# Train
python scripts/train.py \
--data_dir . \
--epochs 50 --batch_size 16 --lr 1e-3 \
--save_dir checkpoints
```
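The labels file is a plain NumPy array; a minimal sketch of producing one (synthetic instance IDs here, assuming the array order matches the point order in the PLY file):

```python
import numpy as np

# Suppose the cloud has N points; each point gets an integer instance ID.
N = 1000
labels = np.zeros(N, dtype=np.int64)
labels[400:700] = 1   # points 400..699 belong to instance 1
labels[700:] = 2      # remaining points belong to instance 2

# stage_data.py expects a 1-D integer array aligned with the point order
assert labels.shape == (N,)
np.save("my_scene_labels.npy", labels)
```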
### 3. Segment a new point cloud
```bash
python scripts/segment.py \
--input new_scene.ply \
--ckpt checkpoints/best_model.pt \
--output segmented.ply \
--device cuda
```
Open `segmented.ply` in [CloudCompare](https://www.danielgm.net/cc/) or [MeshLab](https://www.meshlab.net/) — each object instance is colored differently.
---
## How the algorithm works internally
### Classical region growing vs. LRGNet
| | Classical region growing (PCL) | **LRGNet** |
|---|---|---|
| **Similarity rule** | Hand-crafted threshold on normals / color / curvature | **Learned** by a neural network |
| **Direction** | Can only **add** points to the region | Can **add** AND **remove** points ("morphing") |
| **Class knowledge** | Often class-specific or needs tuning | **Class-agnostic** — works on any object |
| **Recovery** | None — one bad seed ruins the segment | Network learns to **recover** from mistakes |
### Step-by-step pipeline
```
Raw point cloud (.ply / .pcd)
          │
          ▼
┌────────────────────┐
│ Voxel equalization │ ← keep 1 point per voxel (default 0.1 m)
│ (removes density   │   prevents oversampled regions from dominating
│  bias)             │
└────────────────────┘
          │
          ▼
┌────────────────────┐
│ Normal + curvature │ ← PCA on 3×3×3 voxel neighborhood
│ estimation         │   flat areas → low curvature (good seeds)
└────────────────────┘
          │
          ▼
┌────────────────────┐
│ Build 13-D feature │ ← [x,y,z, room_x, room_y, room_z,
│ vector per point   │    r,g,b, nx,ny,nz, curvature]
└────────────────────┘
          │
          ▼
┌────────────────────┐
│ Region growing loop│ ← for each seed (sorted by curvature):
│                    │   1. Find boundary neighbors (6-connected voxels)
│ ┌───────────────┐  │   2. Sample 512 inliers + 512 neighbors
│ │ LrgNet        │  │   3. Center coordinates (translation invariance)
│ │ dual-branch   │  │   4. Forward pass → add logits + remove logits
│ │ 1D PointNet   │  │   5. Stochastic decision: accept / reject points
│ └───────────────┘  │   6. Repeat until "stuck" or max steps
│                    │
└────────────────────┘
          │
          ▼
   Colored PLY output
```
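The voxel-equalization stage can be sketched in a few lines of NumPy (a simplified stand-in for the repo's `preprocess.py`, keeping the first point seen in each occupied voxel):

```python
import numpy as np

def voxel_equalize(points: np.ndarray, resolution: float = 0.1) -> np.ndarray:
    """Keep one point per occupied voxel so dense regions don't dominate."""
    # Integer voxel coordinates for each point
    voxels = np.floor(points[:, :3] / resolution).astype(np.int64)
    # Index of the first point encountered in each unique voxel
    _, keep = np.unique(voxels, axis=0, return_index=True)
    return points[np.sort(keep)]

# Two near-duplicate points fall in the same 0.1 m voxel and collapse to one
pts = np.array([[0.01, 0.02, 0.0],
                [0.03, 0.04, 0.0],   # same voxel as the first point
                [0.51, 0.02, 0.0]])  # different voxel
reduced = voxel_equalize(pts, resolution=0.1)
```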
### Network architecture (`lrg_net.py`)
`LrgNet` is a **dual-branch 1D PointNet**:
- **Inlier branch** — processes the current region points.
- **Neighbor branch** — processes candidate boundary points.
Both branches run a stack of 1D convolutions:
```
Conv1D(13 -> 64) → Conv1D(64 -> 64) → Conv1D(64 -> 64)
  → Conv1D(64 -> 128) → Conv1D(128 -> 512)
```
After the deepest layer, a **global max-pool** extracts one feature vector per branch. These global vectors are tiled back to the point counts and concatenated with per-point features. Two classification heads then predict:
1. **`add_head`** — for each neighbor point: probability it should join the region.
2. **`remove_head`** — for each inlier point: probability it should leave the region.
Both heads use binary cross-entropy loss and are trained jointly.
> **Why two heads?**
> Classical region growing is monotonic — it can only expand. LRGNet can **shrink** the region too. This is critical for recovering from an early bad seed or an over-segmentation.
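A shape-level sketch of this dual-branch design (a simplified stand-in for `lrg_net.py` with fewer conv layers than the stack listed above, showing max-pool, tile-and-concatenate, and the two heads):

```python
import torch
import torch.nn as nn

class TinyLrgNet(nn.Module):
    """Simplified dual-branch PointNet: per-point 1D convs, global max-pool,
    tile-and-concatenate, then two per-point classification heads."""
    def __init__(self, in_dim: int = 13):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(in_dim, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 128, 1), nn.ReLU(),
                nn.Conv1d(128, 512, 1))
        self.inlier_branch = branch()    # current region points
        self.neighbor_branch = branch()  # candidate boundary points
        # Heads see per-point features (512) plus both global vectors (2 * 512)
        self.remove_head = nn.Conv1d(512 * 3, 1, 1)  # per-inlier logit
        self.add_head = nn.Conv1d(512 * 3, 1, 1)     # per-neighbor logit

    def forward(self, inliers, neighbors):
        # inliers: (B, 13, Ni), neighbors: (B, 13, Nn)
        fi = self.inlier_branch(inliers)          # (B, 512, Ni)
        fn = self.neighbor_branch(neighbors)      # (B, 512, Nn)
        gi = fi.max(dim=2, keepdim=True).values   # global inlier feature
        gn = fn.max(dim=2, keepdim=True).values   # global neighbor feature
        # Tile both global vectors back to per-point length and concatenate
        ctx_i = torch.cat([fi, gi.expand_as(fi),
                           gn.expand(-1, -1, fi.shape[2])], dim=1)
        ctx_n = torch.cat([fn, gn.expand_as(fn),
                           gi.expand(-1, -1, fn.shape[2])], dim=1)
        return self.remove_head(ctx_i).squeeze(1), self.add_head(ctx_n).squeeze(1)

model = TinyLrgNet()
remove_logits, add_logits = model(torch.randn(2, 13, 512), torch.randn(2, 13, 512))
# remove_logits: (2, 512) over inliers; add_logits: (2, 512) over neighbors
```

During training these logits would be paired with `nn.BCEWithLogitsLoss` against the binary add/remove labels, matching the joint binary cross-entropy objective described above.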
### Training data generation (`stage_data.py`)
Training is **fully supervised** from ground-truth instance labels:
1. For each object instance, pick `k` random seeds inside it.
2. Simulate region growing using the true labels.
3. At every step, record:
- the current (possibly noisy) inlier set
- the boundary neighbor set
- ground-truth binary labels: *add / don't add*, *remove / don't remove*
4. **Inject controlled noise** (`add_mistake_prob ≈ 0.2`, `remove_mistake_prob ≈ 0.2`)
   → the network sees "messy" regions and learns to clean them up.
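The noise-injection step amounts to randomly flipping a fraction of the ground-truth membership decisions; a minimal illustrative sketch (the function name is hypothetical, not the repo's API):

```python
import numpy as np

def inject_mistakes(labels, mistake_prob=0.2, rng=None):
    """Flip each binary add/remove decision with probability mistake_prob,
    so the network trains on imperfect regions it must learn to repair."""
    rng = rng or np.random.default_rng()
    flip = rng.random(labels.shape) < mistake_prob
    return np.where(flip, 1 - labels, labels)

rng = np.random.default_rng(0)
truth = np.ones(10_000, dtype=np.int64)      # "add" for every neighbor
noisy = inject_mistakes(truth, 0.2, rng)     # roughly 20% flipped to 0
```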
### Centering (translation invariance)
At **both** training and inference time, the XYZ coordinates of inliers and neighbors are **centered** by subtracting the median XY of the inlier set, and the feature channels (RGB, normals, curvature) are centered by subtracting their medians over the inlier set. This makes the network **translation invariant**: it responds only to local geometry and appearance, not to absolute room coordinates.
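A simplified sketch of the centering step (here all channels are centered by the inlier-set median, rather than handling XY and feature channels separately as the repo does):

```python
import numpy as np

def center_features(inliers, neighbors):
    """Subtract the inlier-set median from both point sets, removing
    absolute position/appearance while preserving local structure."""
    center = np.median(inliers, axis=0)          # per-channel inlier median
    return inliers - center, neighbors - center  # same offset for both sets

inliers = np.array([[5.0, 5.0, 1.0], [5.2, 5.1, 1.1], [5.1, 4.9, 0.9]])
neighbors = np.array([[5.3, 5.0, 1.0]])
ci, cn = center_features(inliers, neighbors)
# ci is centered near the origin; cn keeps its offset relative to the region
```

Centering the neighbors by the *inlier* median (not their own) is what preserves the relative geometry between the region and its boundary candidates.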
---
## Repository layout
```
learn_region_grow/
├── __init__.py       # Package metadata
├── io.py             # PLY / PCD loaders and savers + color maps
├── preprocess.py     # Voxel equalization, normal/curvature, feature vector
├── lrg_net.py        # PyTorch LrgNet model
├── growing.py        # Region growing inference engine
├── stage_data.py     # Training data generator from labeled clouds
├── train.py          # PyTorch training loop
└── utils.py          # Sampling, centering, clustering metrics
scripts/
├── segment.py        # CLI: run inference on a new point cloud
├── train.py          # CLI: train from staged H5 files
└── stage_data.py # CLI: generate staged H5 from labeled cloud
```
---
## Parameters & tuning tips
| Parameter | Default | Effect |
|---|---|---|
| `resolution` | `0.1` m | Voxel grid size. Smaller = more detail, slower. |
| `lite` | `0` | Model size: `0`=full, `1`=half, `2`=quarter channels. |
| `add_threshold` / `remove_threshold` | `0.5` | Confidence cutoffs for deterministic mode. |
| `stochastic` | `True` | Random sampling weighted by confidence (paper default). |
| `cluster_threshold` | `10` | Minimum points to keep a segment. |
| `seeds_per_instance` | `5` | Random seeds per object during training staging. |
| `add_mistake_prob` | `0.2` | Noise injected into "add" labels during staging. |
| `remove_mistake_prob` | `0.2` | Noise injected into "remove" labels during staging. |
### When to change `resolution`
- **Indoor scenes (S3DIS, ScanNet)**: `0.1` m works well.
- **Outdoor LiDAR (SemanticKITTI)**: use `0.3` m because objects are larger and sparser.
- **Small objects / high detail**: try `0.05` m.
### When to use `lite`
- `lite=0`: best accuracy, use on GPU.
- `lite=1` or `2`: faster inference, lower memory, slight accuracy drop.
---
## Differences from the original TensorFlow 1.x code
| Aspect | Original | This port |
|---|---|---|
| Framework | TensorFlow 1.x (`tf.compat.v1`) | **PyTorch** 2.x |
| Input format | H5 dumps from S3DIS / ScanNet | **Any PLY or PCD** |
| Model branches | Shared conv weights | Independent branches (clearer, no leakage) |
| Normals | 3×3×3 voxel window | Same, with k-NN fallback for sparse clouds |
| Seed ordering | Curvature ascending | Same |
| Loss | TF cross-entropy | PyTorch `BCEWithLogitsLoss` |
| Output | Colored PLY only | PLY + PCD, with label IDs |
---
## Citation
If you use this code, please cite the original paper:
```bibtex
@article{chen2021lrgnet,
title={LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation},
author={Chen, Jingdao and Kira, Zsolt and Cho, Yong K},
journal={IEEE Robotics and Automation Letters},
volume={6},
number={3},
pages={5205--5212},
year={2021},
publisher={IEEE}
}
```
---
## License
This port is released under the MIT license. The original repository is available at [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow).