# LRGNet: Learnable Region Growing for Point Clouds

A **PyTorch** implementation of **LRGNet** (*Learnable Region Growing for Class-Agnostic Point Cloud Segmentation*), originally described in:

> **LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation**
> Jingdao Chen, Zsolt Kira, Yong K. Cho
> *IEEE Robotics and Automation Letters (RA-L), 2021*
> [arXiv:2103.09160](https://arxiv.org/abs/2103.09160)
> Original code: [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow)

---

## What this repository does

This repo provides a **clean, modern PyTorch port** of the LRGNet algorithm that can be applied to **any PLY or PCD point cloud** without being tied to specific benchmark datasets (S3DIS, ScanNet, etc.).

- **Load** `.ply` or `.pcd` point clouds.
- **Preprocess** (voxel equalization, normal/curvature estimation).
- **Train** an `LrgNet` model on labeled point clouds.
- **Segment** unlabeled point clouds into class-agnostic object instances.

---

## Quick start

### 1. Install

```bash
pip install -r requirements.txt
```

### 2. Train on a labeled point cloud

You need a PLY/PCD file and a `.npy` file with shape `(N,)` containing integer instance labels per point.
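For example, a label file in this format can be produced with NumPy (the point count and instance IDs below are purely illustrative):

```python
import numpy as np

# One integer instance ID per point, in the same order as the points
# in my_scene.ply. The ID values are arbitrary, but all points of the
# same object instance must share the same ID.
labels = np.array([0, 0, 1, 1, 1, 2], dtype=np.int64)  # toy 6-point cloud
assert labels.shape == (6,)  # shape (N,), N = number of points

np.save("my_scene_labels.npy", labels)
```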

```bash
# Stage training data
python scripts/stage_data.py \
    --input my_scene.ply \
    --labels my_scene_labels.npy \
    --output my_scene_staged.h5 \
    --resolution 0.1

# Train
python scripts/train.py \
    --data_dir . \
    --epochs 50 --batch_size 16 --lr 1e-3 \
    --save_dir checkpoints
```

### 3. Segment a new point cloud

```bash
python scripts/segment.py \
    --input new_scene.ply \
    --ckpt checkpoints/best_model.pt \
    --output segmented.ply \
    --device cuda
```

Open `segmented.ply` in [CloudCompare](https://www.danielgm.net/cc/) or [MeshLab](https://www.meshlab.net/); each object instance is colored differently.

---

## How the algorithm works internally

### Classical region growing vs. LRGNet

|  | Classical region growing (PCL) | **LRGNet** |
|---|---|---|
| **Similarity rule** | Hand-crafted threshold on normals / color / curvature | **Learned** by a neural network |
| **Direction** | Can only **add** points to the region | Can **add** AND **remove** points ("morphing") |
| **Class knowledge** | Often class-specific or needs tuning | **Class-agnostic**: works on any object |
| **Recovery** | None; one bad seed ruins the segment | Network learns to **recover** from mistakes |
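To make the contrast concrete, here is a minimal classical region grower in the PCL style: starting from a seed, it keeps adding neighbors whose normals lie within a fixed angular threshold. This is a pure-NumPy sketch with a brute-force neighbor search, not code from this repo; note that it is add-only, so a wrong decision can never be undone.

```python
import numpy as np

def classical_region_grow(points, normals, seed, radius=0.15, angle_thresh_deg=15.0):
    """Grow a region from `seed`: admit any in-radius neighbor whose normal
    deviates from the current point's normal by less than a fixed angle."""
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    region = {seed}
    frontier = [seed]
    while frontier:
        idx = frontier.pop()
        dists = np.linalg.norm(points - points[idx], axis=1)
        for nb in np.where(dists < radius)[0]:
            if nb not in region and abs(normals[nb] @ normals[idx]) > cos_thresh:
                region.add(nb)
                frontier.append(nb)
    return sorted(region)

# Toy cloud: a flat patch (normals +z) next to a vertical patch (normals +x).
flat = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
wall = np.array([[0.3, 0.0, 0.1], [0.3, 0.0, 0.2]])
points = np.vstack([flat, wall])
normals = np.array([[0, 0, 1]] * 3 + [[1, 0, 0]] * 2, dtype=float)
print(classical_region_grow(points, normals, seed=0))  # flat patch only: [0, 1, 2]
```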

### Step-by-step pipeline

```
Raw point cloud (.ply / .pcd)
          │
          ▼
┌──────────────────────┐
│ Voxel equalization   │ ← keep 1 point per voxel (default 0.1 m);
│ (removes density     │   prevents oversampled regions from dominating
│  bias)               │
└──────────────────────┘
          │
          ▼
┌──────────────────────┐
│ Normal + curvature   │ ← PCA on 3×3×3 voxel neighborhood;
│ estimation           │   flat areas → low curvature (good seeds)
└──────────────────────┘
          │
          ▼
┌──────────────────────┐
│ Build 13-D feature   │ ← [x, y, z, room_x, room_y, room_z,
│ vector per point     │    r, g, b, nx, ny, nz, curvature]
└──────────────────────┘
          │
          ▼
┌──────────────────────┐
│ Region growing loop  │ ← for each seed (sorted by curvature):
│                      │   1. Find boundary neighbors (6-connected voxels)
│  ┌────────────────┐  │   2. Sample 512 inliers + 512 neighbors
│  │ LrgNet         │  │   3. Center coordinates (translation invariance)
│  │ dual-branch    │  │   4. Forward pass → add logits + remove logits
│  │ 1D PointNet    │  │   5. Stochastic decision: accept / reject points
│  └────────────────┘  │   6. Repeat until "stuck" or max steps
│                      │
└──────────────────────┘
          │
          ▼
Colored PLY output
```
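The first two preprocessing boxes above can be sketched in NumPy. This is a simplified stand-in for the repo's `preprocess.py` (function names and the exact curvature formula are illustrative; the real code works on a voxel neighborhood rather than an arbitrary point set):

```python
import numpy as np

def voxel_equalize(points, resolution=0.1):
    """Keep one point per occupied voxel, removing density bias."""
    keys = np.floor(points / resolution).astype(np.int64)
    _, keep = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(keep)]

def pca_normal_curvature(neighborhood):
    """PCA over a local neighborhood: the eigenvector of the smallest
    eigenvalue is the normal; curvature = l_min / (l_1 + l_2 + l_3)."""
    cov = np.cov(neighborhood.T)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues ascending
    return eigvecs[:, 0], eigvals[0] / eigvals.sum()

# Dense random cloud collapses to far fewer points after equalization.
pts = np.random.default_rng(0).uniform(0, 1, size=(1000, 3))
sub = voxel_equalize(pts, resolution=0.1)
print(len(sub))

# A perfectly flat neighborhood has a vertical normal and ~zero curvature,
# which is why low-curvature points make good seeds.
flat = np.c_[np.random.default_rng(1).uniform(0, 1, (50, 2)), np.zeros(50)]
n, c = pca_normal_curvature(flat)
```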

### Network architecture (`lrg_net.py`)

`LrgNet` is a **dual-branch 1D PointNet**:

- **Inlier branch**: processes the current region points.
- **Neighbor branch**: processes candidate boundary points.

Both branches run a stack of 1D convolutions:

```
Conv1D(13 -> 64) → Conv1D(64 -> 64) → Conv1D(64 -> 64)
  → Conv1D(64 -> 128) → Conv1D(128 -> 512)
```

After the deepest layer, a **global max-pool** extracts one feature vector per branch. These global vectors are tiled back to the point counts and concatenated with per-point features. Two classification heads then predict:

1. **`add_head`**: for each neighbor point, the probability that it should join the region.
2. **`remove_head`**: for each inlier point, the probability that it should leave the region.

Both heads use binary cross-entropy loss and are trained jointly.
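A condensed PyTorch sketch of this architecture follows. The layer widths match the description above, but exactly how the two global vectors are combined with per-point features before each head is an assumption of this sketch; see `lrg_net.py` for the real implementation.

```python
import torch
import torch.nn as nn

class LrgNetSketch(nn.Module):
    """Dual-branch 1D PointNet: per-branch convs -> global max-pool ->
    tile + concat with per-point features -> add / remove heads."""
    def __init__(self, in_ch=13):
        super().__init__()
        def branch():
            layers, chans = [], [in_ch, 64, 64, 64, 128, 512]
            for a, b in zip(chans[:-1], chans[1:]):
                layers += [nn.Conv1d(a, b, 1), nn.ReLU()]
            return nn.Sequential(*layers)
        self.inlier_branch = branch()
        self.neighbor_branch = branch()
        # each head sees its own per-point 512-d feature plus both globals
        self.add_head = nn.Conv1d(512 * 3, 1, 1)     # scores neighbor points
        self.remove_head = nn.Conv1d(512 * 3, 1, 1)  # scores inlier points

    def forward(self, inliers, neighbors):           # (B, 13, N), (B, 13, M)
        fi = self.inlier_branch(inliers)             # (B, 512, N)
        fn = self.neighbor_branch(neighbors)         # (B, 512, M)
        gi = fi.max(dim=2, keepdim=True).values      # global inlier feature
        gn = fn.max(dim=2, keepdim=True).values      # global neighbor feature
        add_in = torch.cat([fn, gi.expand_as(fn), gn.expand_as(fn)], dim=1)
        rem_in = torch.cat([fi, gi.expand_as(fi), gn.expand_as(fi)], dim=1)
        # raw logits, suitable for BCEWithLogitsLoss
        return self.add_head(add_in).squeeze(1), self.remove_head(rem_in).squeeze(1)

model = LrgNetSketch()
add_logits, remove_logits = model(torch.randn(2, 13, 512), torch.randn(2, 13, 512))
```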

> **Why two heads?**
> Classical region growing is monotonic: it can only expand. LRGNet can **shrink** the region too. This is critical for recovering from an early bad seed or an over-segmentation.

### Training data generation (`stage_data.py`)

Training is **fully supervised** from ground-truth instance labels:

1. For each object instance, pick `k` random seeds inside it.
2. Simulate region growing using the true labels.
3. At every step, record:
   - the current (possibly noisy) inlier set
   - the boundary neighbor set
   - ground-truth binary labels: *add / don't add*, *remove / don't remove*
4. **Inject controlled noise** (`add_mistake_prob ≈ 0.2`, `remove_mistake_prob ≈ 0.2`)
   → the network sees "messy" regions and learns to clean them up.
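One step of this staging loop can be sketched as below. The helper name and signature are illustrative, and the neighbor bookkeeping is simplified (the real code tracks voxel connectivity); the point is how ground-truth add/remove labels are recorded while the *region state* itself is advanced with label-flipping noise:

```python
import numpy as np

def stage_step(gt_labels, region, candidates, target,
               add_mistake_prob=0.2, remove_mistake_prob=0.2, rng=None):
    """Record one training sample: add/remove labels for the current
    region, then advance the region with controlled mistakes so the
    network later learns to recover from messy states."""
    rng = rng if rng is not None else np.random.default_rng()
    add_labels = (gt_labels[candidates] == target).astype(np.float32)
    remove_labels = (gt_labels[region] != target).astype(np.float32)
    # Follow the ground truth, except with probability `*_mistake_prob`
    # do the opposite (wrongly add / wrongly keep or drop a point).
    grow = [c for c, a in zip(candidates, add_labels)
            if (a == 1) != (rng.random() < add_mistake_prob)]
    keep = [p for p, r in zip(region, remove_labels)
            if (r == 0) != (rng.random() < remove_mistake_prob)]
    return add_labels, remove_labels, sorted(set(keep) | set(grow))

# With zero noise the transition is exact: wrong points leave, right ones join.
gt = np.array([1, 1, 1, 2, 2])
add, rem, new_region = stage_step(gt, region=[0, 3], candidates=[1, 4],
                                  target=1, add_mistake_prob=0.0,
                                  remove_mistake_prob=0.0)
```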

### Centering (translation invariance)

At **both** training and inference time, the XYZ coordinates of inliers and neighbors are **centered** by subtracting the median XY of the inlier set, and the feature channels (RGB, normals, curvature) are centered by subtracting their medians over the inlier set. This makes the network **translation invariant**: it only cares about local geometry and appearance, not absolute room coordinates.
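A minimal version of this centering, with the 13-D feature vector shrunk to four columns `[x, y, z, feature]` for readability (the real code applies the same idea to the full feature layout):

```python
import numpy as np

def center_features(inliers, neighbors):
    """Subtract inlier medians so absolute room coordinates vanish:
    the XY columns are centered by the inlier median XY, and every
    feature column by its inlier median. Z is left untouched here."""
    center = np.zeros(inliers.shape[1])
    center[:2] = np.median(inliers[:, :2], axis=0)  # XY only
    center[3:] = np.median(inliers[:, 3:], axis=0)  # feature channels
    return inliers - center, neighbors - center

inl = np.array([[2.0, 3.0, 0.5, 10.0],
                [4.0, 5.0, 0.7, 20.0]])
nbr = np.array([[3.0, 4.0, 0.6, 30.0]])
ci, cn = center_features(inl, nbr)
# Shifting the whole scene by +100 m in XY yields identical centered input.
ci2, cn2 = center_features(inl + [100, 100, 0, 0], nbr + [100, 100, 0, 0])
```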

---

## Repository layout

```
learn_region_grow/
├── __init__.py      # Package metadata
├── io.py            # PLY / PCD loaders and savers + color maps
├── preprocess.py    # Voxel equalization, normal/curvature, feature vector
├── lrg_net.py       # PyTorch LrgNet model
├── growing.py       # Region growing inference engine
├── stage_data.py    # Training data generator from labeled clouds
├── train.py         # PyTorch training loop
└── utils.py         # Sampling, centering, clustering metrics

scripts/
├── segment.py       # CLI: run inference on a new point cloud
├── train.py         # CLI: train from staged H5 files
└── stage_data.py    # CLI: generate staged H5 from labeled cloud
```

---

## Parameters & tuning tips

| Parameter | Default | Effect |
|---|---|---|
| `resolution` | `0.1` m | Voxel grid size. Smaller = more detail, slower. |
| `lite` | `0` | Model size: `0`=full, `1`=half, `2`=quarter channels. |
| `add_threshold` / `remove_threshold` | `0.5` | Confidence cutoffs for deterministic mode. |
| `stochastic` | `True` | Random sampling weighted by confidence (paper default). |
| `cluster_threshold` | `10` | Minimum points to keep a segment. |
| `seeds_per_instance` | `5` | Random seeds per object during training staging. |
| `add_mistake_prob` | `0.2` | Noise injected into "add" labels during staging. |
| `remove_mistake_prob` | `0.2` | Noise injected into "remove" labels during staging. |
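The difference between `stochastic` mode and the threshold-based deterministic mode comes down to how per-point probabilities are turned into accept/reject decisions (illustrative NumPy sketch, not the repo's code):

```python
import numpy as np

def decide(probs, stochastic=True, threshold=0.5, rng=None):
    """Deterministic mode: accept every point above `threshold`.
    Stochastic mode (paper default): draw each decision as a Bernoulli
    sample weighted by the network's confidence, so borderline points
    are sometimes taken and sometimes not."""
    if stochastic:
        rng = rng if rng is not None else np.random.default_rng()
        return rng.random(len(probs)) < probs
    return probs > threshold

probs = np.array([0.9, 0.55, 0.4, 0.1])
print(decide(probs, stochastic=False))  # [ True  True False False]
```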

### When to change `resolution`
- **Indoor scenes (S3DIS, ScanNet)**: `0.1` m works well.
- **Outdoor LiDAR (Semantic KITTI)**: use `0.3` m because objects are larger and sparser.
- **Small objects / high detail**: try `0.05` m.

### When to use `lite`
- `lite=0`: best accuracy, use on GPU.
- `lite=1` or `2`: faster inference, lower memory, slight accuracy drop.

---

## Differences from the original TensorFlow 1.x code

| Aspect | Original | This port |
|---|---|---|
| Framework | TensorFlow 1.x (`tf.compat.v1`) | **PyTorch** 2.x |
| Input format | H5 dumps from S3DIS / ScanNet | **Any PLY or PCD** |
| Model branches | Shared conv weights | Independent branches (clearer, no leakage) |
| Normals | 3×3×3 voxel window | Same, with k-NN fallback for sparse clouds |
| Seed ordering | Curvature ascending | Same |
| Loss | TF cross-entropy | PyTorch `BCEWithLogitsLoss` |
| Output | Colored PLY only | PLY + PCD, with label IDs |

---

## Citation

If you use this code, please cite the original paper:

```bibtex
@article{chen2021lrgnet,
  title={LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation},
  author={Chen, Jingdao and Kira, Zsolt and Cho, Yong K},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={5205--5212},
  year={2021},
  publisher={IEEE}
}
```

---

## License

This port is released under the MIT license. The original repository is available at [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow).