LRGNet: Learnable Region Growing for Point Clouds
A PyTorch implementation of LRGNet (Learnable Region Growing for Class-Agnostic Point Cloud Segmentation), originally described in:
LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation
Jingdao Chen, Zsolt Kira, Yong K. Cho
IEEE Robotics and Automation Letters (RAL), 2021
arXiv:2103.09160
Original code: jingdao/learn_region_grow
What this repository does
This repo provides a clean, modern PyTorch port of the LRGNet algorithm that can be applied to any PLY or PCD point cloud without being tied to specific benchmark datasets (S3DIS, ScanNet, etc.).
- Load `.ply` or `.pcd` point clouds.
- Preprocess (voxel equalization, normal/curvature estimation).
- Train an `LrgNet` model on labeled point clouds.
- Segment unlabeled point clouds into class-agnostic object instances.
Quick start
1. Install
```bash
pip install -r requirements.txt
```
2. Train on a labeled point cloud
You need a PLY/PCD file and a .npy file with shape (N,) containing integer instance labels per point.
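For clarity, here is a minimal sketch of what a valid labels file looks like: one integer instance ID per point, saved as a 1-D `.npy` array (the file names are just examples).

```python
import numpy as np

# Hypothetical example: build an instance-label array for a cloud of 400 points.
# Points 0..99 belong to instance 0, 100..249 to instance 1, the rest to instance 2.
n_points = 400
labels = np.zeros(n_points, dtype=np.int64)
labels[100:250] = 1
labels[250:] = 2

assert labels.shape == (n_points,)      # one integer label per point
np.save("my_scene_labels.npy", labels)  # consumed by scripts/stage_data.py
```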
```bash
# Stage training data
python scripts/stage_data.py \
    --input my_scene.ply \
    --labels my_scene_labels.npy \
    --output my_scene_staged.h5 \
    --resolution 0.1

# Train
python scripts/train.py \
    --data_dir . \
    --epochs 50 --batch_size 16 --lr 1e-3 \
    --save_dir checkpoints
```
3. Segment a new point cloud
```bash
python scripts/segment.py \
    --input new_scene.ply \
    --ckpt checkpoints/best_model.pt \
    --output segmented.ply \
    --device cuda
```
Open `segmented.ply` in CloudCompare or MeshLab; each object instance is colored differently.
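The per-instance coloring can be produced with a simple palette lookup. This is a hedged sketch, not the exact color map in the repo's `io.py`: the function name and hash-free random palette are assumptions.

```python
import numpy as np

def instance_colors(labels: np.ndarray) -> np.ndarray:
    """Map each instance ID to a distinct RGB color (uint8).

    A simple seeded-random palette; the repo's io.py color maps may differ.
    """
    rng = np.random.default_rng(0)  # fixed seed -> stable colors across runs
    ids = np.unique(labels)
    palette = rng.integers(0, 256, size=(len(ids), 3), dtype=np.uint8)
    lookup = {int(i): palette[k] for k, i in enumerate(ids)}
    return np.stack([lookup[int(l)] for l in labels])

labels = np.array([0, 0, 1, 2, 1])
colors = instance_colors(labels)
assert colors.shape == (5, 3)
assert (colors[2] == colors[4]).all()  # same instance -> same color
```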
How the algorithm works internally
Classical region growing vs. LRGNet
| | Classical region growing (PCL) | LRGNet |
|---|---|---|
| Similarity rule | Hand-crafted threshold on normals / color / curvature | Learned by a neural network |
| Direction | Can only add points to the region | Can add AND remove points ("morphing") |
| Class knowledge | Often class-specific or needs tuning | Class-agnostic; works on any object |
| Recovery | None; one bad seed ruins the segment | Network learns to recover from mistakes |
Step-by-step pipeline
```
Raw point cloud (.ply / .pcd)
          │
          ▼
┌─────────────────────┐
│ Voxel equalization  │ ← keep 1 point per voxel (default 0.1 m);
│ (removes density    │   prevents oversampled regions from dominating
│  bias)              │
└─────────────────────┘
          │
          ▼
┌─────────────────────┐
│ Normal + curvature  │ ← PCA on 3×3×3 voxel neighborhood;
│ estimation          │   flat areas → low curvature (good seeds)
└─────────────────────┘
          │
          ▼
┌─────────────────────┐
│ Build 13-D feature  │ ← [x, y, z, room_x, room_y, room_z,
│ vector per point    │    r, g, b, nx, ny, nz, curvature]
└─────────────────────┘
          │
          ▼
┌─────────────────────┐
│ Region growing loop │ ← for each seed (sorted by curvature):
│                     │   1. Find boundary neighbors (6-connected voxels)
│  ┌───────────────┐  │   2. Sample 512 inliers + 512 neighbors
│  │    LrgNet     │  │   3. Center coordinates (translation invariance)
│  │  dual-branch  │  │   4. Forward pass → add logits + remove logits
│  │  1D PointNet  │  │   5. Stochastic decision: accept / reject points
│  └───────────────┘  │   6. Repeat until "stuck" or max steps
└─────────────────────┘
          │
          ▼
Colored PLY output
```
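The normal + curvature stage of the pipeline can be sketched with plain PCA. This is a minimal numpy illustration; the repo's `preprocess.py` gathers the neighborhood from a 3×3×3 voxel window and may differ in detail.

```python
import numpy as np

def normal_and_curvature(neighbors: np.ndarray):
    """Estimate a surface normal and curvature for a point from its
    neighborhood via PCA (sketch of the pipeline's normal/curvature stage).

    neighbors: (K, 3) array of XYZ coordinates around the query point.
    """
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered / len(neighbors)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    normal = eigvecs[:, 0]                   # direction of least variance
    curvature = eigvals[0] / eigvals.sum()   # surface variation, 0 = flat
    return normal, curvature

# A perfectly flat patch in the XY plane: normal is +/- Z, curvature ~ 0,
# i.e. exactly the kind of point that makes a good seed.
xx, yy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
patch = np.stack([xx.ravel(), yy.ravel(), np.zeros(25)], axis=1)
n, c = normal_and_curvature(patch)
assert abs(abs(n[2]) - 1.0) < 1e-6 and c < 1e-9
```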
Network architecture (lrg_net.py)
LrgNet is a dual-branch 1D PointNet:
- Inlier branch β processes the current region points.
- Neighbor branch β processes candidate boundary points.
Both branches run a stack of 1D convolutions:
```
Conv1D(13 -> 64) -> Conv1D(64 -> 64) -> Conv1D(64 -> 64)
    -> Conv1D(64 -> 128) -> Conv1D(128 -> 512)
```
After the deepest layer, a global max-pool extracts one feature vector per branch. These global vectors are tiled back to the point counts and concatenated with per-point features. Two classification heads then predict:
- `add_head`: for each neighbor point, the probability that it should join the region.
- `remove_head`: for each inlier point, the probability that it should leave the region.
Both heads use binary cross-entropy loss and are trained jointly.
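The architecture described above can be sketched in a few dozen lines of PyTorch. The layer widths follow the text (13 → 64 → 64 → 64 → 128 → 512, global max-pool, tile, concatenate, two heads); the exact head shapes and concatenation order are assumptions, not the repo's `lrg_net.py`.

```python
import torch
import torch.nn as nn

class LrgNetSketch(nn.Module):
    """Minimal sketch of the dual-branch 1D-PointNet described above."""
    def __init__(self, in_dim: int = 13):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(in_dim, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 128, 1), nn.ReLU(),
                nn.Conv1d(128, 512, 1), nn.ReLU(),
            )
        self.inlier_branch = branch()    # independent weights per branch
        self.neighbor_branch = branch()
        # per-point feature (512) + tiled global vectors of both branches (2 x 512)
        self.add_head = nn.Conv1d(512 * 3, 1, 1)     # one logit per neighbor
        self.remove_head = nn.Conv1d(512 * 3, 1, 1)  # one logit per inlier

    def forward(self, inliers, neighbors):
        # inliers: (B, 13, Ni), neighbors: (B, 13, Nn)
        fi = self.inlier_branch(inliers)          # (B, 512, Ni)
        fn = self.neighbor_branch(neighbors)      # (B, 512, Nn)
        gi = fi.max(dim=2, keepdim=True).values   # global max-pool per branch
        gn = fn.max(dim=2, keepdim=True).values
        # tile the global vectors back to the point counts and concatenate
        ctx_n = torch.cat([fn, gi.expand(-1, -1, fn.shape[2]),
                           gn.expand(-1, -1, fn.shape[2])], dim=1)
        ctx_i = torch.cat([fi, gi.expand(-1, -1, fi.shape[2]),
                           gn.expand(-1, -1, fi.shape[2])], dim=1)
        add_logits = self.add_head(ctx_n).squeeze(1)       # (B, Nn)
        remove_logits = self.remove_head(ctx_i).squeeze(1)  # (B, Ni)
        return add_logits, remove_logits

model = LrgNetSketch()
add_l, rem_l = model(torch.randn(2, 13, 512), torch.randn(2, 13, 512))
assert add_l.shape == (2, 512) and rem_l.shape == (2, 512)
```

The logits feed `nn.BCEWithLogitsLoss` directly, so no sigmoid is applied inside the model.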
Why two heads?
Classical region growing is monotonic: it can only expand. LRGNet can shrink the region too, which is critical for recovering from an early bad seed or an over-segmentation.
Training data generation (stage_data.py)
Training is fully supervised from ground-truth instance labels:
- For each object instance, pick `k` random seeds inside it.
- Simulate region growing using the true labels.
- At every step, record:
  - the current (possibly noisy) inlier set
  - the boundary neighbor set
  - ground-truth binary labels: add / don't add, remove / don't remove
- Inject controlled noise (`add_mistake_prob = 0.2`, `remove_mistake_prob = 0.2`) so the network sees "messy" regions and learns to clean them up.
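One staging step can be sketched as follows. This is an illustration of the labeling-plus-noise idea, not the repo's `stage_data.py`; the function name and argument layout are assumptions.

```python
import numpy as np

def stage_step(instance_labels, inlier_idx, neighbor_idx, target_id,
               add_mistake_prob=0.2, remove_mistake_prob=0.2, rng=None):
    """Derive add/remove targets from ground truth and inject label noise
    (a sketch of one staging step, not the repo's exact stage_data.py)."""
    rng = rng or np.random.default_rng(0)
    instance_labels = np.asarray(instance_labels)
    # ground truth: a neighbor should be added iff it belongs to the target
    # instance; an inlier should be removed iff it does not
    add_target = instance_labels[neighbor_idx] == target_id
    remove_target = instance_labels[inlier_idx] != target_id
    # controlled noise: flip a fraction of the labels so the network trains
    # on "messy" regions and learns to clean them up
    add_noisy = add_target ^ (rng.random(len(add_target)) < add_mistake_prob)
    remove_noisy = remove_target ^ (rng.random(len(remove_target)) < remove_mistake_prob)
    return add_noisy, remove_noisy, add_target, remove_target

labels = np.array([0, 0, 0, 1, 1, 2])
add_n, rem_n, add_t, rem_t = stage_step(labels, inlier_idx=[0, 1, 3],
                                        neighbor_idx=[2, 4, 5], target_id=0)
assert list(add_t) == [True, False, False]   # only point 2 is instance 0
assert list(rem_t) == [False, False, True]   # inlier 3 is a wrong-instance point
```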
Centering (translation invariance)
At both training and inference time, the XYZ coordinates of inliers and neighbors are centered by subtracting the median XY of the inlier set, and the feature channels (RGB, normals, curvature) are centered by subtracting their medians over the inlier set. This makes the network translation invariant: it only cares about local geometry and appearance, not absolute room coordinates.
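A minimal sketch of this centering, assuming a column layout of `[x, y, z, features...]` (the exact layout and channel handling in `utils.py` may differ):

```python
import numpy as np

def center(inliers: np.ndarray, neighbors: np.ndarray):
    """Subtract the inlier set's median XY and per-channel feature medians
    from BOTH sets (a sketch of the centering step described above).
    Columns 0-2 are XYZ; columns 3+ are features (RGB, normals, curvature)."""
    offset = np.median(inliers, axis=0)
    offset[2] = 0.0  # the text centers median XY, so leave absolute Z alone
    return inliers - offset, neighbors - offset

inl = np.array([[1.0, 2.0, 3.0, 0.5],
                [3.0, 4.0, 5.0, 0.7],
                [5.0, 6.0, 7.0, 0.9]])
nbr = np.array([[2.0, 3.0, 4.0, 0.6]])
ci, cn = center(inl, nbr)
assert np.allclose(ci[1], [0.0, 0.0, 5.0, 0.0])    # median row lands at origin (Z kept)
assert np.allclose(cn[0], [-1.0, -1.0, 4.0, -0.1])  # neighbors use the SAME offset
```

Using the inlier medians for both sets is the point: neighbors are expressed relative to the current region, so the network sees the same input whether the region sits at the room's corner or its center.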
Repository layout
```
learn_region_grow/
├── __init__.py     # Package metadata
├── io.py           # PLY / PCD loaders and savers + color maps
├── preprocess.py   # Voxel equalization, normal/curvature, feature vector
├── lrg_net.py      # PyTorch LrgNet model
├── growing.py      # Region growing inference engine
├── stage_data.py   # Training data generator from labeled clouds
├── train.py        # PyTorch training loop
└── utils.py        # Sampling, centering, clustering metrics
scripts/
├── segment.py      # CLI: run inference on a new point cloud
├── train.py        # CLI: train from staged H5 files
└── stage_data.py   # CLI: generate staged H5 from labeled cloud
```
Parameters & tuning tips
| Parameter | Default | Effect |
|---|---|---|
| `resolution` | 0.1 m | Voxel grid size. Smaller = more detail, slower. |
| `lite` | 0 | Model size: 0 = full, 1 = half, 2 = quarter channels. |
| `add_threshold` / `remove_threshold` | 0.5 | Confidence cutoffs for deterministic mode. |
| `stochastic` | True | Random sampling weighted by confidence (paper default). |
| `cluster_threshold` | 10 | Minimum points to keep a segment. |
| `seeds_per_instance` | 5 | Random seeds per object during training staging. |
| `add_mistake_prob` | 0.2 | Noise injected into "add" labels during staging. |
| `remove_mistake_prob` | 0.2 | Noise injected into "remove" labels during staging. |
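The `stochastic` / `add_threshold` interaction can be sketched as a single decision function. This is an illustration of the two modes described above, not the repo's exact implementation:

```python
import numpy as np

def select_points(probs, stochastic=True, threshold=0.5, rng=None):
    """Decide which candidate points to accept from per-point probabilities.

    stochastic=True samples each point with its predicted probability (the
    paper default); stochastic=False applies a hard threshold instead.
    """
    probs = np.asarray(probs, dtype=float)
    if stochastic:
        rng = rng or np.random.default_rng(0)
        return rng.random(probs.shape) < probs  # confidence-weighted coin flip
    return probs >= threshold                   # deterministic cutoff

mask = select_points([0.9, 0.1, 0.6], stochastic=False)
assert list(mask) == [True, False, True]
# probabilities of exactly 1.0 / 0.0 are deterministic even in stochastic mode
assert select_points([1.0, 0.0]).tolist() == [True, False]
```

Stochastic selection lets low-confidence points occasionally enter the region, which the remove head can later undo; a hard threshold never explores those points at all.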
When to change resolution
- Indoor scenes (S3DIS, ScanNet): `0.1` m works well.
- Outdoor LiDAR (SemanticKITTI): use `0.3` m because objects are larger and sparser.
- Small objects / high detail: try `0.05` m.
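The effect of `resolution` is easy to see in the voxel-equalization step itself. A minimal sketch (the repo's `preprocess.py` may pick the per-voxel representative point differently):

```python
import numpy as np

def voxel_equalize(points: np.ndarray, resolution: float) -> np.ndarray:
    """Keep one point per voxel of the given size (a sketch of the
    voxel-equalization step). points: (N, 3) XYZ array."""
    voxels = np.floor(points / resolution).astype(np.int64)
    _, keep = np.unique(voxels, axis=0, return_index=True)  # first point per voxel
    return points[np.sort(keep)]

pts = np.array([[0.01, 0.02, 0.0],   # same 0.1 m voxel as the next point
                [0.03, 0.05, 0.0],
                [0.25, 0.00, 0.0]])  # a different voxel
assert len(voxel_equalize(pts, resolution=0.1)) == 2  # finer grid keeps more
assert len(voxel_equalize(pts, resolution=0.5)) == 1  # coarser grid merges all
```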
When to use lite
- `lite=0`: best accuracy; use on a GPU.
- `lite=1` or `lite=2`: faster inference, lower memory, slight accuracy drop.
Differences from the original TensorFlow 1.x code
| Aspect | Original | This port |
|---|---|---|
| Framework | TensorFlow 1.x (`tf.compat.v1`) | PyTorch 2.x |
| Input format | H5 dumps from S3DIS / ScanNet | Any PLY or PCD |
| Model branches | Shared conv weights | Independent branches (clearer, no leakage) |
| Normals | 3×3×3 voxel window | Same, with k-NN fallback for sparse clouds |
| Seed ordering | Curvature ascending | Same |
| Loss | TF cross-entropy | PyTorch BCEWithLogitsLoss |
| Output | Colored PLY only | PLY + PCD, with label IDs |
Citation
If you use this code, please cite the original paper:
```bibtex
@article{chen2021lrgnet,
  title={LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation},
  author={Chen, Jingdao and Kira, Zsolt and Cho, Yong K},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={5205--5212},
  year={2021},
  publisher={IEEE}
}
```
License
This port is released under the MIT license. The original repository is available at jingdao/learn_region_grow.