# LRGNet: Learnable Region Growing for Point Clouds

A **PyTorch** implementation of **LRGNet** β€” *Learnable Region Growing for Class-Agnostic Point Cloud Segmentation* β€” originally described in:

> **LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation**  
> Jingdao Chen, Zsolt Kira, Yong K. Cho  
> *IEEE Robotics and Automation Letters (RAL), 2021*  
> [arXiv:2103.09160](https://arxiv.org/abs/2103.09160)  
> Original code: [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow)

---

## What this repository does

This repo provides a **clean, modern PyTorch port** of the LRGNet algorithm that can be applied to **any PLY or PCD point cloud** without being tied to specific benchmark datasets (S3DIS, ScanNet, etc.).

- **Load** `.ply` or `.pcd` point clouds.
- **Preprocess** (voxel equalization, normal/curvature estimation).
- **Train** an `LrgNet` model on labeled point clouds.
- **Segment** unlabeled point clouds into class-agnostic object instances.

---

## Quick start

### 1. Install

```bash
pip install -r requirements.txt
```

### 2. Train on a labeled point cloud

You need a PLY/PCD file and a `.npy` file with shape `(N,)` containing integer instance labels per point.

```bash
# Stage training data
python scripts/stage_data.py \
    --input my_scene.ply \
    --labels my_scene_labels.npy \
    --output my_scene_staged.h5 \
    --resolution 0.1

# Train
python scripts/train.py \
    --data_dir . \
    --epochs 50 --batch_size 16 --lr 1e-3 \
    --save_dir checkpoints
```
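
The labels file is just a flat integer array saved with numpy. A minimal sketch of producing one (the `instance_ids` values here are placeholders for whatever annotation source you have):

```python
import numpy as np
import os
import tempfile

# Hypothetical per-point instance IDs for a 5-point cloud:
# points 0-2 belong to object 0, points 3-4 to object 1.
instance_ids = np.array([0, 0, 0, 1, 1], dtype=np.int64)

# stage_data.py expects a flat (N,) integer array in a .npy file.
path = os.path.join(tempfile.gettempdir(), "my_scene_labels.npy")
np.save(path, instance_ids)

loaded = np.load(path)
```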

### 3. Segment a new point cloud

```bash
python scripts/segment.py \
    --input new_scene.ply \
    --ckpt checkpoints/best_model.pt \
    --output segmented.ply \
    --device cuda
```

Open `segmented.ply` in [CloudCompare](https://www.danielgm.net/cc/) or [MeshLab](https://www.meshlab.net/) β€” each object instance is colored differently.

---

## How the algorithm works internally

### Classical region growing vs. LRGNet

| | Classical region growing (PCL) | **LRGNet** |
|---|---|---|
| **Similarity rule** | Hand-crafted threshold on normals / color / curvature | **Learned** by a neural network |
| **Direction** | Can only **add** points to the region | Can **add** AND **remove** points ("morphing") |
| **Class knowledge** | Often class-specific or needs tuning | **Class-agnostic** β€” works on any object |
| **Recovery** | None β€” one bad seed ruins the segment | Network learns to **recover** from mistakes |

### Step-by-step pipeline

```
Raw point cloud (.ply / .pcd)
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Voxel equalization β”‚  ← keep 1 point per voxel (default 0.1 m)
β”‚  (removes density   β”‚     prevents oversampled regions from dominating
β”‚   bias)               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Normal + curvature  β”‚  ← PCA on 3Γ—3Γ—3 voxel neighborhood
β”‚ estimation          β”‚     flat areas β†’ low curvature (good seeds)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Build 13-D feature  β”‚  ← [x,y,z, room_x, room_y, room_z,
β”‚ vector per point    β”‚      r,g,b, nx,ny,nz, curvature]
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Region growing loop β”‚  ← for each seed (sorted by curvature):
β”‚                     β”‚     1. Find boundary neighbors (6-connected voxels)
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚     2. Sample 512 inliers + 512 neighbors
β”‚   β”‚   LrgNet      β”‚ β”‚     3. Center coordinates (translation invariance)
β”‚   β”‚  dual-branch  β”‚ β”‚     4. Forward pass β†’ add logits + remove logits
β”‚   β”‚  1D PointNet  β”‚ β”‚     5. Stochastic decision: accept / reject points
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚     6. Repeat until "stuck" or max steps
β”‚                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
   Colored PLY output
```
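
The first two boxes of the pipeline can be sketched with plain numpy. This is an illustration of the idea, not the repo's exact `preprocess.py` code; in particular, the k-NN PCA here stands in for the 3×3×3 voxel-window PCA:

```python
import numpy as np

def voxel_equalize(points, resolution=0.1):
    """Keep one representative point per occupied voxel to remove density bias."""
    keys = np.floor(points / resolution).astype(np.int64)
    # np.unique on rows gives the first index into each occupied voxel
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

def normal_and_curvature(points, k=8):
    """Per-point normal + curvature via PCA over k nearest neighbors."""
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    nn = np.argsort(dists, axis=1)[:, :k]
    normals = np.zeros_like(points)
    curvature = np.zeros(len(points))
    for i, idx in enumerate(nn):
        nbrs = points[idx] - points[idx].mean(axis=0)
        cov = nbrs.T @ nbrs / len(idx)
        eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
        eigval = np.clip(eigval, 0.0, None)           # guard tiny negatives
        normals[i] = eigvec[:, 0]                     # smallest-variance direction
        # flat areas -> lambda_0 / sum(lambda) near zero (good seeds)
        curvature[i] = eigval[0] / max(eigval.sum(), 1e-12)
    return normals, curvature

rng = np.random.default_rng(0)
pts = rng.random((200, 3))
pts_eq = voxel_equalize(pts, resolution=0.2)
n, c = normal_and_curvature(pts_eq)
```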

### Network architecture (`lrg_net.py`)

`LrgNet` is a **dual-branch 1D PointNet**:

- **Inlier branch** β€” processes the current region points.
- **Neighbor branch** β€” processes candidate boundary points.

Both branches run a stack of 1D convolutions:
```
Conv1D(13 -> 64) β†’ Conv1D(64 -> 64) β†’ Conv1D(64 -> 64)
      β†’ Conv1D(64 -> 128) β†’ Conv1D(128 -> 512)
```

After the deepest layer, a **global max-pool** extracts one feature vector per branch. Each global vector is then tiled back to the per-point resolution and concatenated with the per-point features. Two classification heads then predict:

1. **`add_head`** β€” for each neighbor point: probability it should join the region.
2. **`remove_head`** β€” for each inlier point: probability it should leave the region.

Both heads use binary cross-entropy loss and are trained jointly.

> **Why two heads?**  
> Classical region growing is monotonic β€” it can only expand. LRGNet can **shrink** the region too. This is critical for recovering from an early bad seed or an over-segmentation.
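
A compressed sketch of this architecture in PyTorch (illustrative only; the exact layer names and head layout in `lrg_net.py` may differ):

```python
import torch
import torch.nn as nn

class TinyLrgNet(nn.Module):
    """Minimal dual-branch 1D-PointNet sketch: two independent conv stacks,
    global max-pool per branch, tile-and-concat, then per-point heads."""
    def __init__(self, in_ch=13):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(in_ch, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 64, 1), nn.ReLU(),
                nn.Conv1d(64, 128, 1), nn.ReLU(),
                nn.Conv1d(128, 512, 1), nn.ReLU(),
            )
        self.inlier_branch, self.neighbor_branch = branch(), branch()
        # per-point features (512) + both global vectors (512 + 512) -> 1 logit
        self.add_head = nn.Conv1d(512 * 3, 1, 1)      # on neighbor points
        self.remove_head = nn.Conv1d(512 * 3, 1, 1)   # on inlier points

    def forward(self, inliers, neighbors):            # (B, 13, N) each
        fi = self.inlier_branch(inliers)              # (B, 512, Ni)
        fn = self.neighbor_branch(neighbors)          # (B, 512, Nn)
        gi = fi.max(dim=2, keepdim=True).values       # global inlier vector
        gn = fn.max(dim=2, keepdim=True).values       # global neighbor vector
        ctx_n = torch.cat([fn, gi.expand_as(fn), gn.expand_as(fn)], dim=1)
        ctx_i = torch.cat([fi, gi.expand_as(fi), gn.expand_as(fi)], dim=1)
        return self.add_head(ctx_n).squeeze(1), self.remove_head(ctx_i).squeeze(1)

net = TinyLrgNet()
add_logits, remove_logits = net(torch.randn(2, 13, 512), torch.randn(2, 13, 512))
```

Training would then apply `nn.BCEWithLogitsLoss` to both logit tensors against the binary add/remove labels, matching the joint loss described above.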

### Training data generation (`stage_data.py`)

Training is **fully supervised** from ground-truth instance labels:

1. For each object instance, pick `k` random seeds inside it.
2. Simulate region growing using the true labels.
3. At every step, record:
   - the current (possibly noisy) inlier set
   - the boundary neighbor set
   - ground-truth binary labels: *add / don't add*, *remove / don't remove*
4. **Inject controlled noise** (`add_mistake_prob β‰ˆ 0.2`, `remove_mistake_prob β‰ˆ 0.2`)  
   β†’ the network sees "messy" regions and learns to clean them up.
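
Step 4 amounts to flipping a random fraction of the simulated grow decisions. A small numpy sketch of the idea (the helper name is hypothetical, not the repo's exact staging code):

```python
import numpy as np

def perturb_decisions(add_decisions, remove_decisions,
                      add_mistake_prob=0.2, remove_mistake_prob=0.2, rng=None):
    """Flip a random fraction of the simulated add/remove decisions so the
    staged regions are imperfect and the network learns to repair them."""
    rng = rng if rng is not None else np.random.default_rng(0)
    add_flip = rng.random(add_decisions.shape) < add_mistake_prob
    rem_flip = rng.random(remove_decisions.shape) < remove_mistake_prob
    return (np.where(add_flip, 1 - add_decisions, add_decisions),
            np.where(rem_flip, 1 - remove_decisions, remove_decisions))

add_gt = np.ones(1000, dtype=np.int64)    # true decision: add every neighbor
rem_gt = np.zeros(1000, dtype=np.int64)   # true decision: remove nothing
noisy_add, noisy_rem = perturb_decisions(add_gt, rem_gt)
```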

### Centering (translation invariance)

At **both** training and inference time, the XYZ coordinates of inliers and neighbors are **centered** by subtracting the median XY of the inlier set, and feature channels (RGB, normals, curvature) are centered by subtracting their median over the inlier set. This makes the network **translation invariant** β€” it only cares about local geometry and appearance, not absolute room coordinates.
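
Concretely, the centering step looks roughly like this. A simplified numpy sketch assuming the 13-D feature layout above (xyz, room_xyz, rgb, normals, curvature); the repo's `utils.py` may differ in detail:

```python
import numpy as np

def center_features(inliers, neighbors):
    """Subtract inlier-set medians so features are translation invariant.
    Both arrays are (N, 13); the same offset is applied to inliers and
    neighbors so their relative geometry is preserved."""
    offset = np.zeros(13)
    offset[:2] = np.median(inliers[:, :2], axis=0)    # median x, y of inliers
    offset[6:] = np.median(inliers[:, 6:], axis=0)    # rgb, normals, curvature
    return inliers - offset, neighbors - offset

rng = np.random.default_rng(1)
inl = rng.random((512, 13)) + 5.0    # a region far from the origin
nbr = rng.random((256, 13)) + 5.0
inl_c, nbr_c = center_features(inl, nbr)
```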

---

## Repository layout

```
learn_region_grow/
β”œβ”€β”€ __init__.py          # Package metadata
β”œβ”€β”€ io.py                # PLY / PCD loaders and savers + color maps
β”œβ”€β”€ preprocess.py        # Voxel equalization, normal/curvature, feature vector
β”œβ”€β”€ lrg_net.py           # PyTorch LrgNet model
β”œβ”€β”€ growing.py           # Region growing inference engine
β”œβ”€β”€ stage_data.py        # Training data generator from labeled clouds
β”œβ”€β”€ train.py             # PyTorch training loop
└── utils.py             # Sampling, centering, clustering metrics

scripts/
β”œβ”€β”€ segment.py           # CLI: run inference on a new point cloud
β”œβ”€β”€ train.py             # CLI: train from staged H5 files
└── stage_data.py        # CLI: generate staged H5 from labeled cloud
```

---

## Parameters & tuning tips

| Parameter | Default | Effect |
|---|---|---|
| `resolution` | `0.1` m | Voxel grid size. Smaller = more detail, slower. |
| `lite` | `0` | Model size: `0`=full, `1`=half, `2`=quarter channels. |
| `add_threshold` / `remove_threshold` | `0.5` | Confidence cutoffs for deterministic mode. |
| `stochastic` | `True` | Random sampling weighted by confidence (paper default). |
| `cluster_threshold` | `10` | Minimum points to keep a segment. |
| `seeds_per_instance` | `5` | Random seeds per object during training staging. |
| `add_mistake_prob` | `0.2` | Noise injected into "add" labels during staging. |
| `remove_mistake_prob` | `0.2` | Noise injected into "remove" labels during staging. |
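
The difference between `stochastic` mode and the deterministic thresholds can be sketched as follows (illustrative numpy, not the repo's exact `growing.py` code):

```python
import numpy as np

def decide_add(add_logits, stochastic=True, add_threshold=0.5, rng=None):
    """Turn per-neighbor 'add' logits into a boolean accept mask."""
    rng = rng if rng is not None else np.random.default_rng(0)
    probs = 1.0 / (1.0 + np.exp(-add_logits))    # sigmoid -> probabilities
    if stochastic:
        # paper default: sample each point with probability = its confidence
        return rng.random(probs.shape) < probs
    return probs > add_threshold                  # deterministic cutoff

logits = np.array([-4.0, -0.1, 0.1, 4.0])
det = decide_add(logits, stochastic=False)   # only confident points pass
sto = decide_add(logits)                     # low-confidence points may still pass
```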

### When to change `resolution`
- **Indoor scenes (S3DIS, ScanNet)**: `0.1` m works well.
- **Outdoor LiDAR (Semantic KITTI)**: use `0.3` m because objects are larger and sparser.
- **Small objects / high detail**: try `0.05` m.

### When to use `lite`
- `lite=0`: best accuracy, use on GPU.
- `lite=1` or `2`: faster inference, lower memory, slight accuracy drop.

---

## Differences from the original TensorFlow 1.x code

| Aspect | Original | This port |
|---|---|---|
| Framework | TensorFlow 1.x (`tf.compat.v1`) | **PyTorch** 2.x |
| Input format | H5 dumps from S3DIS / ScanNet | **Any PLY or PCD** |
| Model branches | Shared conv weights | Independent branches (clearer, no leakage) |
| Normals | 3Γ—3Γ—3 voxel window | Same, with k-NN fallback for sparse clouds |
| Seed ordering | Curvature ascending | Same |
| Loss | TF cross-entropy | PyTorch `BCEWithLogitsLoss` |
| Output | Colored PLY only | PLY + PCD, with label IDs |

---

## Citation

If you use this code, please cite the original paper:

```bibtex
@article{chen2021lrgnet,
  title={LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation},
  author={Chen, Jingdao and Kira, Zsolt and Cho, Yong K},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={5205--5212},
  year={2021},
  publisher={IEEE}
}
```

---

## License

This port is released under the MIT license. The original repository is available at [jingdao/learn_region_grow](https://github.com/jingdao/learn_region_grow).