---
title: Civic Pulse Crowd Counting
emoji: 👥
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# Civic Pulse — Tactical Crowd Intelligence

A full-stack AI drone monitoring dashboard built on **P2PNet** (ICCV 2021).
FastAPI backend + React/Vite frontend with real-time WebSocket video streaming.

## 🚀 Live Deployment (Free Tier)

| Component | Platform | URL |
|-----------|----------|-----|
| **Frontend** | Vercel | `https://crowd-counting.vercel.app` |
| **Backend API** | FastAPI (Docker/HuggingFace Spaces) | `Set your deployed backend URL in frontend/.env.production` |
| **Model Weights** | HuggingFace Hub | `Set HF_WEIGHTS_REPO to your deployed weights repo` |

> ⚠️ The HF Space may sleep after 15 min of inactivity. Open the app 30 seconds before your demo.
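
If you fork this deployment, the frontend needs the backend URL at build time. Below is a minimal sketch of `frontend/.env.production`; the variable name is a placeholder (Vite only exposes variables prefixed with `VITE_`), so check the frontend code for the key it actually reads:

```
# Placeholder key -- verify the actual variable name used in the frontend code
VITE_API_BASE_URL=https://your-backend-url.example.com
```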

## ⚡ Quick Local Setup

```bash
# Backend
pip install -r requirements.txt
uvicorn api:app --reload

# Frontend (in another terminal)
cd frontend
npm install
npm run dev
```

---

# P2PNet (ICCV2021 Oral Presentation)

This repository contains codes for the official implementation in PyTorch of **P2PNet** as described in [Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework](https://arxiv.org/abs/2107.12746).
 
A brief introduction of P2PNet can be found at [机器之心 (almosthuman)](https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650827826&idx=3&sn=edd3d66444130fb34a59d08fab618a9e&chksm=84e5a84cb392215a005a3b3424f20a9d24dc525dcd933960035bf4b6aa740191b5ecb2b7b161&mpshare=1&scene=1&srcid=1004YEOC7HC9daYRYeUio7Xn&sharer_sharetime=1633675738338&sharer_shareid=7d375dccd3b2f9eec5f8b27ee7c04883&version=3.1.16.5505&platform=win#rd).

The code is tested with PyTorch 1.5.0; it may not run with other versions.

## Visualized demos for P2PNet
<img src="vis/congested1.png" width="1000"/>   
<img src="vis/congested2.png" width="1000"/> 
<img src="vis/congested3.png" width="1000"/> 

## The network
The overall architecture of P2PNet. Built upon VGG16, it first introduces an upsampling path to obtain a fine-grained feature map.
Then it exploits two branches to simultaneously predict a set of point proposals and their confidence scores.

<img src="vis/net.png" width="1000"/>   

## Comparison with state-of-the-art methods
P2PNet achieves state-of-the-art performance on several challenging datasets with varying densities.

| Methods   | Venue     | SHTechPartA <br> MAE/MSE  |SHTechPartB <br> MAE/MSE | UCF_CC_50 <br> MAE/MSE | UCF_QNRF <br> MAE/MSE   |
|:----:|:----:|:----:|:----:|:----:|:----:|
CAN  | CVPR'19 | 62.3/100.0 | 7.8/12.2 | 212.2/**243.7** | 107.0/183.0 |
Bayesian+ | ICCV'19 | 62.8/101.8 | 7.7/12.7 | 229.3/308.2 | 88.7/154.8 |
S-DCNet  | ICCV'19 | 58.3/95.0 | 6.7/10.7 | 204.2/301.3 | 104.4/176.1 |
SANet+SPANet  | ICCV'19 | 59.4/92.5 | 6.5/**9.9** | 232.6/311.7 | -/- |
DUBNet  | AAAI'20 | 64.6/106.8 | 7.7/12.5 | 243.8/329.3 | 105.6/180.5 |
SDANet | AAAI'20 | 63.6/101.8 | 7.8/10.2 | 227.6/316.4 | -/- |
ADSCNet | CVPR'20 | <u>55.4</u>/97.7 | <u>6.4</u>/11.3 | 198.4/267.3 | **71.3**/**132.5**|
ASNet   | CVPR'20 | 57.78/<u>90.13</u> | -/- | <u>174.84</u>/<u>251.63</u> | 91.59/159.71 |
AMRNet  | ECCV'20 | 61.59/98.36 | 7.02/11.00 | 184.0/265.8 | 86.6/152.2 |
AMSNet  | ECCV'20 | 56.7/93.4 | 6.7/10.2 | 208.4/297.3 | 101.8/163.2|
DM-Count  | NeurIPS'20 | 59.7/95.7 | 7.4/11.8 | 211.0/291.5 | 85.6/<u>148.3</u>|
**Ours** |- | **52.74**/**85.06** | **6.25**/**9.9** | **172.72**/256.18 | <u>85.32</u>/154.5 |

Comparison on the [NWPU-Crowd](https://www.crowdbenchmark.com/resultdetail.html?rid=81) dataset.

| Methods   | MAE[O]  |MSE[O] | MAE[L] | MAE[S]   |
|:----:|:----:|:----:|:----:|:----:|
MCNN  | 232.5|714.6 | 220.9|1171.9 |
SANet  | 190.6 | 491.4 | 153.8 | 716.3|
CSRNet | 121.3 | 387.8 | 112.0 | <u>522.7</u> |
PCC-Net  | 112.3 | 457.0 | 111.0 | 777.6 |
CANNet  | 110.0 | 495.3 | 102.3 | 718.3|
Bayesian+  | 105.4 | 454.2 | 115.8 | 750.5 |
S-DCNet   | 90.2 | 370.5 | **82.9** | 567.8 |
DM-Count  | <u>88.4</u> | 388.6 | 88.0 | **498.0** |
**Ours** | **77.44**|**362** | <u>83.28</u>| 553.92 |

The overall performance for both counting and localization.

|nAP$_{\delta}$|SHTechPartA| SHTechPartB | UCF_CC_50 | UCF_QNRF | NWPU_Crowd |
|:----:|:----:|:----:|:----:|:----:|:----:|    
$\delta=0.05$ | 10.9\% | 23.8\%  | 5.0\% | 5.9\% | 12.9\% | 
$\delta=0.25$ | 70.3\% | 84.2\%  | 54.5\% | 55.4\% | 71.3\% |  
$\delta=0.50$ | 90.1\% | 94.1\%  | 88.1\% | 83.2\% | 89.1\% | 
$\delta=\{{0.05:0.05:0.50}\}$ | 64.4\% | 76.3\%  | 54.3\% | 53.1\% | 65.0\% |  

Comparison for the localization performance in terms of F1-Measure on NWPU.

| Method| F1-Measure |Precision| Recall |
|:----:|:----:|:----:|:----:|
FasterRCNN  |  0.068 |  0.958 | 0.035 |
TinyFaces |  0.567  |  0.529 | 0.611 |
RAZ |   0.599 |  0.666 |  0.543|
Crowd-SDNet |  0.637  | 0.651  | 0.624  |
PDRNet |  0.653 | 0.675  | 0.633  |
TopoCount | 0.692  | 0.683  | **0.701** |
D2CNet | <u>0.700</u> | **0.741**  | 0.662 |
**Ours** |**0.712** | <u>0.729</u>  | <u>0.695</u> |

## Installation
* Clone this repo into a directory named P2PNET_ROOT
* Organize your datasets as required
* Install Python dependencies. We use Python 3.6.5 and PyTorch 1.5.0:
```
pip install -r requirements.txt
```

## Organize the counting dataset
We use a list file to collect all the images and their ground-truth annotations in a counting dataset. When your dataset is organized as recommended below, the format of this list file is:
```
train/scene01/img01.jpg train/scene01/img01.txt
train/scene01/img02.jpg train/scene01/img02.txt
...
train/scene02/img01.jpg train/scene02/img01.txt
```

### Dataset structures:
```
DATA_ROOT/
        |->train/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->test/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->train.list
        |->test.list
```
DATA_ROOT is your path containing the counting datasets.
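
A small helper like the following (not part of the repository; paths and the `.jpg` extension are assumptions) can generate `train.list` and `test.list` from a dataset laid out this way:

```python
from pathlib import Path

def build_list_file(data_root: str, split: str = "train") -> None:
    """Write DATA_ROOT/<split>.list pairing each image with its annotation txt."""
    root = Path(data_root)
    lines = []
    for img_path in sorted((root / split).glob("*/*.jpg")):
        ann_path = img_path.with_suffix(".txt")
        if ann_path.exists():
            # Paths are stored relative to DATA_ROOT, matching the format shown above.
            lines.append(f"{img_path.relative_to(root)} {ann_path.relative_to(root)}")
    (root / f"{split}.list").write_text("\n".join(lines) + "\n")

# build_list_file("/path/to/DATA_ROOT", "train")
# build_list_file("/path/to/DATA_ROOT", "test")
```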

### Annotations format
For the annotations of each image, we use a single txt file that contains one annotation per line. Note that pixel indexing starts at 0. The expected format of each line is:
```
x1 y1
x2 y2
...
```
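
For reference, loading such a file into an array of (x, y) points is straightforward; this is an illustrative sketch, not code from the repository:

```python
import numpy as np

def load_points(ann_path: str) -> np.ndarray:
    """Read one annotation txt into an (N, 2) array of 0-indexed (x, y) points."""
    points = []
    with open(ann_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            x, y = map(float, line.split()[:2])
            points.append((x, y))
    return np.asarray(points, dtype=np.float32)

# count = len(load_points("train/scene01/img01.txt"))
```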

## Training

The network can be trained using the `train.py` script. For training on SHTechPartA, use

```
CUDA_VISIBLE_DEVICES=0 python train.py --data_root $DATA_ROOT \
    --dataset_file SHHA \
    --epochs 3500 \
    --lr_drop 3500 \
    --output_dir ./logs \
    --checkpoints_dir ./weights \
    --tensorboard_dir ./logs \
    --lr 0.0001 \
    --lr_backbone 0.00001 \
    --batch_size 8 \
    --eval_freq 1 \
    --gpu_id 0
```
By default, a periodic evaluation will be conducted on the validation set.

## Testing

A trained model on SHTechPartA (MAE **51.96**) is available in `./weights`. Run the following command to launch a visualization demo:

```
CUDA_VISIBLE_DEVICES=0 python run_test.py --weight_path ./weights/SHTechA.pth --output_dir ./logs/
```

## Civic Pulse Application

The supported application stack in this repository is:

- FastAPI backend in `api.py`
- React/Vite frontend in `frontend/`

The application loads `weights/SHTechA.pth` by default.

For this application, use the pretrained P2PNet weights directly for inference. Manual point-labeling and one-image fine-tuning are not part of the recommended workflow: a single hand-labeled image is far too small and unstable a training set to improve model quality.
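
As a starting point for scripting against the backend, a client might look like the sketch below. The endpoint path, form field name, and response shape are assumptions for illustration; check `api.py` for the actual routes.

```python
import requests

API_BASE = "http://localhost:8000"  # or your deployed backend URL

def count_crowd(image_path: str) -> dict:
    """POST an image to a hypothetical /predict endpoint and return the JSON result."""
    with open(image_path, "rb") as f:
        resp = requests.post(f"{API_BASE}/predict", files={"file": f}, timeout=60)
    resp.raise_for_status()
    return resp.json()  # assumed to contain a crowd count and predicted points

# print(count_crowd("vis/congested1.png"))
```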

## Acknowledgements

- Part of codes are borrowed from the [C^3 Framework](https://github.com/gjy3035/C-3-Framework).
- We refer to [DETR](https://github.com/facebookresearch/detr) to implement our matching strategy.


## Citing P2PNet

If you find P2PNet useful in your project, please consider citing us:

```BibTeX
@inproceedings{song2021rethinking,
  title={Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework},
  author={Song, Qingyu and Wang, Changan and Jiang, Zhengkai and Wang, Yabiao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Wu, Yang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
```

## Related works from Tencent Youtu Lab
- [AAAI2021] To Choose or to Fuse? Scale Selection for Crowd Counting. ([paper link](https://ojs.aaai.org/index.php/AAAI/article/view/16360) & [codes](https://github.com/TencentYoutuResearch/CrowdCounting-SASNet))
- [ICCV2021] Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting. ([paper link](https://arxiv.org/abs/2107.12619) & [codes](https://github.com/TencentYoutuResearch/CrowdCounting-UEPNet))