title: CUD Traffic AI
emoji: 🚦
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
CUD - AAI - Midterm Project - Traffic Incident Summarization
This repo compares extractive and abstractive summarization methods for traffic incident reports and ships with a polished React + FastAPI demo.
What is included
- U.S. Accidents ingestion with automatic Kaggle download support
- GCC regional track with bundled Dubai, Abu Dhabi, and UAE federal sample datasets
- Rule-based GCC narrative generation so structured GCC records become natural-language incident reports
- Baselines: Lead-1 and TextRank
- Abstractive models: BART, Flan-T5, optional PEGASUS
- Evaluation pipeline, notebooks, LaTeX paper draft, poster content, and a demo UI
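As a rough illustration of the two extractive baselines listed above, here is a minimal, dependency-free sketch. It is not the repo's implementation: the function names (`split_sentences`, `lead_1`, `textrank_top`), the naive regex sentence splitter, and the sample report are all assumptions made for this example. Lead-1 returns the first sentence; the TextRank variant scores sentences by word overlap and returns the highest-ranked one.

```python
import math
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_1(text: str) -> str:
    """Lead-1 baseline: the first sentence is the summary."""
    sentences = split_sentences(text)
    return sentences[0] if sentences else ""


def _overlap(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity normalized by log sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def textrank_top(text: str, damping: float = 0.85, iterations: int = 50) -> str:
    """Return the single highest-ranked sentence under a TextRank-style score."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    tokens = [set(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    n = len(sentences)
    # Symmetric similarity graph with no self-loops.
    weights = [
        [_overlap(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    # Fixed number of power-iteration steps over the weighted graph.
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    best = max(range(n), key=lambda i: scores[i])
    return sentences[best]


# Made-up report text, used only to exercise the functions.
report = (
    "A multi-vehicle collision blocked two lanes on the northbound highway. "
    "Two lanes remain blocked while crews clear the collision. "
    "Expect delays."
)
print("Lead-1:  ", lead_1(report))
print("TextRank:", textrank_top(report))
```

Lead-1 tends to be a strong baseline for incident reports because the opening sentence usually states what happened and where; the graph-based ranking matters more when the key detail appears later in the text.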
GCC data note
The repo includes official source references for Dubai Pulse, Abu Dhabi Open Data, and UAE federal traffic statistics, along with normalized, bundled sample files so the project runs offline without extra setup. This is a practical compromise: public GCC portals often expose structured records, JavaScript-only dashboards, or gated exports rather than narrative text that can be bundled directly.
In the paper, describe the GCC track like this:
Structured GCC traffic records were normalized into a common schema and converted into operator-style narrative incident descriptions using a rule-based text generator. Official source references were retained for provenance, while bundled sample extracts were used to make the demo reproducible offline.
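To make the idea of rule-based narrative generation concrete, the sketch below turns one structured record into an operator-style sentence. The schema is hypothetical: the `IncidentRecord` fields, the time-of-day buckets, the sentence template, and the sample values are assumptions for illustration, not the repo's actual columns, rules, or data.

```python
from dataclasses import dataclass


@dataclass
class IncidentRecord:
    # Hypothetical normalized schema; the real column names may differ.
    emirate: str
    road: str
    incident_type: str
    severity: str
    vehicles: int
    hour: int  # 0-23, local time


def _time_of_day(hour: int) -> str:
    """Map an hour to a coarse label (bucket boundaries are an assumption)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 21:
        return "evening"
    return "night"


def to_narrative(rec: IncidentRecord) -> str:
    """Fill a fixed sentence template from the structured fields."""
    vehicle_phrase = "one vehicle" if rec.vehicles == 1 else f"{rec.vehicles} vehicles"
    return (
        f"A {rec.severity} {rec.incident_type} involving {vehicle_phrase} "
        f"was reported on {rec.road} in {rec.emirate} "
        f"during the {_time_of_day(rec.hour)}."
    )


# Illustrative values only, not taken from the bundled datasets.
example = IncidentRecord("Dubai", "Sheikh Zayed Road", "collision", "minor", 2, 8)
print(to_narrative(example))
```

A template like this keeps the generated text fully traceable to the source fields, which is useful when the paper needs to state exactly how narratives were derived.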
Quick start
1. Python environment
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
2. Prepare data
python -m src.cli.run_prepare --source both
Behavior:
- If data/raw/US_Accidents_March23.csv is missing, the script attempts a Kaggle download.
- GCC sample sources are already bundled.
- A combined corpus is written to data/interim/combined_incident_corpus.csv.
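The "download only if missing" behavior described above can be sketched as follows. This is not the code in src.cli.run_prepare: the helper name `ensure_us_accidents` and the injected `download` callable are hypothetical, and the actual Kaggle call is deliberately left out rather than guessed.

```python
from pathlib import Path
from typing import Callable

US_RAW = Path("data/raw/US_Accidents_March23.csv")


def ensure_us_accidents(path: Path, download: Callable[[Path], None]) -> bool:
    """Make sure the raw CSV exists; return True if a download was attempted.

    `download` is a placeholder for whatever fetches the file (for example a
    Kaggle client wrapper). It receives the destination path.
    """
    if path.exists():
        # File already present: skip the network entirely.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    download(path)
    return True
```

Passing the downloader in as a callable keeps the check testable without credentials or network access.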
3. Start backend
uvicorn backend.main:app --reload --port 8000
4. Start frontend
cd frontend
npm install
npm run dev
Demo features
- Beautiful hero dashboard for screenshots
- Dataset track toggle: US or GCC
- Sample incident picker
- Summarize and compare endpoints
- Copy and download summary cards
- Batch CSV upload preview
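For the batch CSV upload preview, one plausible approach is to parse only the first few rows of the uploaded bytes and return them with the header. The sketch below assumes that design; `preview_csv`, its return shape, and the sample column names are hypothetical and may not match the demo's backend.

```python
import csv
import io
from itertools import islice


def preview_csv(raw: bytes, max_rows: int = 5) -> dict:
    """Return the header and at most `max_rows` data rows from CSV bytes."""
    # utf-8-sig drops a leading byte-order mark if the file has one.
    text = raw.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text, newline=""))
    rows = list(islice(reader, max_rows))
    return {"columns": reader.fieldnames or [], "rows": rows}


# Made-up upload content for illustration.
sample = b"id,description\n1,Crash on ramp\n2,Stalled truck\n"
print(preview_csv(sample, max_rows=1))
```

Reading a bounded number of rows keeps the preview responsive even when the uploaded file is large.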
Important paths
- data/raw/gcc/source_manifest.csv
- data/interim/gcc_narratives.csv
- docs/paper/main.tex
- docs/poster/poster_content.md