title: CUD Traffic AI
emoji: 🚦
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false

CUD - AAI - Midterm Project - Traffic Incident Summarization

This repo compares extractive and abstractive summarization methods for traffic incident reports and ships with a polished React + FastAPI demo.

What is included

  • U.S. Accidents ingestion with automatic Kaggle download support
  • GCC regional track with bundled Dubai, Abu Dhabi, and UAE federal sample datasets
  • Rule-based GCC narrative generation so structured GCC records become natural-language incident reports
  • Baselines: Lead-1 and TextRank
  • Abstractive models: BART, Flan-T5, optional PEGASUS
  • Evaluation pipeline, notebooks, LaTeX paper draft, poster content, and a demo UI
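
For orientation, the Lead-1 baseline simply returns the first sentence of a report as its summary. The sketch below is a minimal illustration, not the repo's implementation: the function name `lead_1`, the regex-based sentence splitter, and the sample report are all assumptions made for this example.

```python
import re

# Naive sentence boundary: ".", "!" or "?" followed by whitespace.
# Assumption for illustration only; abbreviations such as "St. John"
# would be split incorrectly by this rule.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def lead_1(text: str) -> str:
    """Return the first sentence of `text` (Lead-1 extractive baseline)."""
    stripped = text.strip()
    if not stripped:
        return ""
    # Split at most once: only the leading sentence is needed.
    return _SENTENCE_END.split(stripped, maxsplit=1)[0]


# Made-up incident report, used only to show the behavior.
report = (
    "Two-vehicle collision on I-95 northbound near Exit 12. "
    "Right lane blocked. Expect delays."
)
print(lead_1(report))
```

Lead-1 is a common reference point because incident reports tend to put the key event first, so any abstractive model should at least match it.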

GCC data note

The repo includes official source references for Dubai Pulse, Abu Dhabi Open Data, and UAE federal traffic statistics, along with normalized, bundled sample files so the GCC track runs offline immediately. Bundling samples is a practical compromise: public GCC portals often expose structured records, JavaScript-only dashboards, or gated exports rather than narrative text that is ready to bundle.

In the paper, describe the GCC track like this:

Structured GCC traffic records were normalized into a common schema and converted into operator-style narrative incident descriptions using a rule-based text generator. Official source references were retained for provenance, while bundled sample extracts were used to make the demo reproducible offline.
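
A rule-based generator of this kind can be sketched as a template that fills in whichever fields a record provides. This is a hedged illustration only: the function name `record_to_narrative`, the field names (`incident_type`, `road`, `emirate`, `time`, `severity`), and the sample record are hypothetical and may not match the project's actual schema or wording.

```python
def record_to_narrative(record: dict) -> str:
    """Render one normalized record as an operator-style description.

    All field names are hypothetical; the real schema may differ.
    """
    incident = str(record.get("incident_type", "incident")).lower()
    # Pick "A" or "An" so the sentence reads naturally.
    article = "An" if incident[:1] in "aeiou" else "A"
    parts = [f"{article} {incident} was reported"]

    # Append each optional detail only when the record provides it.
    if record.get("road"):
        parts.append(f"on {record['road']}")
    if record.get("emirate"):
        parts.append(f"in {record['emirate']}")
    if record.get("time"):
        parts.append(f"at {record['time']}")

    sentence = " ".join(parts) + "."
    if record.get("severity"):
        sentence += f" Severity was recorded as {str(record['severity']).lower()}."
    return sentence


# Made-up record for demonstration; not taken from the bundled datasets.
sample = {
    "incident_type": "Collision",
    "road": "Sheikh Zayed Road",
    "emirate": "Dubai",
    "time": "08:15",
    "severity": "Minor",
}
print(record_to_narrative(sample))
```

Skipping missing fields instead of printing placeholders keeps the generated text readable when source portals publish only partial records.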

Quick start

1. Python environment

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Prepare data

python -m src.cli.run_prepare --source both

Behavior:

  • If data/raw/US_Accidents_March23.csv is missing, the script attempts a Kaggle download.
  • GCC sample sources are already bundled.
  • A combined corpus is written to data/interim/combined_incident_corpus.csv.
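
The check-then-download behavior above can be sketched as follows. This is an assumption-laden illustration rather than the code in `src.cli.run_prepare`: `ensure_raw_file` and `fake_download` are hypothetical names, and the downloader is passed in as a callable so that no real Kaggle call is made here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_raw_file(path: Path, download: Callable[[Path], None]) -> str:
    """Return "cached" if `path` exists, otherwise download it first.

    `download` is a hypothetical stand-in for whatever Kaggle client
    the project actually uses.
    """
    if path.exists():
        return "cached"
    path.parent.mkdir(parents=True, exist_ok=True)
    download(path)
    if not path.exists():
        raise FileNotFoundError(f"Download did not produce {path}")
    return "downloaded"


def fake_download(target: Path) -> None:
    # Placeholder that writes a header row instead of contacting Kaggle.
    target.write_text("ID,Description\n")


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "data" / "raw" / "US_Accidents_March23.csv"
    first = ensure_raw_file(raw, fake_download)   # file is missing
    second = ensure_raw_file(raw, fake_download)  # file now exists
print(first, second)
```

Injecting the downloader keeps the existence check testable without network access or Kaggle credentials.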

3. Start backend

uvicorn backend.main:app --reload --port 8000

4. Start frontend

cd frontend
npm install
npm run dev

Demo features

  • Hero dashboard designed for screenshots
  • Dataset track toggle: US or GCC
  • Sample incident picker
  • Summarize and compare endpoints
  • Copy and download summary cards
  • Batch CSV upload preview
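
As a rough idea of what a batch CSV preview involves, the sketch below parses uploaded CSV text and returns the column names plus the first few rows. It is not the demo's actual code: `preview_csv`, the `limit` parameter, the returned dictionary shape, and the sample columns are all assumptions for illustration.

```python
import csv
import io
from itertools import islice


def preview_csv(raw_text: str, limit: int = 5) -> dict:
    """Return column names and up to `limit` rows from CSV text.

    Hypothetical helper; the real upload handler may work differently.
    """
    reader = csv.DictReader(io.StringIO(raw_text))
    # islice stops after `limit` rows, so large uploads are not fully parsed.
    rows = list(islice(reader, limit))
    return {"columns": list(reader.fieldnames or []), "rows": rows}


# Made-up upload content used only to show the output shape.
upload = "id,description\n1,Crash on Main St\n2,Stalled truck\n3,Debris on road\n"
preview = preview_csv(upload, limit=2)
print(preview["columns"], len(preview["rows"]))
```

Capping the preview at a few rows lets the UI show a quick sanity check before any summarization model is run on the full file.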

Important paths

  • data/raw/gcc/source_manifest.csv
  • data/interim/gcc_narratives.csv
  • docs/paper/main.tex
  • docs/poster/poster_content.md