---
title: Mic-ID
emoji: "πŸŽ™οΈ"
colorFrom: red
colorTo: purple
sdk: streamlit
sdk_version: "1.31.1"
python_version: "3.10"
app_file: app.py
pinned: false
license: mit
---

# Mic-ID

![Streamlit](https://img.shields.io/badge/Streamlit-FF4B4B?style=flat&logo=streamlit&logoColor=white)
![Python](https://img.shields.io/badge/Python-3776AB?style=flat&logo=python&logoColor=white)

A Streamlit front-end around a microphone fingerprinting baseline: drop in a short clip, get the most likely capture device plus an optional tonal hint. πŸŽ™οΈ Built for quick lab demos, and for showing how far classic features still go.

## Table of Contents

- [Highlights](#highlights)
- [Live Demo Flow](#live-demo-flow)
- [Quick Start](#quick-start)
- [Hugging Face Space Setup](#hugging-face-space-setup)
- [Controls at a Glance](#controls-at-a-glance)
- [Device Recognition](#device-recognition)
- [Scale Detection](#scale-detection)
- [Bundled Example Clips](#bundled-example-clips)
- [Download Contents](#download-contents)
- [Testing](#testing)
- [Project Layout](#project-layout)
- [Roadmap](#roadmap)
- [Contributing](#contributing)

## Highlights

- πŸ”Ž End-to-end workflow for collecting, training, and demoing mic classification in one repo.
- πŸŽ›οΈ Feature-first approach: log-mel, MFCC, and spectral stats feed a histogram gradient boosting model.
- 🧠 Friendly predictions: class IDs map to real device names so you can narrate results without decoding labels.
- πŸ—‚οΈ Lightweight artefacts: plain `.wav` folders in `data/`, pickled models in `models/`, metrics and confusion heatmaps in `reports/`.
- βš™οΈ Streamlit UI mirrors the CLI helpers, including loudness normalisation and experimental scale read-outs.

## Live Demo Flow

If you are running a live session, keep this script handy:

- 🎧 `streamlit run app.py` from the project root.
- πŸ“‚ Use `data/audio/airport-helsinki-204-6138-a.wav` to introduce the core upload flow and the default top-3 guess list.
- πŸ”„ Swap to `data/audio/airport-helsinki-204-6138-b.wav` or `data/audio/airport-helsinki-204-6138-c.wav` to highlight how the twin scene shifts the predicted device while the environment stays constant.
- πŸ“± Jump to `data/iphone/clip_05.wav` to show the locally recorded class and talk about adding in-house gear with `utils.py`.
- πŸ“Š Mention the probability bar chart and the saved copy under `uploads/` for later analysis.

## Quick Start

⚑ Four commands set everything up (the retrain step is optional):

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 scripts/refresh_metadata.py          # rebuild hashes + provenance records
python3 train.py --config configs/base.yaml  # optional if you want to refresh the model
```

Then launch the app with `streamlit run app.py` (defaults to http://localhost:8501).

## Hugging Face Space Setup

Want a hosted demo? This repo is ready to drop into a Hugging Face Space using the Streamlit SDK. The short version:

1. `pip install -U "huggingface_hub[cli]"` and run `huggingface-cli login` with a write-scoped access token.
2. `git clone` your Space (for example `https://huggingface.co/spaces/connaaa/mic-id`) into an empty folder.
3. Copy the contents of this repository into that clone, keeping `README.md`, `app.py`, `requirements.txt`, `packages.txt`, `models/`, and the curated `data/` subsets you want online.
4. Commit and `git push`. The Space will build the dependencies listed in `requirements.txt` plus Debian packages from `packages.txt`. Large training corpora can be trimmed before pushing if you only need the pretrained model for inference.
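If you prefer the Python API to the git workflow above, `huggingface_hub` can create and populate the same Space directly. A minimal sketch, assuming the example Space ID `connaaa/mic-id` from step 2, a write token already configured via `huggingface-cli login`, and ignore patterns you should adapt to whatever corpora you trim:

```python
# sketch_space_upload.py - push the demo files to a Hugging Face Space.
# Assumes `huggingface-cli login` has already stored a write token and
# that the Space ID below (from step 2) is one you control.
from huggingface_hub import HfApi

api = HfApi()

# Create the Space if it does not exist yet (no-op when it already does).
api.create_repo(
    repo_id="connaaa/mic-id",  # swap in your own namespace/space
    repo_type="space",
    space_sdk="streamlit",
    exist_ok=True,
)

# Upload the working tree; skip heavy training corpora per step 4.
api.upload_folder(
    folder_path=".",
    repo_id="connaaa/mic-id",
    repo_type="space",
    ignore_patterns=[".venv/**", "uploads/**", "data/audio*/**"],
)
```

Either route ends the same way: on push, the Space rebuilds from `requirements.txt` and `packages.txt`.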
## Controls at a Glance

| Control | Default | What it does |
| --- | --- | --- |
| File uploader | – | Accepts WAV/MP3/M4A, converts to 16 kHz mono, and normalises loudness before scoring. |
| `How many guesses should we list?` slider | 3 | Sets the length of the ranked prediction list and bar chart. |
| Training data expander | Collapsed | Recaps which datasets went into the current checkpoint, handy during demos. |
| Prediction pane | Auto | Shows the tonal estimate (if any), RMS loudness, ranked devices, and probability chart. |

Each control includes inline help text so presenters can improvise without notes.

## Device Recognition

- 🧱 Audio flows through `features.extract_features`, stitching log-mel and MFCC statistics with zero-crossing, centroid, roll-off, and flatness cues.
- 🌲 `python3 train.py --config configs/base.yaml` reads the provenance metadata, enforces per-device clip minimums, and fits a `HistGradientBoostingClassifier` before saving artefacts to `models/model.pkl` plus the label encoder.
- πŸ“ˆ Every training run exports `reports/metrics.json`, `reports/confusion_matrix.png`, and a timestamped `reports/runs/run-*.json` snapshot so you can cite precision/recall live.
- 🏷️ The app and CLI surface friendly names (e.g. β€œZoom F8 field recorder”) pulled from `devices.describe_label()` to keep the story human-readable.

## Scale Detection

- 🎼 Uses a simple `librosa` chroma profile match across all major/minor keys (a minimal sketch appears after the Download Contents section below).
- βœ… High confidence (β‰₯ 0.6) renders a green highlight, 0.4–0.6 shows an amber β€œlow confidence” tag, and anything lower hides the scale suggestion entirely.
- πŸ₯ Purely percussive or noisy clips skip the tonal hint, which is exactly what you want for location recordings.

## Bundled Example Clips

All sample audio lives under `data/` and mirrors the device IDs referenced in the demo.

| Folder | What it represents | Count* |
| --- | --- | --- |
| `audio/` | TAU Urban Acoustic Scenes clips (device A) – Zoom F8 field recorder | 295 |
| `audio2/` | TAU Urban Acoustic Scenes clips (device B) – Samsung Galaxy S7 | 295 |
| `audio9/` | TAU Urban Acoustic Scenes clips (device C) – iPhone SE | 295 |
| `iphone/` | Locally recorded iPhone speech snippets captured with `utils.py` | 4 |
| `laptop/` | MacBook built-in mic samples recorded in a treated room | 4 |
| `outtakes/` | Extra captures you can promote into training data after curation | varies |

\*Counts based on the current repo snapshot; refresh `data/` to rebalance as needed.

## Download Contents

Every run generates artefacts you can drop into a slide deck or share with collaborators:

- 🎯 `models/model.pkl` and `models/label_encoder.pkl` store the trained classifier and label map.
- πŸ“Š `reports/metrics.json` plus `reports/confusion_matrix.png` capture evaluation snapshots for the latest training session.
- 🧾 `data/metadata.csv` tracks every clip’s provenance, licence, and hash for reproducible retrains.
- πŸ—‚οΈ `reports/runs/run-*.json` snapshots record the exact config, dataset summary, and hashes used for each training run.
- πŸ“ Uploaded clips are preserved under `uploads/` so you can replay or re-label them later.
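To reuse the artefacts listed above outside the app, the pickled classifier and label encoder load directly with `pickle`. A minimal sketch, assuming `features.extract_features(path)` returns a flat feature vector for one clip and that the model was trained on encoder-transformed labels (check `features.py` and `train.py` for the real signatures):

```python
# sketch_score_clip.py - score one clip with the saved artefacts.
# Assumes features.extract_features(path) -> 1-D numpy array and that
# model.classes_ are the label-encoder integers; verify against the repo.
import pickle

import numpy as np

from features import extract_features
from devices import describe_label

with open("models/model.pkl", "rb") as f:
    model = pickle.load(f)
with open("models/label_encoder.pkl", "rb") as f:
    encoder = pickle.load(f)

feats = extract_features("data/laptop/clip_01.wav").reshape(1, -1)
probs = model.predict_proba(feats)[0]

# Rank devices the same way the app's top-k list does.
for idx in np.argsort(probs)[::-1][:3]:
    label = encoder.inverse_transform([idx])[0]
    print(f"{describe_label(label)}: {probs[idx]:.2%}")
```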
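And for the chroma profile match referenced under Scale Detection, the core idea fits in a few lines: average the clip's chroma over time, rotate it through all twelve roots, and correlate against major/minor templates. A minimal sketch, assuming Krumhansl-style profiles as stand-ins for whatever templates `app.py` actually uses:

```python
# sketch_scale_hint.py - rough key estimate from a chroma profile match.
# The Krumhansl profiles below are an assumption; app.py may use others.
import librosa
import numpy as np

MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

y, sr = librosa.load("data/audio/airport-helsinki-204-6138-a.wav", sr=16000)
chroma = librosa.feature.chroma_cqt(y=y, sr=sr).mean(axis=1)

best = ("", -1.0)
for shift in range(12):
    # Rotate so the candidate root sits at index 0, then score both modes.
    rotated = np.roll(chroma, -shift)
    for mode, profile in (("major", MAJOR), ("minor", MINOR)):
        score = np.corrcoef(rotated, profile)[0, 1]
        if score > best[1]:
            best = (f"{NOTES[shift]} {mode}", score)

print(f"Scale hint: {best[0]} (confidence {best[1]:.2f})")
```

A correlation score like this is the kind of confidence value the green/amber thresholds above would gate.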
## Testing

Quick smoke checks live in the scripts themselves:

```bash
# Validate provenance without training
python3 train.py --dry-run

# Rebuild the model, metrics, and run snapshot
python3 train.py --config configs/base.yaml

# Score a few clips and verify probabilities look sane
python3 predict.py data/laptop/clip_01.wav data/iphone/clip_05.wav --topk 5
```

For deeper regression coverage, wire these commands into your CI and compare the resulting metrics JSON against previous baselines (a minimal comparison sketch lives in the appendix at the end of this README).

## Project Layout

```
mic-id/
β”œβ”€ app.py          # Streamlit UI for uploading and scoring clips
β”œβ”€ predict.py      # CLI scorer with friendly device names
β”œβ”€ train.py        # Dataset loader, model trainer, metric exporter
β”œβ”€ configs/        # YAML training configs + device provenance defaults
β”œβ”€ features.py     # Audio feature extraction helpers
β”œβ”€ utils.py        # Command-line recorder for new device samples
β”œβ”€ data/           # Per-device waveforms and provenance metadata
β”‚  └─ metadata.csv # Clip-level provenance (source/licence/hash)
β”œβ”€ models/         # Saved classifier + label encoder
β”œβ”€ reports/        # Metrics JSON and confusion matrix plots
β”œβ”€ docs/           # Data sourcing guide and prep notes
β”œβ”€ scripts/        # Dataset preparation helpers (TAU, Freesound, etc.)
└─ uploads/        # Cached demo uploads saved by the Streamlit app
```

## Roadmap

- πŸ›°οΈ Add a lightweight CNN baseline alongside the gradient boosting model for comparison.
- πŸ§ͺ Ship augmentation scripts (noise, EQ, impulse responses) to spotlight microphone colouration differences.
- πŸ” Wire metadata/hash validation into CI so new clips are rejected unless provenance is complete.
- πŸ“¦ Polish export helpers so the app can bundle probabilities + features in one download.

## Contributing

Issues and pull requests are welcome. 🀝 If you contribute new devices, include a short note (or a `metadata.csv` entry) describing the capture setup so others can reproduce your results and audit licensing.
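## Appendix: Metrics Baseline Check

As promised in the Testing section, the baseline comparison can be as small as diffing the exported metrics against a committed snapshot. A minimal sketch, assuming a hypothetical `reports/baseline_metrics.json` copied from an earlier run and flat numeric fields in `reports/metrics.json` (adapt to the file's real schema):

```python
# sketch_compare_metrics.py - fail CI when metrics regress past a tolerance.
# reports/baseline_metrics.json is an assumed snapshot you committed earlier;
# both files are assumed to share flat numeric fields.
import json
import sys

TOLERANCE = 0.02  # allow 2-point drops before failing

with open("reports/baseline_metrics.json") as f:
    baseline = json.load(f)
with open("reports/metrics.json") as f:
    current = json.load(f)

failed = False
for key, old in baseline.items():
    new = current.get(key)
    if isinstance(old, (int, float)) and isinstance(new, (int, float)):
        if new < old - TOLERANCE:
            print(f"REGRESSION {key}: {old:.3f} -> {new:.3f}")
            failed = True

sys.exit(1 if failed else 0)
```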