davanstrien HF Staff
BPL shelf-list NuExtract3 extraction demo (120 cards)
title: BPL Shelf-List Card Extraction
emoji: 📇
colorFrom: gray
colorTo: indigo
sdk: static
pinned: false
license: cc-by-4.0

Boston Public Library shelf-list cards → structured records

A demo of zero-shot structured extraction on scanned BPL shelf-list catalogue cards using NuExtract3 (4B, Apache-2.0), run as a single command on Hugging Face Jobs via the uv-scripts/ocr nuextract3.py script.

Each card image is paired with the JSON the model returned for a target catalogue schema (shelf_no, author, title, place, date, accession_no, …). The model also self-classifies shelf-divider cards vs bibliographic cards.
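As a rough illustration of how the returned JSON could be consumed downstream, here is a minimal Python sketch. It assumes each response is a flat JSON object; only the six field names listed above come from the demo, the rest of the schema is left out, and `card_type`, its `"divider"` value, `CardRecord`, `parse_card`, and `is_divider` are hypothetical names chosen for this example, not the script's actual output format.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CardRecord:
    # Field names taken from the schema summary above; the full schema
    # has further fields that are not reproduced here.
    shelf_no: Optional[str] = None
    author: Optional[str] = None
    title: Optional[str] = None
    place: Optional[str] = None
    date: Optional[str] = None
    accession_no: Optional[str] = None
    # Hypothetical field holding the model's self-classification.
    card_type: Optional[str] = None


def parse_card(raw_json: str) -> CardRecord:
    """Parse one model response, ignoring unknown keys.

    String values are stripped, and empty strings become None.
    """
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object for a card record")
    known = {f.name for f in fields(CardRecord)}
    cleaned = {}
    for key, value in data.items():
        if key not in known:
            continue
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key] = value
    return CardRecord(**cleaned)


def is_divider(record: CardRecord) -> bool:
    """True when the (hypothetical) card_type marks a shelf-divider card."""
    return record.card_type == "divider"


if __name__ == "__main__":
    # Made-up response for illustration only, not real model output.
    sample = '{"shelf_no": " 0000.0 ", "author": "", "title": "Example Title", "card_type": "bibliographic"}'
    record = parse_card(sample)
    print(record, is_divider(record))
```

Dropping unknown keys and normalising blank strings to `None` keeps the sketch tolerant of schema changes, which suits output that has not yet been reviewed by a curator.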

This is an unreviewed zero-shot demo: the next step is expert curator review and an iteration loop (and, potentially, a fine-tuned community model). Source cards are public domain via the Boston Public Library.