---
title: Unified Document Extraction API
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---

# 🚀 Unified Document Extraction API

**One API, Two Engines: Docling + DocStrange**

Extract structured data from any document using AI-powered engines.

## Features

- ✅ **Docling** - Advanced document parsing with structure preservation
- ✅ **DocStrange** - GPU-accelerated intelligent document processing
- ✅ **Multiple formats** - PDF, DOCX, XLSX, PPTX, Images, and more
- ✅ **Structured output** - Markdown, JSON, Tables

## API Endpoints

- `GET /` - Health check
- `GET /engines` - List available engines
- `POST /convert` - Full document conversion
- `POST /convert/markdown` - Markdown only
- `POST /convert/tables` - Tables only

## Usage

```bash
# Convert with Docling
curl -X POST "https://YOUR_SPACE.hf.space/convert?engine=docling" \
  -F "file=@document.pdf"

# Convert with DocStrange
curl -X POST "https://YOUR_SPACE.hf.space/convert?engine=docstrange" \
  -F "file=@document.pdf"
```

## Integration

Works with **DataSync** application for ERPNext integration.

## License

MIT
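
## Python Client Sketch

The curl calls under Usage can be mirrored from Python. The following is a minimal sketch, not part of the API itself: `build_url` and `convert` are hypothetical helper names, the third-party `requests` package is assumed to be installed, and the timeout is an arbitrary choice. The `engine` query parameter and the `file` form field come from the curl examples; whether `/convert/markdown` and `/convert/tables` accept `engine` the same way is an assumption. The response schema is not documented here, so the body is returned as raw text.

```python
from urllib.parse import urlencode

# Placeholder base URL copied from the curl examples; replace with your Space.
BASE_URL = "https://YOUR_SPACE.hf.space"


def build_url(path: str, engine: str = "docling", base_url: str = BASE_URL) -> str:
    """Return the endpoint URL with the engine passed as a query parameter."""
    query = urlencode({"engine": engine})
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{query}"


def convert(file_path: str, path: str = "/convert", engine: str = "docling",
            base_url: str = BASE_URL, timeout: float = 120.0) -> str:
    """Upload a document as the multipart field 'file' and return the raw body."""
    import requests  # third-party; imported lazily so build_url works without it

    with open(file_path, "rb") as fh:
        response = requests.post(
            build_url(path, engine, base_url),
            files={"file": fh},  # same field name as `-F "file=@document.pdf"`
            timeout=timeout,     # assumed value; conversions may need longer
        )
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error body
    return response.text


if __name__ == "__main__":
    # Only builds and prints the URL; call convert("document.pdf") to upload.
    print(build_url("/convert", "docstrange"))
```

Calling `convert("document.pdf", engine="docstrange")` would send the same request as the second curl command; inspect the returned text before assuming any particular JSON layout.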