Commit c77a4da · verified
Committed by Alexandre-Numind
1 Parent(s): 2ea8a97

Update app.py

Files changed (1)
  1. app.py +11 -7
app.py CHANGED
@@ -1316,13 +1316,17 @@ with gr.Blocks(
 
  gr.Markdown(
  """
- We introduce **NuExtract 3** a 4B open-source **MIT License** VLM specialized in document extraction.
- NuExtract 3 unifies structured extraction document to JSON and content extraction document to Markdown,
- a.k.a. OCR — into one model.
-
- NuExtract 3 has been trained via Reinforcement Learning to have extraction-specific reasoning abilities, which can
- be switched on/off on demand. We find that NuExtract 3 substantially outperforms similar-sized models for both
- structured extraction and content extraction, making it the new reference model of open-source document extraction.
+ **NuExtract3** is a unified **4B** vision-language reasoning model for document understanding.
+ It combines strong **structured information extraction** with high-quality **image-to-Markdown** conversion, making it suitable for extraction pipelines, OCR, and RAG preprocessing for all types of documents such as scans, receipts, forms, invoices, contracts or tables.
+
+ ## Overview
+ - **Structured extraction**: input (text/images) + JSON template + instructions --> JSON output
+ - **Markdown conversion**: input (text/images) --> Markdown
+ - **Multimodal inputs**: text, images, or text + images.
+ - **Multilingual** documents.
+ - **Reasoning** and non-reasoning inference modes.
+ - **Template generation** for structured extraction from natural language or input document.
+
  """,
  elem_classes=["intro-card"],
  )
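The new intro text describes structured extraction as a template-in, JSON-out contract. As a rough illustration only, the sketch below checks that a parsed JSON output has the same nested keys as a template. The template format, the field names (`vendor`, `total`, `line_items`), the helper `matches_template`, and the sample output are all hypothetical; none of them come from this commit or from NuExtract3's documented schema, and no model is called.

```python
import json

# Hypothetical template: keys name the fields to extract, values are type hints.
# This format is an assumption for illustration, not NuExtract3's actual schema.
TEMPLATE = {
    "vendor": "string",
    "total": "number",
    "line_items": [{"description": "string", "amount": "number"}],
}


def matches_template(output, template):
    """Return True if `output` has the same nested key structure as `template`.

    Leaf values are not type-checked; any scalar (or None) is accepted.
    """
    if isinstance(template, dict):
        if not isinstance(output, dict) or set(output) != set(template):
            return False
        return all(matches_template(output[k], template[k]) for k in template)
    if isinstance(template, list):
        if not isinstance(output, list):
            return False
        if not template:
            # An empty template list places no constraint on the items.
            return True
        # A one-element template list describes the shape of every item.
        return all(matches_template(item, template[0]) for item in output)
    # Leaf position: reject containers, accept scalars and None.
    return not isinstance(output, (dict, list))


# Stand-in for a model response; a real model call is out of scope here.
raw_output = (
    '{"vendor": "ACME", "total": 42.5, '
    '"line_items": [{"description": "Widget", "amount": 42.5}]}'
)
parsed = json.loads(raw_output)
print(matches_template(parsed, TEMPLATE))
```

A check like this could sit after the model call in an extraction pipeline to catch outputs that drop or invent keys before they reach downstream code.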