davanstrien HF Staff committed on
Commit 4af31e3 · 1 Parent(s): beca8ab

description

Files changed (1)
  1. app.py +7 -6
app.py CHANGED
@@ -408,13 +408,14 @@ with gr.Blocks() as demo:
  "For decades, galleries, libraries, archives, and museums (GLAMs) have used Optical Character Recognition "
  "to transform digitized books, newspapers, and manuscripts into machine-readable text. Traditional OCR "
  "produces complex XML formats like ALTO, packed with layout details but difficult to use. "
- "Now, cutting-edge Vision-Language Models (VLMs) are revolutionizing OCR with simpler, cleaner Markdown output. "
- "This Space makes it easy to compare these two approaches and see which works best for your historical documents. "
- "Upload a historical document image and its XML file to compare these approaches side-by-side. "
+ "Now, Vision-Language Models (VLMs) are revolutionizing OCR with simpler, cleaner output. "
+ "This Space lets you compare three leading VLM-based OCR models against traditional approaches. "
+ "Upload a historical document image and its XML file to see them side-by-side. "
  "We'll extract the reading order from your XML for an apples-to-apples comparison of the actual text content.\n\n"
- "**Available models:** [RolmOCR](https://huggingface.co/reducto/RolmOCR) | "
- "[Nanonets-OCR-s](https://huggingface.co/nanonets/Nanonets-OCR-s) | "
- "[olmOCR](https://huggingface.co/allenai/olmOCR-7B-0225-preview)"
+ "**Available models:**\n"
+ "[RolmOCR](https://huggingface.co/reducto/RolmOCR) - Fast & general-purpose\n"
+ "[Nanonets-OCR-s](https://huggingface.co/nanonets/Nanonets-OCR-s) - Advanced with table/math support\n"
+ "• [olmOCR](https://huggingface.co/allenai/olmOCR-7B-0225-preview) - Allen AI's pioneering 7B document specialist"
  )
 
  gr.Markdown("---")
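
Note on the description text: it says the Space extracts the reading order from the uploaded XML so that only the text content is compared. The diff does not show how app.py does this, so the following is only a minimal sketch of one way it could work for ALTO input. The names `local_name`, `alto_to_text`, and `SAMPLE` are hypothetical, and treating document order as reading order is an assumption, not the Space's confirmed behaviour.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(alto_xml: str) -> str:
    """Return plain text from an ALTO document.

    Assumption: the document order of TextBlock and TextLine elements is
    taken as the reading order. Each TextLine becomes one output line and
    TextBlocks are separated by a blank line.
    """
    root = ET.fromstring(alto_xml)
    blocks = []
    for block in root.iter():
        if local_name(block.tag) != "TextBlock":
            continue
        lines = []
        for line in block.iter():
            if local_name(line.tag) != "TextLine":
                continue
            # In ALTO, each word is a String element with a CONTENT attribute.
            words = [
                el.attrib["CONTENT"]
                for el in line
                if local_name(el.tag) == "String" and "CONTENT" in el.attrib
            ]
            if words:
                lines.append(" ".join(words))
        if lines:
            blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Tiny hand-written ALTO fragment, for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1">
      <TextLine><String CONTENT="Hello"/><SP/><String CONTENT="world"/></TextLine>
      <TextLine><String CONTENT="Second"/><String CONTENT="line"/></TextLine>
    </TextBlock>
    <TextBlock ID="b2">
      <TextLine><String CONTENT="Next"/><String CONTENT="block"/></TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    print(alto_to_text(SAMPLE))
```

Matching on local tag names keeps the sketch independent of the ALTO schema version, since each version uses a different namespace URI. Hyphenation elements and explicit reading-order metadata are ignored here.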