bokesyo committed
Commit 7ba6f6d · verified · 1 Parent(s): 1048188

Update app.py

Files changed (1)
  1. app.py +11 -10
app.py CHANGED
@@ -216,22 +216,23 @@ def answer_question(images, question):
 
 
 with gr.Blocks() as app:
-    gr.Markdown("# Memex: OCR-free Visual Document Embedding Model as Your Personal Librarian")
-    gr.Markdown("""The model takes only images as document-side inputs and produces vectors representing document pages. Memex is trained on over 200k query-visual document pairs, including textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on a par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually-intensive documents.
-
-    Our model can:
+    gr.Markdown("# MiniCPMV-RAG-PDFQA: Two Vision Language Models Enable End-to-End RAG")
+
+    gr.Markdown("""
+    - A Vision Language Model Dense Retriever ([MiniCPM-Visual-Embedding](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0)) **directly reads** your PDFs **without needing OCR**, produces **multimodal dense representations**, and builds your personal library.
 
-    - Help you read a long visually-intensive or text-oriented PDF document and find the pages that answer your question.
+    - **Ask a question**, and it retrieves the most relevant pages; then [MiniCPM-V-2.6](https://huggingface.co/spaces/openbmb/MiniCPM-V-2_6) answers your question based on the recalled pages, with strong multi-image understanding capability.
 
-    - Help you build a personal library and retrieve book pages from a large collection of books.
+    1. It helps you read a long **visually-intensive** or **text-oriented** PDF document and find the pages that answer your question.
 
-    - Work like a human: read and comprehend with vision, and remember multimodal information in its hippocampus.""")
+    2. It helps you build a personal library and retrieve book pages from a large collection of books.
 
-    gr.Markdown("- Our model is proudly based on the MiniCPM-V series: [MiniCPM-V-2.6](https://huggingface.co/openbmb/MiniCPM-V-2_6) and [MiniCPM-V-2](https://huggingface.co/openbmb/MiniCPM-V-2).")
+    3. It works like a human: it reads, stores, retrieves, and answers with full vision.
+    """)
 
-    gr.Markdown("- We open-sourced our model at [RhapsodyAI/minicpm-visual-embedding-v0](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0).")
+    gr.Markdown("- We **open-sourced** our visual embedding model at [RhapsodyAI/minicpm-visual-embedding-v0](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0).")
 
-    gr.Markdown("- Currently we support PDF documents with fewer than 50 pages; PDFs over 50 pages will hit the GPU time limit.")
+    gr.Markdown("- The online demo currently supports PDF documents with fewer than 50 pages.")
 
     with gr.Row():
         file_input = gr.File(type="binary", label="Upload PDF")
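
The updated description outlines a two-stage pipeline: the visual embedding model encodes each PDF page into a dense vector, the question is embedded the same way, the top-scoring pages are recalled, and MiniCPM-V-2.6 answers from those pages. The sketch below illustrates only the retrieval step under stated assumptions: `encode_pages` and `encode_query` are hypothetical placeholders for the embedding model's actual encoding calls (see the model card for the real API), `retrieve_top_k` is an illustrative helper rather than code from this repository, and `answer_question(images, question)` refers to the function already defined in app.py.

```python
# Minimal sketch of the dense page-retrieval step described above (not the app's
# exact implementation). encode_pages / encode_query are hypothetical placeholders
# for the embedding model's own encoding calls.
import torch
import torch.nn.functional as F

def retrieve_top_k(query_emb: torch.Tensor, page_embs: torch.Tensor, k: int = 3):
    """Return (indices, scores) of the k pages most similar to the query.

    query_emb: (D,) query embedding; page_embs: (N, D) page embeddings.
    """
    # L2-normalize so the dot product equals cosine similarity.
    query_emb = F.normalize(query_emb, dim=-1)
    page_embs = F.normalize(page_embs, dim=-1)
    scores = page_embs @ query_emb                       # (N,) similarity per page
    top = torch.topk(scores, k=min(k, page_embs.shape[0]))
    return top.indices.tolist(), top.values.tolist()

# Hypothetical usage, reusing answer_question(images, question) from app.py:
# page_embs = encode_pages(pdf_page_images)                          # (N, D)
# query_emb = encode_query("Which page shows the ablation table?")   # (D,)
# idx, _ = retrieve_top_k(query_emb, page_embs, k=3)
# answer = answer_question([pdf_page_images[i] for i in idx], question)
```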