Spaces: bokesyo/MiniCPM_Visual_Document_Retriever_Demo

bokesyo committed on Aug 6, 2024

Commit 9dbc707 (verified) · 1 Parent(s): 4061826

Update app.py

Files changed (1)

app.py CHANGED

@@ -184,9 +184,19 @@ model.to(device)
 with gr.Blocks() as app:
-    gr.Markdown("# Memex: OCR-free Visual Document Retrieval @RhapsodyAI")
     gr.Markdown("- We open-sourced our model at [RhapsodyAI/minicpm-visual-embedding-v0](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0)")
     gr.Markdown("- Currently we support PDF document with less than 50 pages, PDF over 50 pages will reach GPU time limit.")
     with gr.Row():

 with gr.Blocks() as app:
+    gr.Markdown("# Memex: OCR-free Visual Document Embedding Model as Your Personal Librarian")
+    gr.Markdown("""The model takes only images as document-side inputs and produces vectors representing document pages. Memex is trained with over 200k query-visual document pairs, including textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on a par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually-intensive documents.
Our model can:
- Help you read a long visually-intensive or text-oriented PDF document and find the pages that answer your question.
- Help you build a personal library and retrieve book pages from a large collection of books.
- Work like a human: read and comprehend with vision, and remember multimodal information in the hippocampus.""")
     gr.Markdown("- We open-sourced our model at [RhapsodyAI/minicpm-visual-embedding-v0](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0)")
     gr.Markdown("- Currently we support PDF document with less than 50 pages, PDF over 50 pages will reach GPU time limit.")
     with gr.Row():
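
The new description says the model turns each document page into a vector and that the demo finds the pages that answer a question. A minimal sketch of that retrieval step follows, assuming pages are ranked by cosine similarity between a query vector and the page vectors. The similarity measure, the helper names `cosine` and `rank_pages`, and the toy vectors are all assumptions for illustration; they are not taken from `app.py` or from the model's API.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pages(
    query_vec: Sequence[float],
    page_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (page_index, score) pairs for the top_k most similar pages."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Placeholder vectors standing in for real page and query embeddings (hypothetical data).
page_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query_vec = [0.0, 1.0, 0.0]

top = rank_pages(query_vec, page_vecs, top_k=2)
for page_index, score in top:
    print(f"page {page_index}: score {score:.3f}")
```

In a real pipeline the placeholder lists would be replaced by embeddings computed once per page image, so that only the query needs to be embedded at search time.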