Anonymous1223334444 committed on
Commit 94b27ef · 1 Parent(s): c2e3cf5

Add YAML metadata to README for Hugging Face Hub

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -1,3 +1,19 @@
+ ---
+ tags:
+ - multimodal
+ - multilingual
+ - pdf
+ - embeddings
+ - rag
+ - google-cloud
+ - vertex-ai
+ - gemini
+ - python
+ datasets:
+ - any
+ license: mit
+ ---
+
  # Multimodal & Multilingual PDF Embedding Pipeline
 
  This repository hosts a Python pipeline that extracts text, tables, and images from PDF documents, generates multimodal descriptions for visual content (tables and images), and then creates multilingual text embeddings for all extracted information. The generated embeddings are stored in a JSON file, ready for use in Retrieval Augmented Generation (RAG) systems or other downstream applications.
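
The metadata added in this commit is the block between the two `---` lines at the top of README.md. As a rough way to sanity-check such a block locally, the sketch below reads it back with the standard library only. It is not a YAML parser: it assumes only top-level `key: value` scalars and flat `- item` lists, which is all this commit uses. The function name `parse_front_matter` and the sample string are hypothetical, and a real YAML library would be the safer choice for anything more complex.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, List[str]]


def parse_front_matter(readme_text: str) -> Dict[str, Value]:
    """Parse a simple front matter block (top-level scalars and flat lists only).

    Returns an empty dict when the text does not start with a '---' line
    or when the closing '---' line is missing.
    """
    lines = readme_text.splitlines()
    # The block must open on the very first line of the file.
    if not lines or lines[0].strip() != "---":
        return {}

    metadata: Dict[str, Value] = {}
    current_key: Optional[str] = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return metadata  # closing delimiter reached
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had no inline value.
            if current_key is not None and isinstance(metadata[current_key], list):
                metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # "key:" with nothing after it starts a list; otherwise store the scalar.
            metadata[current_key] = value if value else []
    return {}  # no closing '---' found


# Hypothetical sample mirroring the shape of the block added in this commit.
SAMPLE_README = """---
tags:
- multimodal
- rag
datasets:
- any
license: mit
---

# Multimodal & Multilingual PDF Embedding Pipeline
"""

meta = parse_front_matter(SAMPLE_README)
print(meta)
```

Returning an empty dict for a missing or unterminated block keeps the check simple: a caller only needs to test whether the expected keys (here `tags`, `datasets`, `license`) are present.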