medrag / medrag_multi_modal / document_loader

Commit History

chore: fix linting and code formatting
33deb8d

mratanusarkar committed on

update: override MarkerImageLoader.load_data to align page indices to reflect pdf page numbers
e4a917d

geekyrakshit committed on

update: BaseImageLoader + MarkerImageLoader
9c51e22

geekyrakshit committed on

add: alias to document loader artifacts and datasets + enable mps fallback for marker
789b57f

geekyrakshit committed on

bugfix: prevent model load on every extraction
ff75fe0

mratanusarkar committed on

chore: address review points
2691833

mratanusarkar committed on

add: two modules on fitz to handle img extractions
f9d44bd

mratanusarkar committed on

temp: attempt - force to png with pillow
3d948a1

mratanusarkar committed on

temp: attempt - all format img extraction from pdf
5406446

mratanusarkar committed on

add: hacky impl of img extraction with pdfplumber
4fd52cf

mratanusarkar committed on

add: example usage for marker and pdf2img loaders
bf0f2e5

mratanusarkar committed on

add: marker image loader + docs + corrections
331f289

mratanusarkar committed on

chore: improve doc + code formatting
f37090a

mratanusarkar committed on

add: docs for base img loader + pdf2image
cc5cebc

mratanusarkar committed on

add: base image loader + pdf2img from load_image
5c74069

mratanusarkar committed on

update: codebase addressing review comments
a24da3d

mratanusarkar committed on

add: kwargs to interact with underlying library
6526b2f

mratanusarkar committed on

update: convert _process_page to extract_page_data
e31ec78

mratanusarkar committed on

add: docs & docstrings for marker text loader
fc27062

mratanusarkar committed on

add: marker pdf text loader
fb5095f

mratanusarkar committed on

add: docs & docstrings for pdfplumber text loader
d647546

mratanusarkar committed on

add: pdfplumber text loader
be6fbc6

mratanusarkar committed on

add: docs & docstrings for pypdf2 text loader
419f968

mratanusarkar committed on

add: pypdf2 loader text loader
391b2f3

mratanusarkar committed on

chore: format & linting + __init__ + fix: imports
e0aff18

mratanusarkar committed on

chore: remove old load_text
78dd8e8

mratanusarkar committed on

add: docs & docstrings for base + pymupdf4llm
4304db6

mratanusarkar committed on

add: base text loader and pymupdf4llm loader
9761deb

mratanusarkar committed on

add: MultiModalRetriever.predict
d197e7f

geekyrakshit committed on

update: colpali index syncs with wandb artifact
abd20d0

geekyrakshit committed on

update: documentation with installation instructions
24e7c59

geekyrakshit committed on

update: ImageLoader
a7ff122

geekyrakshit committed on

update: ImageLoader
bd0ff68

geekyrakshit committed on

add: TextImageLoader
7b862ff

geekyrakshit committed on

refactor: TextLoader
c675904

geekyrakshit committed on

add: load_text_from_pdf
b9d8094

geekyrakshit committed on