Update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ license: mit
 
 # ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models
 
-In June 2024, [ColPali](https://arxiv.org/abs/2407.01449) was introduced as an OCR-free document retrieval model, built over [PaliGemma](https://arxiv.org/abs/2407.07726), shifting the paradigm of PDF document retrieval by directly processing images instead of using error-prone and resource-heavy OCR pipelines. However, with three billion parameters, ColPali might be computationally expensive, especially for large document databases. In contrast, text retrieval models like [ColBERT](https://arxiv.org/abs/2004.12832) are more efficient with just a few hundred million parameters, but they require error-prone and expensive OCR pipelines to. To bridge this gap, we introduce ColFlor, an OCR-free visual document retrieval model with only
+In June 2024, [ColPali](https://arxiv.org/abs/2407.01449) was introduced as an OCR-free document retrieval model, built over [PaliGemma](https://arxiv.org/abs/2407.07726), shifting the paradigm of PDF document retrieval by directly processing images instead of using error-prone and resource-heavy OCR pipelines. However, with three billion parameters, ColPali might be computationally expensive, especially for large document databases. In contrast, text retrieval models like [ColBERT](https://arxiv.org/abs/2004.12832) are more efficient with just a few hundred million parameters, but they require error-prone and expensive OCR pipelines to. To bridge this gap, we introduce ColFlor, an OCR-free visual document retrieval model with only 174 million parameters.
 
 <p align="center"><img width=800 src="https://github.com/AhmedMasryKU/colflor/blob/main/assets/colflor.png?raw=true"/></p>
 