Update README.md
README.md
CHANGED
@@ -27,7 +27,7 @@ pipeline_tag: visual-document-retrieval
 
 ## Ranked #1 on the Vidore benchmark (as of February 11, 2025). The reported scores are on the [Vidore Leaderboard](https://huggingface.co/spaces/vidore/vidore-leaderboard).
 
-### This is the base version trained on 4xA100 80GB with per_device_batch_size=
+### This is the base version trained on 4xA100 80GB with per_device_batch_size=64 and gradient_accumulation_steps=2 for 5 epoch.
 
 ColQwen is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
 It is a [Qwen2.5-VL-3B](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
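The last context line describes ColBERT-style multi-vector representations. As a rough illustration of how such representations are usually compared, here is a minimal sketch of the late-interaction (MaxSim) score from the ColBERT paper. It is not taken from this commit or from the model's code: the function names and the toy 2-dimensional vectors are made up for illustration, and real embeddings would come from the model.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query vector, take its best-matching
    document vector, then sum those maxima over the whole query."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy embeddings (hypothetical values, one vector per token or image patch).
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.5, 0.5]]

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
print(score_a, score_b)
```

In this toy case `page_a` contains an exact match for each query vector, so it scores higher than `page_b`.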
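The added line completes the training description with `per_device_batch_size=64` and `gradient_accumulation_steps=2` on 4xA100. Assuming plain data-parallel training across all four GPUs (the commit does not say this explicitly), the number of samples per optimizer step can be estimated as below. The helper name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer step under data parallelism."""
    return per_device * num_devices * grad_accum_steps


# 64 and 2 come from the README line; num_devices=4 is an assumption
# based on the "4xA100" wording.
print(effective_batch_size(per_device=64, num_devices=4, grad_accum_steps=2))  # 512
```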