yydxlv committed on
Commit 54c7735 · verified · 1 Parent(s): b4a30e9

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -11,14 +11,12 @@ tags:
11   library_name: peft
12   pipeline_tag: visual-document-retrieval
13   ---
14 - # ColQwen2.5-7b-v0.1: Multimodal Visual Retriever based on Qwen2.5-VL-7B-Instruct with ColBERT strategy
15
16   ## Ranked #1 among models on the Vidore benchmark (as of February 7, 2025). The reported scores on the [Vidore Leaderboard](https://huggingface.co/spaces/vidore/vidore-leaderboard).
17
18   ### This is the base version trained on 8xA100 80GB with batch_size=32*8 and gradient_accumulation_steps=2 for 3 epoch.
19
20 - - **Developed by:** IEIT systems
21 -
22   ColQwen is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
23   It is a [Qwen2.5-VL-3B](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
24   It was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali)
@@ -131,4 +129,5 @@ If you use this models from this organization in your research, please cite the
131   primaryClass={cs.IR},
132   url={https://arxiv.org/abs/2407.01449},
133   }
134 - ```
 
 
11   library_name: peft
12   pipeline_tag: visual-document-retrieval
13   ---
14 + # IEIT-Systems ColQwen2.5-7b-v0.1: Multimodal Visual Retriever based on Qwen2.5-VL-7B-Instruct with ColBERT strategy
15
16   ## Ranked #1 among models on the Vidore benchmark (as of February 7, 2025). The reported scores on the [Vidore Leaderboard](https://huggingface.co/spaces/vidore/vidore-leaderboard).
17
18   ### This is the base version trained on 8xA100 80GB with batch_size=32*8 and gradient_accumulation_steps=2 for 3 epoch.
19
20   ColQwen is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
21   It is a [Qwen2.5-VL-3B](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
22   It was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali)
 
129   primaryClass={cs.IR},
130   url={https://arxiv.org/abs/2407.01449},
131   }
132 + ```
133 + - **Developed by:** IEIT systems