Ivy1997 committed
Commit 29067e8 · verified · 1 Parent(s): 2d8eefc

Update README.md

Files changed (1):
  1. README.md +4 -3

README.md CHANGED
@@ -15,8 +15,6 @@ It accepts both image and text inputs to generate text outputs.
 
 Thanks to its lightweight design, it can be deployed on edge devices such as AI glasses and smartphones, offering low memory usage and high speed while maintaining strong performance on multimodal tasks. Some well-known small models include [PaliGemma 3B](https://huggingface.co/google/paligemma-3b-mix-448), [Moondream2](https://huggingface.co/vikhyatk/moondream2), [Qwen2-VL-2B](https://huggingface.co/Qwen/Qwen2-VL-2B), [InternVL2-2B](https://huggingface.co/OpenGVLab/InternVL2-2B), and [InternVL2_5-2B](https://huggingface.co/OpenGVLab/InternVL2_5-2B). Ivy-VL outperforms them on multiple benchmarks.
 
-The model is built upon the `Qwen/Qwen2.5-3B-Instruct` language model, with [`google/siglip-so400m-patch14-384`](https://huggingface.co/google/siglip-so400m-patch14-384) serving as the vision encoder.
-
 # Model Summary:
 
 * Developed: Stanford, CMU, AI Safeguard
@@ -28,7 +26,10 @@ The model is built upon the `Qwen/Qwen2.5-3B-Instruct` language model,
 * License: Apache 2.0
 
 * Architecture: Based on LLaVA-One-Vision
-
+
+* LLM: Qwen/Qwen2.5-3B-Instruct
+
+* Vision Encoder: google/siglip-so400m-patch14-384
 
 # Evaluation:
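
For context on what this commit describes, here is a minimal inference sketch for a LLaVA-One-Vision-style checkpoint such as Ivy-VL, using the `LlavaOnevisionForConditionalGeneration` class from `transformers` (available since v4.45). It is not part of this commit: the repo id `AI-Safeguard/Ivy-VL-llava` and the sample image URL are placeholder assumptions, not taken from the README.

```python
# Minimal sketch (assumptions noted inline), not the model card's official example.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "AI-Safeguard/Ivy-VL-llava"  # hypothetical repo id; substitute the real one

# The checkpoint composes the two components listed in the Model Summary:
# the Qwen/Qwen2.5-3B-Instruct LLM and the google/siglip-so400m-patch14-384
# vision encoder, wired together in the LLaVA-One-Vision architecture.
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Build a chat-formatted prompt containing one image slot plus a question.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # any test image
image = Image.open(requests.get(url, stream=True).raw)

# Tokenize text and preprocess the image in one call, then generate.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

Because the architecture follows LLaVA-One-Vision, the generic `transformers` loading path above should apply; only the repo id needs to match the published checkpoint.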