Commit 14e4776 by prithivMLmods (verified) · 1 parent: b249d4b

Update README.md

Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -19,6 +19,16 @@ tags:

  The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-VL-2B-Instruct**, tailored for tasks that involve **Optical Character Recognition (OCR)**, **image-to-text conversion**, and **math problem solving with LaTeX formatting**. This model integrates a conversational approach with visual and textual understanding to handle multi-modal tasks effectively.

+ #### Key Enhancements:
+
+ * **SoTA understanding of images of various resolutions and aspect ratios**: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, and MTVQA.
+
+ * **Understanding videos of 20+ minutes**: Qwen2-VL can understand videos longer than 20 minutes, supporting high-quality video-based question answering, dialog, and content creation.
+
+ * **Agent that can operate mobile phones, robots, and other devices**: with its complex reasoning and decision-making abilities, Qwen2-VL can be integrated with devices such as mobile phones and robots to operate them automatically based on the visual environment and text instructions.
+
+ * **Multilingual Support**: besides English and Chinese, Qwen2-VL understands text in other languages inside images, including most European languages, Japanese, Korean, Arabic, and Vietnamese.
+
  | **File Name** | **Size** | **Description** | **Upload Status** |
  |---------------------------|------------|------------------------------------------------|-------------------|
  | `.gitattributes` | 1.52 kB | Configures LFS tracking for specific model files. | Initial commit |