Update README.md
README.md CHANGED
@@ -18,8 +18,7 @@ datasets:
 <p>

 # Llama-3.2-MAAL-11B-Vision-v0.1
-
-We are releasing a [model](https://huggingface.co/maum-ai/Llama-3.2-MAAL-11B-Vision-v0.1), a subset of the [training dataset](https://huggingface.co/datasets/maum-ai/General-Evol-VQA), and a [leaderboard](https://huggingface.co/spaces/maum-ai/KOFFVQA-Leaderboard) to promote and accelerate the development of Korean Vision-Language Models (VLMs).
+**Llama-3.2-MAAL-11B-Vision-v0.1** is a bilingual multimodal model trained for text and visual understanding in Korean and English. We are releasing a [model](https://huggingface.co/maum-ai/Llama-3.2-MAAL-11B-Vision-v0.1), a subset of the [training dataset](https://huggingface.co/datasets/maum-ai/General-Evol-VQA), and a [leaderboard](https://huggingface.co/spaces/maum-ai/KOFFVQA-Leaderboard) to promote and accelerate the development of Korean Vision-Language Models (VLMs).

 - **Developed by:** [maum.ai Brain NLP](https://maum-ai.github.io). Jaeyoon Jung, Yoonshik Kim, Yekyung Nah
 - **Language(s) (NLP):** Korean, English (currently, bilingual)
@@ -27,7 +26,7 @@ We are releasing a [model](https://huggingface.co/maum-ai/Llama-3.2-MAAL-11B-Vis

 ## Model Description

-Version 0.1 is fine-tuned by English and Korean VQA
+Version 0.1 is fine-tuned on English and Korean VQA datasets, along with other datasets (OCR, math, etc.).

 - We trained this model on 8 H100-80G GPUs for 2 days with an image-text pair multimodal fine-tuning dataset
 - [maum-ai/General-Evol-VQA](https://huggingface.co/datasets/maum-ai/General-Evol-VQA) is one of the datasets that we used for fine-tuning.
@@ -85,7 +84,7 @@ As the main goal of version 0.1 is **leveraging Korean VQA and OCR capabilities
 |InternVL2-8b|8.1b|32.76|
 |MiniCPM-V-2_6|8.1b|32.69|

-Our model has achieved a 20
+Our model has achieved a **20%** performance improvement compared to the previous base model.
 You can check more results in [this Leaderboard](https://huggingface.co/spaces/maum-ai/KOFFVQA-Leaderboard).

-### We will release enhanced model, v0.2 soon
+### We will release an enhanced model, v0.2, soon
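For reference, the released checkpoint can be tried with a short inference sketch. This is a minimal example, assuming the model follows the standard Llama 3.2 Vision interface in Hugging Face transformers (`MllamaForConditionalGeneration` plus `AutoProcessor`); the image path and the Korean prompt below are placeholders, not taken from the README.

```python
# Minimal inference sketch for maum-ai/Llama-3.2-MAAL-11B-Vision-v0.1.
# Assumes the standard Llama 3.2 Vision interface in transformers (>= 4.45);
# the image file and prompt are placeholders.
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "maum-ai/Llama-3.2-MAAL-11B-Vision-v0.1"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "이 이미지를 설명해 주세요."},  # "Please describe this image."
        ],
    }
]

# Build the chat-formatted prompt, then encode the image and text together.
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```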