yoonshik1205 committed
Commit 96e3dc1 · Parent: 4b5b795

add link to evaluation code

Files changed (1)
  1. content.py +7 -3
content.py CHANGED
@@ -11,7 +11,11 @@ benchname = 'KOFFVQA'
 Bottom_logo = f'''<img src="data:image/jpeg;base64,{bottom_logo}" style="width:20%;display:block;margin-left:auto;margin-right:auto">'''
 
 intro_md = f'''
-# KOFFVQA Leaderboard
+# {benchname} Leaderboard
+
+* [Dataset](https://huggingface.co/datasets/maum-ai/KOFFVQA_Data)
+* [Evaluation Code](https://github.com/maum-ai/KOFFVQA)
+* Report (coming soon)
 
 {benchname} is a Free-Form VQA benchmark dataset designed to evaluate Vision-Language Models (VLMs) in Korean language environments. Unlike traditional multiple-choice or predefined answer formats, KOFFVQA challenges models to generate open-ended, natural-language answers to visually grounded questions. This allows for a more comprehensive assessment of a model's ability to understand and generate nuanced Korean responses.
 
@@ -29,10 +33,10 @@ The {benchname} benchmark is designed to evaluate and compare the performance of
 
 This benchmark includes a total of 275 Korean questions across 10 tasks. The questions are open-ended, free-form VQA (Visual Question Answering) with objective answers, allowing responses without strict format constraints.
 
-We will add more information about this benchmark soon.
-
 ## News
 
+* **2025-01-21**: [Evaluation code](https://github.com/maum-ai/KOFFVQA) release
+
 * **2024-12-06**: Leaderboard Release!
 
 '''.strip()
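The new [Dataset] link points at maum-ai/KOFFVQA_Data on the Hub. As a quick orientation only (this is not part of the commit, nor of the maum-ai/KOFFVQA evaluation code), here is a minimal sketch of pulling that dataset with the `datasets` library; the assumption that no config name is required and the way a split is picked are mine, so check the dataset card for the actual splits and column names.

```python
# Minimal sketch: load the dataset linked in this commit and peek at its layout.
# Assumptions (not from the commit): the repo loads without an explicit config
# name, and no split is hard-coded because the split names are unknown here.
from datasets import load_dataset

ds = load_dataset("maum-ai/KOFFVQA_Data")   # repo id taken from the new link
print(ds)                                   # shows the available splits and columns

split_name, split = next(iter(ds.items()))  # take whichever split comes first
print(split_name, split[0].keys())          # inspect the fields of one example
```

For actually scoring model outputs, the linked GitHub repository (maum-ai/KOFFVQA) is the intended entry point; its interface is not shown here.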