## Evaluation Instructions for MiniGPT-v2

### Data preparation
Download the images and annotations from the sources below (a command-line sketch follows the table):

Image source | Download path
--- | :---:
OKVQA | <a href="https://drive.google.com/drive/folders/1jxIgAhtaLu_YqnZEl8Ym11f7LhX3nptN?usp=sharing">annotations</a> &nbsp;&nbsp; <a href="http://images.cocodataset.org/zips/train2017.zip">images</a>
GQA | <a href="https://drive.google.com/drive/folders/1-dF-cgFwstutS4qq2D9CFQTDS0UTmIft?usp=drive_link">annotations</a> &nbsp;&nbsp; <a href="https://downloads.cs.stanford.edu/nlp/data/gqa/images.zip">images</a>
Hateful Memes | <a href="https://github.com/faizanahemad/facebook-hateful-memes">images and annotations</a>
IconQA | <a href="https://iconqa.github.io/#download">images and annotations</a>
VizWiz | <a href="https://vizwiz.org/tasks-and-datasets/vqa/">images and annotations</a>
RefCOCO | <a href="https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip">annotations</a>
RefCOCO+ | <a href="https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip">annotations</a>
RefCOCOg | <a href="https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip">annotations</a>
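
As one way to pull the directly linked archives from the command line, the sketch below fetches the RefCOCO annotations and the COCO `train2017` images used by OKVQA. The target directory, the use of `wget`/`unzip`, and the extraction layout are assumptions; the Google Drive links are easier to fetch through a browser or a tool such as `gdown`.

```
# Placeholder root for the evaluation data; adjust to your own path.
export MINIGPTv2_EVALUATION_DATASET=/path/to/evaluation/dataset

# RefCOCO annotations (direct link from the table above).
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip
unzip refcoco.zip -d ${MINIGPTv2_EVALUATION_DATASET}

# COCO train2017 images referenced by the OKVQA annotations.
wget http://images.cocodataset.org/zips/train2017.zip
unzip train2017.zip -d ${MINIGPTv2_EVALUATION_DATASET}/okvqa
```

You may need to rename or flatten the extracted folders so that they match the directory structure described in the next section.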

### Evaluation dataset structure

```
${MINIGPTv2_EVALUATION_DATASET}
├── gqa
│   ├── test_balanced_questions.json
│   ├── testdev_balanced_questions.json
│   └── gqa_images
├── hateful_meme
│   ├── hm_images
│   └── dev.jsonl
├── iconvqa
│   ├── iconvqa_images
│   └── choose_text_val.json
├── vizwiz
│   ├── vizwiz_images
│   └── val.json
├── vsr
│   └── vsr_images
├── okvqa
│   ├── okvqa_test_split.json
│   ├── mscoco_val2014_annotations_clean.json
│   └── OpenEnded_mscoco_val2014_questions_clean.json
├── refcoco
│   ├── instances.json
│   ├── refs(google).p
│   └── refs(unc).p
├── refcoco+
│   ├── instances.json
│   └── refs(unc).p
├── refcocog
│   ├── instances.json
│   ├── refs(google).p
│   └── refs(umd).p
...
```
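
Before launching an evaluation, a quick sanity check is to list the tree and confirm the per-dataset files are where the scripts expect them. This is only a sketch: the availability of `tree` and the environment variable name are assumptions.

```
export MINIGPTv2_EVALUATION_DATASET=/path/to/evaluation/dataset
tree -L 2 ${MINIGPTv2_EVALUATION_DATASET}
ls "${MINIGPTv2_EVALUATION_DATASET}/refcoco"   # expect instances.json, refs(google).p, refs(unc).p
```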


### Environment setup

Add the MiniGPT-4 repository to your `PYTHONPATH` so the evaluation scripts can resolve its modules:
```
export PYTHONPATH=$PYTHONPATH:/path/to/directory/of/MiniGPT-4
```
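
A quick import check confirms the path is picked up. The `minigpt4` package name below is taken from the repository layout and is an assumption if your checkout differs.

```
python -c "import minigpt4; print(minigpt4.__file__)"
```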

### Config file setup

In [eval_configs/minigptv2_benchmark_evaluation.yaml](../eval_configs/minigptv2_benchmark_evaluation.yaml):

Set **llama_model** to the path of the LLaMA model.  
Set **ckpt** to the path of our pretrained model checkpoint.  
Set **eval_file_path** to the path of the annotation file for each evaluation dataset.  
Set **img_path** to the image folder for each evaluation dataset.  
Set **save_path** to the path where the results for each evaluation dataset are saved.  
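
For orientation, the fields to edit look roughly like the fragment below. The nesting and key placement are an illustrative assumption inferred from the field names above, not a verbatim copy of the shipped config, so follow the comments in the YAML file itself.

```
# Illustrative sketch only; the actual key nesting may differ.
model:
  llama_model: "/path/to/llama_checkpoint"
  ckpt: "/path/to/minigptv2_checkpoint.pth"

evaluation_datasets:
  refcoco:
    eval_file_path: /path/to/refcoco/annotations
    img_path: /path/to/refcoco/images
    save_path: /path/to/save/refcoco_results
```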




### Start evaluating RefCOCO, RefCOCO+, RefCOCOg
port=port_number  
cfg_path=/path/to/eval_configs/minigptv2_benchmark_evaluation.yaml  

dataset names:  
| refcoco | refcoco+ | refcocog |
| ------- | -------- | -------- |

```
torchrun --master-port ${port} --nproc_per_node 1 eval_ref.py \
 --cfg-path ${cfg_path} --dataset refcoco,refcoco+,refcocog --resample
```
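
For example, to evaluate RefCOCO alone on a single GPU, with an arbitrary placeholder port number:

```
torchrun --master-port 29500 --nproc_per_node 1 eval_ref.py \
 --cfg-path /path/to/eval_configs/minigptv2_benchmark_evaluation.yaml --dataset refcoco --resample
```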


### Start evaluating visual question answering

port=port_number  
cfg_path=/path/to/eval_configs/minigptv2_benchmark_evaluation.yaml 

dataset names:  
| okvqa | vizwiz | iconvqa | gqa | vsr | hm |
| ------- | -------- | -------- |-------- | -------- | -------- |


```
torchrun --master-port ${port} --nproc_per_node 1 eval_vqa.py \
 --cfg-path ${cfg_path} --dataset okvqa,vizwiz,iconvqa,gqa,vsr,hm
```