mqliu committed · Commit 6d700c5 · verified · 1 Parent(s): ce1a7c1

Upload README.md with huggingface_hub

  1. README.md +77 -0
README.md ADDED
@@ -0,0 +1,77 @@
+ # InterleavedBench
+
+ This is the official Hugging Face repo for the paper "Holistic Evaluation for Interleaved Text-and-Image Generation". This is a research preview. More details, including the baseline models' predictions, will be released in the coming weeks.
+
+
+ ## How to use InterleavedBench
+
+ ### Repo hierarchy
+
+ - `interleaved_bench.json` is the main JSON file of the dataset.
+ - `zipped_images` is the directory of zipped images for each subset, including the context images and the ground-truth images.
+ - `src/interleavedeval_gpt4o.py` is the Python script for running InterleavedEval with GPT-4o. It takes the model prediction file as input.
+
+ ### To get started
+
+ - Unzip the image files under `zipped_images`.
+ - Run inference on `interleaved_bench.json` with your model to obtain its output (both text and images).
+ - Use the script `src/interleavedeval_gpt4o.py` to perform the evaluation.
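The first two steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of the official tooling: it assumes the archives under `zipped_images` are ordinary `.zip` files and that `interleaved_bench.json` holds a top-level JSON list of examples. The helper names (`extract_archives`, `load_benchmark`, `count_by_task`) and the output directory are hypothetical.

```python
import json
import zipfile
from pathlib import Path


def extract_archives(zip_dir, out_dir):
    """Extract every .zip archive found in zip_dir into out_dir.

    Returns the names of the archives that were extracted, in sorted order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out)
        extracted.append(archive.name)
    return extracted


def load_benchmark(json_path):
    """Load the benchmark file (assumed to be a JSON list of examples)."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON list of examples")
    return data


def count_by_task(examples):
    """Count examples per `task_name`, a field shown in the sample entry."""
    counts = {}
    for example in examples:
        name = example.get("task_name", "unknown")
        counts[name] = counts.get(name, 0) + 1
    return counts
```

A typical call would be `extract_archives("zipped_images", "images")` followed by `count_by_task(load_benchmark("interleaved_bench.json"))`, where `images` is an arbitrary output directory chosen for illustration.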
+
+ One example in `interleaved_bench.json` is as follows:
+
+ ```
+ {
+   "id": "wikihow_next_step_0_489157",
+   "image": [
+     "wiki_images_test/489157_0_0.png",
+     "wiki_images_test/489157_0_1.png",
+     "wiki_images_test/489157_0_2.png",
+     "wiki_images_test/489157_0_3.png",
+     "wiki_images_test/489157_0_4.png"
+   ],
+   "task_name": "wikihow_next_step",
+   "conversations": [
+     {
+       "from": "human",
+       "value": "In this task, you are given a high-level goal 'How to Make a Banana Shake': Banana shakes are a tasty way to get a lot of nutrients all at once. Bananas provide a creamy, smooth texture when turned into a drink. Bananas also fill empty stomachs, staving off hunger pangs and giving you a nice energy burst. In this article you'll find a few ways to make banana shakes, among the many possibilities. \n You need to assist human user to complete this task via making a banana shake with kefir. Given the previous steps, you need to predict the subsequent 4 steps to help the user to finish the task. The previous steps are: \n <BEGIN> Put 2 to 3 bananas in a bowl. <image>\n"
+     },
+     {
+       "from": "gpt",
+       "value": "Now put in a liter of kefir and a teaspoon of sugar. <image>\n Put 1 cup milk into the mix. <image>\n Using a blender, blend all ingredients together. <image>\n Relax with your fresh banana smoothie! <image>\n"
+     }
+   ],
+   "goal": "How to Make a Banana Shake",
+   "category": [
+     "Food and Entertaining",
+     "Drinks",
+     "Smoothies Shakes and Milk",
+     "Fruit Based Shakes"
+   ],
+   "dataset_id": "wikihow_selected_test_uni"
+ }
+ ```
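In the example above, the conversation text contains five `<image>` placeholders and the `image` list holds five paths. One plausible reading, which is an assumption here and not something the README states, is that placeholders are filled from the `image` list in order across turns. The sketch below pairs them up on that basis; the function name `interleave` is hypothetical.

```python
import re

IMAGE_TOKEN = "<image>"


def interleave(example):
    """Pair each <image> placeholder with a path from example["image"].

    Assumption: placeholders consume the image list in order, across all
    conversation turns. Returns one dict per turn, whose "items" list holds
    ("text", str) and ("image", path) tuples in their original order.
    """
    paths = iter(example["image"])
    turns = []
    for turn in example["conversations"]:
        items = []
        # The capturing group keeps the placeholders in the split result.
        for piece in re.split(f"({re.escape(IMAGE_TOKEN)})", turn["value"]):
            if piece == IMAGE_TOKEN:
                path = next(paths, None)
                if path is None:
                    raise ValueError("more <image> placeholders than image paths")
                items.append(("image", path))
            elif piece.strip():
                items.append(("text", piece.strip()))
        turns.append({"from": turn["from"], "items": items})
    return turns
```

Under this reading, the `human` turn of the example would receive the first image path as context and the `gpt` turn the remaining four as ground truth.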
+
+ ### Reference
+
+ If you find our work useful or interesting, please cite:
+ ```
+ @article{liu_holistic_2024,
+   author     = {Minqian Liu and
+                 Zhiyang Xu and
+                 Zihao Lin and
+                 Trevor Ashby and
+                 Joy Rimchala and
+                 Jiaxin Zhang and
+                 Lifu Huang},
+   title      = {Holistic Evaluation for Interleaved Text-and-Image Generation},
+   journal    = {CoRR},
+   volume     = {abs/2406.14643},
+   year       = {2024},
+   url        = {https://doi.org/10.48550/arXiv.2406.14643},
+   doi        = {10.48550/ARXIV.2406.14643},
+   eprinttype = {arXiv},
+   eprint     = {2406.14643},
+   timestamp  = {Tue, 16 Jul 2024 16:17:50 +0200}
+ }
+ ```