Jae-Won Chung

New leaderboard prototype

b10121d over 1 year ago

Benchmarking dataset

The benchmark uses 500 prompt/image pairs sampled from the conversation subset of the LLaVA-Instruct-150K dataset.

Obtaining and filtering the dataset

First, download the full conversation dataset with 58k samples:

curl -L https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/resolve/9d451dc7629cfe0469f6ae4432b765cd603d5fcb/conversation_58k.json -o full.json

Next, download the COCO 2017 training set and extract the images:

curl -LO http://images.cocodataset.org/zips/train2017.zip
unzip train2017.zip

Finally, run the script, which (1) samples 500 pairs, (2) reads in the corresponding COCO image for each pair, (3) encodes the images as base64 strings, and (4) merges all pairs into one JSON file. After this step, the original dataset files (LLaVA and COCO) may be deleted.

python sample.py
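
The four steps the script performs can be sketched roughly as follows. This is not the repository's `sample.py`, whose contents are not shown here; it is a minimal illustration under stated assumptions. It assumes each record in `full.json` carries an `image` file name and a `conversations` list whose first turn is the human prompt. The function names, the `prompt`/`image` output keys, and the fixed seed are all hypothetical.

```python
import base64
import json
import random
from pathlib import Path


def sample_pairs(records, image_dir, num_samples=500, seed=0):
    """Sample records and attach each record's image as a base64 string."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the sample reproducible
    sampled = rng.sample(records, num_samples)  # step 1: sample without replacement
    pairs = []
    for record in sampled:
        # Assumed record layout: "image" is a file name inside image_dir and
        # "conversations"[0]["value"] is the human prompt.
        image_bytes = (Path(image_dir) / record["image"]).read_bytes()  # step 2
        pairs.append(
            {
                "prompt": record["conversations"][0]["value"],
                "image": base64.b64encode(image_bytes).decode("ascii"),  # step 3
            }
        )
    return pairs


def write_pairs(pairs, out_path):
    """Step 4: merge all sampled pairs into a single JSON file."""
    Path(out_path).write_text(json.dumps(pairs))
```

With the files from the earlier steps in place, loading `full.json` with `json.loads`, passing the result to `sample_pairs` along with the `train2017` directory, and handing the output to `write_pairs` would produce one self-contained JSON file. Because the images are embedded as base64 text, that file no longer depends on the COCO directory, which is why the original downloads can be deleted afterwards.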