Jae-Won Chung

ShareGPT benchmarking dataset

Download cleaned ShareGPT dataset

https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

Construct benchmarking dataset

Filter out conversations whose prompts or responses are too long and conversations not started by "human", then extract the first turn of each remaining conversation and randomly sample 500 prompts.

pip install transformers
python filter_dataset.py
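
For orientation, the filtering steps described above can be sketched roughly as follows. This is not the contents of filter_dataset.py. It is a minimal illustration that assumes each record has a "conversations" list of {"from", "value"} turns. The function name, the length thresholds, the seed, and the whitespace-based token counter are all hypothetical placeholders, and the sketch checks only the first prompt and first response for length.

```python
import random
from typing import Callable


def build_benchmark_prompts(
    conversations: list[dict],
    # Placeholder token counter (whitespace split); an assumption for this sketch.
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
    max_prompt_tokens: int = 1024,    # assumed threshold, not from the source
    max_response_tokens: int = 1024,  # assumed threshold, not from the source
    num_samples: int = 500,
    seed: int = 0,                    # assumed seed, for reproducible sampling
) -> list[str]:
    """Return up to `num_samples` first-turn human prompts."""
    prompts = []
    for conv in conversations:
        turns = conv.get("conversations", [])
        # A usable conversation needs at least a prompt and a response.
        if len(turns) < 2:
            continue
        first, second = turns[0], turns[1]
        # Drop conversations not started by "human".
        if first.get("from") != "human":
            continue
        prompt = first.get("value", "")
        response = second.get("value", "")
        # Drop conversations whose first prompt or first response is too long.
        if count_tokens(prompt) > max_prompt_tokens:
            continue
        if count_tokens(response) > max_response_tokens:
            continue
        # Keep only the first turn's prompt.
        prompts.append(prompt)
    # Randomly sample without replacement, capped at the number available.
    rng = random.Random(seed)
    return rng.sample(prompts, min(num_samples, len(prompts)))
```

To count real tokens instead of whitespace-separated words, one option is to pass something like `lambda t: len(tokenizer(t).input_ids)` as `count_tokens`, where `tokenizer` comes from `transformers.AutoTokenizer.from_pretrained(...)`. Which tokenizer the actual script uses is not stated here.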