---
title: Evaluations on the MMLU-Pro (2024) Dataset
emoji: 🦀
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.16.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Evaluates various models on the MMLU-Pro Dataset.
---

This Space replicates the evaluation of various models on the MMLU-Pro dataset.

- Dataset: https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro
- GitHub: https://github.com/TIGER-AI-Lab/MMLU-Pro
- Paper: https://arxiv.org/abs/2406.01574 (submitted to NeurIPS 2024)
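As a rough illustration of what a multiple-choice evaluation like this involves, the sketch below formats a question, pulls a letter choice out of a model response, and computes accuracy. It is not taken from this Space's `app.py`. The helper names (`format_prompt`, `extract_choice`, `accuracy`), the prompt layout, the "answer is (X)" extraction pattern, and the two sample records are all assumptions made for illustration. The record keys `question`, `options`, and `answer` are assumed to mirror the MMLU-Pro dataset fields.

```python
import re
from string import ascii_uppercase


def format_prompt(question, options):
    """Render one question with lettered options (hypothetical layout)."""
    letters = ascii_uppercase[:len(options)]
    lines = [f"Question: {question}", "Options:"]
    lines += [f"{letter}. {text}" for letter, text in zip(letters, options)]
    lines.append("Answer:")
    return "\n".join(lines)


def extract_choice(response, num_options):
    """Return the letter following 'answer is', or None if nothing matches."""
    letters = ascii_uppercase[:num_options]
    match = re.search(rf"answer is \(?([{letters}])\)?", response)
    return match.group(1) if match else None


def accuracy(records, responses):
    """Fraction of responses whose extracted letter equals the gold answer."""
    correct = sum(
        extract_choice(resp, len(rec["options"])) == rec["answer"]
        for rec, resp in zip(records, responses)
    )
    return correct / len(records)


# Made-up records and responses, used only to exercise the helpers.
records = [
    {"question": "What is 2 + 2?", "options": ["3", "4", "5"], "answer": "B"},
    {
        "question": "Which planet is closest to the Sun?",
        "options": ["Venus", "Mercury", "Mars", "Earth"],
        "answer": "B",
    },
]
responses = ["The answer is (B).", "I think the answer is (A)."]

score = accuracy(records, responses)
print(f"Accuracy: {score:.2f}")
```

In this sketch, an unparseable response counts as incorrect because `extract_choice` returns `None`. A real harness might instead retry or fall back to a different extraction rule.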

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference