---
title: README
emoji: 💻
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---
Our team at the Czech Institute of Informatics, Robotics and Cybernetics develops NLP applications built on large language models. Since selecting the most capable model for a given task and language is crucial for optimal performance, we have focused our efforts on building a Czech-focused LLM evaluation suite.
CzechBench is a collection of Czech evaluation tasks selected to assess multiple aspects of LLM capabilities. The suite now leverages the Language Model Evaluation Harness, providing improved model compatibility and computational efficiency.
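Since the suite builds on the Language Model Evaluation Harness, evaluations follow the harness's standard CLI workflow. A minimal sketch is shown below; the task and model names are illustrative placeholders, not confirmed CzechBench identifiers — list the actually registered tasks with `lm_eval --tasks list` before running.

```shell
# Install the evaluation harness from PyPI.
pip install lm-eval

# Evaluate a Hugging Face model on a Czech task.
# The task name and model below are illustrative assumptions;
# substitute the CzechBench task names registered in your
# harness installation (see `lm_eval --tasks list`).
lm_eval \
  --model hf \
  --model_args pretrained=mistralai/Mistral-7B-v0.1 \
  --tasks czech_task_placeholder \
  --batch_size 8 \
  --output_path results/
```

The `--output_path` directory collects per-task JSON results, which is convenient for later aggregation or leaderboard submission.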
We are currently working on providing an open leaderboard for CzechBench to allow for easy sharing of evaluation results.