Adamiros

Update README.md

6f96d42 verified about 1 year ago

	---
	title: README
	emoji: 💻
	colorFrom: blue
	colorTo: red
	sdk: static
	pinned: false
	---

	# CIIRC CTU

	Our team at the Czech Institute of Informatics, Robotics and Cybernetics focuses on developing NLP applications that utilize large language models.
	Apart from creating custom solutions for our worldwide partners, we aim to aid the local NLP community by developing Czech-enabled LLMs and complementary evaluation tools.

	## CzechBench
	Because selecting the most capable model for a specific task and language is crucial for optimal performance, we have concentrated our efforts on developing a Czech-focused LLM evaluation suite.
	CzechBench, available on [GitHub](https://github.com/jirkoada/czechbench_eval_harness/tree/main/lm_eval/tasks/czechbench#readme), is a collection of Czech evaluation tasks selected to assess multiple aspects of LLM capabilities.
	The suite now builds on the [Language Model Evaluation Harness framework](https://github.com/EleutherAI/lm-evaluation-harness) to provide an effective environment that most LLM developers are already familiar with.
	The models are evaluated in an end-to-end fashion, using only their final textual outputs. This allows for direct performance comparison across both open-source and proprietary LLM solutions.
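
To illustrate why output-only evaluation makes open-source and proprietary models directly comparable, here is a minimal sketch. It is not CzechBench's actual implementation: the `normalize` and `exact_match_accuracy` helpers, the toy examples, and the stand-in `toy_model` are all hypothetical. The only thing the scorer needs from a model is a function that maps a prompt to text, so a local checkpoint and a hosted API can be scored the same way.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    generate: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Score a model using only its final textual outputs.

    `generate` is any prompt -> text callable (hypothetical interface);
    no logits or model internals are required.
    """
    examples = list(examples)
    if not examples:
        raise ValueError("no examples to evaluate")
    hits = sum(
        normalize(generate(prompt)) == normalize(reference)
        for prompt, reference in examples
    )
    return hits / len(examples)


# Hypothetical toy data, for illustration only.
EXAMPLES = [
    ("Hlavní město České republiky je", "Praha"),
    ("2 + 2 =", "4"),
]


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: answers the first prompt correctly, the second wrongly."""
    answers = {
        "Hlavní město České republiky je": " praha ",
        "2 + 2 =": "5",
    }
    return answers.get(prompt, "")


score = exact_match_accuracy(toy_model, EXAMPLES)
```

Swapping `toy_model` for a wrapper around any local or hosted model leaves the scoring code unchanged, which is the property that output-only evaluation relies on.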

	To start evaluating your own Czech-enabled models, you can follow the instructions on [GitHub](https://github.com/jirkoada/czechbench_eval_harness/tree/main/lm_eval/tasks/czechbench#readme).
	We are currently working on providing an open leaderboard for CzechBench to allow for efficient sharing of evaluation results.