LLMs Evaluation - a JournalistsonHF Collection

updated 5 days ago

Evaluate models on key benchmarks. Thanks @clefourrier and @VictorSanh for the recommendations.

Running on CPU Upgrade

12.4k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running

3.99k

Chatbot Arena Leaderboard

🏆
Running

1.12k

Big Code Models Leaderboard

📈

Submit code models for evaluation on benchmarks
Running on CPU Upgrade

605

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

541

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects
Running on CPU Upgrade

630

TTS Arena

🏆

Vote on the latest TTS models!
Running

253

3D Arena

🏢

Generate a 3D leaderboard by voting
Configuration error

34

Leaderboard

🐠
Running on CPU Upgrade

4.74k

MTEB Leaderboard

🥇

Select and filter benchmarks for text embedding tasks
Running

50

GIFT Eval

🥇

GIFT-Eval: A Benchmark for General Time Series Forecasting
Running on Zero

287

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Running

60

Background Removal Arena

⚡

Vote on background-removed images to rank models