Leaderboards 🔥 - a sugatoray Collection

Leaderboards 🔥

updated 8 days ago

A collection of Leaderboards for LLMs ⚡️⚖️ 🤗

Running

3.99k

Chatbot Arena Leaderboard

🏆
Running on CPU Upgrade

12.4k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running

185

Yet Another LLM Leaderboard

🌖

Run a Streamlit web app
Running on CPU Upgrade

130

Hallucinations Leaderboard

🔥

View and submit LLM evaluations
Running

419

LLM-Perf Leaderboard

🏆

Explore hardware performance for language models
Running on CPU Upgrade

88

LLM Safety Leaderboard

🥇

View and submit machine learning model evaluations
Running

222

AI2 WildBench Leaderboard (V2)

🦁

Display and explore model leaderboards and chat history
Runtime error

30

Contextual Leaderboard

🐨
Running on CPU Upgrade

4.74k

MTEB Leaderboard

🥇

Select and filter benchmarks for text embedding tasks
Running on CPU Upgrade

50

Open CoT Leaderboard

🥇

Track, rank and evaluate open LLMs' CoT quality
Running

276

LLM Performance Leaderboard

🐨

View LLM Performance Leaderboard
Running

181

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data
Running

57

The timm Leaderboard

🏆

Display and analyze PyTorch Image Models leaderboard
Running

60

Open FinLLM Leaderboard

🥇

Browse and submit large language model evaluations
Running

99

Open VLM Video Leaderboard

🌎

VLMEvalKit Eval Results in video understanding benchmark
Running

38

MEGA-Bench Leaderboard

🥇

A leaderboard for multimodal models
Running on CPU Upgrade

84

Open LLM Leaderboard Model Comparator

🏆

Compare Open LLM Leaderboard results
Running

108

Vidore Leaderboard

🥇

Display Visual Document Retrieval leaderboard
Running

88

Judge Arena

💻

Compare AI models by voting on responses
Running on CPU Upgrade

605

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running on TPU v5e

8

Keras Chatbot Battle

💬

Interact with multiple chatbots simultaneously
Running

4

OmniEval

🥇
Running

5

OmniEval

🥇

Official Leaderboard for OmniEval
open-llm-leaderboard/contents

Viewer • Updated 1 day ago • 3.83k • 18.5k • 14
Running on CPU Upgrade

62

LeaderboardExplorer

🔎

Filter and display leaderboards based on selected criteria
Running on CPU Upgrade

262

GAIA Leaderboard

🦾

Submit and evaluate models on a leaderboard
m-ric/agents_small_benchmark

Viewer • Updated Jan 19, 2024 • 100 • 78 • 10
Running on Zero

287

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Running

92

MTEB Arena

⚔

Teach, test, evaluate language models with MTEB Arena
Running on Zero

267

GenAI Arena

📈

Realtime Image/Video Gen AI Arena