OpenCompass

community

https://opencompass.org.cn/

Recent Activity

KennyUTC authored a paper 2 days ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

vanilla1116 updated a dataset 3 days ago

opencompass/anah

vanilla1116 new activity 3 days ago

opencompass/anah:Update dataset card, link to paper, add category

opencompass's activity

vansin

posted an update about 18 hours ago

🔥MedAgentBench Amazing Work🚀

Just explored #MedAgentBench from @Yale researchers and it's mind-blowing! They've created a cutting-edge benchmark that finally exposes the true capabilities of LLMs in complex medical reasoning.

⚡ Key discoveries:

DeepSeek R1 & OpenAI O3 dominate clinical reasoning tasks
Agent-based frameworks deliver exceptional performance-cost balance
Open-source alternatives are closing the gap at a fraction of the cost

This work shatters previous benchmarks that failed to challenge today's advanced models.
The future of medical AI is here: https://github.com/gersteinlab/medagents-benchmark
#MedicalAI #MachineLearning #AIinHealthcare 🔥

KennyUTC

authored a paper 2 days ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published 3 days ago • 29

vanilla1116

updated a dataset 3 days ago

opencompass/anah

Viewer • Updated 3 days ago • 783 • 187 • 3

vanilla1116

in opencompass/anah 3 days ago

Update dataset card, link to paper, add category

#2 opened 8 days ago by

vanilla1116

in opencompass/anah-7b 8 days ago

Add missing metadata and clarify license

#1 opened 8 days ago by

vanilla1116

in opencompass/anah-20b 8 days ago

Add missing metadata: `pipeline_tag`, `library_name`, and `license`

#1 opened 8 days ago by

vanilla1116

in opencompass/anah-v2 8 days ago

Improve model card with library_name and pipeline_tag

#1 opened 8 days ago by

ZwwWayne

authored a paper 11 days ago

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

Paper • 2503.02846 • Published 12 days ago • 18

vanilla1116

authored a paper 11 days ago

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

Paper • 2503.02846 • Published 12 days ago • 18

KennyUTC

authored a paper 12 days ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 13 days ago • 66

dongsheng

updated a Space 13 days ago

BigCodeBench Evaluator

dongsheng

published a Space 13 days ago

BigCodeBench Evaluator

jnanliu

updated a dataset 18 days ago

opencompass/LiveMathBench

Viewer • Updated 18 days ago • 283 • 1.1k • 4

KennyUTC

authored a paper 18 days ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published 19 days ago • 69

nebulae09

authored a paper 18 days ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published 19 days ago • 69

KennyUTC

updated a Space 19 days ago

Open LMM Reasoning Leaderboard

A Leaderboard that demonstrates LMM reasoning capabilities

Shz

updated a dataset 19 days ago

opencompass/AIME2025

Viewer • Updated 19 days ago • 30 • 2.62k • 10

vanilla1116

authored a paper 26 days ago

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 60

dongsheng

updated a Space 27 days ago

Compass Academic Leaderboard

ZwwWayne

authored a paper about 1 month ago

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

Paper • 2303.13510 • Published Mar 23, 2023 • 1