Ameya Prabhu

AmeyaPrabhu

https://drimpossible.github.io/about

drimpossible

AI & ML interests

None yet

AmeyaPrabhu's activity

New activity in bethgelab/lm-similarity 4 days ago

Add question-answering task category

#2 opened 4 days ago by

nielsr

authored a paper 5 days ago

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published 5 days ago • 24

upvoted a paper 5 days ago

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published 5 days ago • 24

published a dataset 7 days ago

bethgelab/lm-similarity

Viewer • Updated 4 days ago • 12k • 34 • 4

authored a paper about 2 months ago

ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities

Paper • 2412.06745 • Published Dec 9, 2024 • 6

upvoted a paper 2 months ago

MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published Dec 2, 2024 • 40

updated a dataset 4 months ago

bethgelab/paper_parsed_jsons

Updated Oct 23, 2024 • 13

updated a dataset 5 months ago

bethgelab/CiteME

Viewer • Updated Sep 26, 2024 • 130 • 50 • 4

New activity in bethgelab/CiteME 5 months ago

Update README.md

#1 opened 5 months ago by

oripress

authored a paper 6 months ago

Data Contamination Report from the 2024 CONDA Shared Task

Paper • 2407.21530 • Published Jul 31, 2024 • 10

New activity in CONDA-Workshop/Data-Contamination-Database 10 months ago

Added Contamination Evidence on MMLU of ChatGPT/GPT4 from "Investigating data contamination in modern benchmarks for large language models"

#10 opened 10 months ago by

AmeyaPrabhu

Mistral 7B Arc Easy Contamination based on "Proving Test Set Contamination in Black Box Language Models"

#14 opened 10 months ago by

AmeyaPrabhu

Added Contamination Evidence from GPT4 Tech Report using String matching on GPT-4

#11 opened 10 months ago by

AmeyaPrabhu

Added Contamination Info on Old Models: GPT3, FLAN, GLaM, PaLM, PaLM 2

#13 opened 10 months ago by

AmeyaPrabhu

Code contamination in HumanEval and MBPP

#12 opened 10 months ago by

AmeyaPrabhu

authored a paper 10 months ago

Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry

Paper • 2404.06405 • Published Apr 9, 2024 • 2

updated a dataset 10 months ago

bethgelab/simplegeometry

Updated Apr 10, 2024 • 177 • 13

authored 3 papers 10 months ago

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Paper • 2404.04125 • Published Apr 4, 2024 • 28

Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress

Paper • 2402.19472 • Published Feb 29, 2024 • 2

Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?

Paper • 2305.09275 • Published May 16, 2023 • 1