Great Models Think Alike and this Undermines AI Oversight Paper • 2502.04313 • Published 5 days ago • 24
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities Paper • 2412.06745 • Published Dec 9, 2024 • 6
Data Contamination Report from the 2024 CONDA Shared Task Paper • 2407.21530 • Published Jul 31, 2024 • 10
Image retrieval outperforms diffusion models on data augmentation Paper • 2304.10253 • Published Apr 20, 2023
Most discriminative stimuli for functional cell type clustering Paper • 2401.05342 • Published Nov 29, 2023
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models Paper • 2310.08577 • Published Oct 12, 2023 • 1
Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry Paper • 2404.06405 • Published Apr 9, 2024 • 2