Hugging Face
Spaces: microsoft/MageBench-Leaderboard
2 contributors
History: 39 commits
Latest commit (daiqi, 5b280bc, verified, 4 months ago): Add {'Score': '867', 'Name': 'gbhd', 'BaseModel': 'bdfb', 'Env': 'Sokoban', 'Target-research': 'Model-Eval-Online', 'Subset': 'mini', 'Link': 'fdns', 'State': 'Checking'} to checking queue
Name | Size | Scan status | Last commit | Updated
src | - | - | Update src/envs.py | 4 months ago
.gitattributes | 1.58 kB | Safe | Upload demo.mp4 | 4 months ago
.gitignore | 136 Bytes | Safe | Duplicate from demo-leaderboard-backend/leaderboard | 4 months ago
.pre-commit-config.yaml | 1.53 kB | Safe | Duplicate from demo-leaderboard-backend/leaderboard | 4 months ago
Makefile | 208 Bytes | Safe | Duplicate from demo-leaderboard-backend/leaderboard | 4 months ago
README.md | 1.44 kB | Safe | initial commit | 4 months ago
app.py | 12.2 kB | Safe | Update app.py | 4 months ago
commit_results.jsonl | 32.7 kB | Safe | Upload commit_results.jsonl | 4 months ago
demo.mp4 | 335 MB (LFS) | Safe | Upload demo.mp4 | 4 months ago
pyproject.toml | 548 Bytes | Safe | Duplicate from demo-leaderboard-backend/leaderboard | 4 months ago
requirements.txt | 214 Bytes | Safe | Update requirements.txt | 4 months ago
test-output.json | 166 Bytes | Safe | Add {'Score': '867', 'Name': 'gbhd', 'BaseModel': 'bdfb', 'Env': 'Sokoban', 'Target-research': 'Model-Eval-Online', 'Subset': 'mini', 'Link': 'fdns', 'State': 'Checking'} to checking queue | 4 months ago