Commit History

Adding GLM, Qwen3 and some old models
f505d13

grg committed on

Evaluation with CoT
2a3cc01

grg committed on

Adding reasoning models and open generation for non-reasoning models
74922c5

grg committed on

New models
63e90ba

grg committed on

Adding data from Phi-4 and Falcon3
a388440

grg committed on