[Ref, Fix] indentation error in answer key selection, longer explanation in demo, exclusion of broken dataset c608f7f Joschka Strueber committed on 4 days ago
[Fix] error in dataset name, error in digit check for str 3dfa66b Joschka Strueber committed on 4 days ago
[Add] add bbh and gpqa benchmarks again with correct answer_index selection 0a42e99 Joschka Strueber committed on 4 days ago
[Ref] apply custom css to heatmap, increase size of images 4077e51 Joschka Strueber committed on 5 days ago
[Ref, Add] custom css for sizing, move demo utility to its own file bd28414 Joschka Strueber committed on 5 days ago
[Add, Ref] Add more info and table on metric, move model list to data/ b90e0d3 Joschka Strueber committed on 5 days ago
[Fix, Debug] wrong default model, check filter_labels 4b2993a Joschka Strueber committed on 5 days ago
[Ref, Fix] use cached list of usable models, convert logits to OneHot for EC as well 64b132e Joschka Strueber committed on 5 days ago
[Ref, Add] change default models, remove sorting in plot 8be99c0 Joschka Strueber committed on 5 days ago
[Add, Fix] add loading mechanism for cached models, change error to warning when computing heatmap 93d753c Joschka Strueber committed on 5 days ago