Commit History

[Fix] error in label filtering
2d8352e

Joschka Strueber committed on

[Fix] error in filter_responses call
7c4f6b6

Joschka Strueber committed on

[Fix] import error
001064a

Joschka Strueber committed on

[Add] add bbh and gpqa benchmarks again with correct answer_index selection
0a42e99

Joschka Strueber committed on

[Ref] apply custom css to heatmap, increase size of images
4077e51

Joschka Strueber committed on

[Ref, Add] custom css for sizing, move demo utility to its own file
bd28414

Joschka Strueber committed on

[Ref] change table size
1b549fb

Joschka Strueber committed on

[Add, Ref] Add more info and table on metric, move model list to data/
b90e0d3

Joschka Strueber committed on

[Fix] removal of not working benchmarks
c24946e

Joschka Strueber committed on

[Fix] default heatmap
26c0eec

Joschka Strueber committed on

[Fix] error in default heatmap
c4145ee

Joschka Strueber committed on

[Add] ignore datasets that are not functional atm
bf2618d

Joschka Strueber committed on

[Add] default heatmap
45b2347

Joschka Strueber committed on

[Fix] key error for binary datasets
9e1c5ed

Joschka Strueber committed on

[Fix, Debug] wrong default model, check filter_labels
4b2993a

Joschka Strueber committed on

[Fix] error in deleting not-matching gt values
75132dc

Joschka Strueber committed on

[Ref] dataset selection
b1f98e1

Joschka Strueber committed on

[Fix] error in dataset default selection
58da8de

Joschka Strueber committed on

[Add, Fix] add list of ungated models
5c5dc6a

Joschka Strueber committed on

[Ref, Fix] use cached list of usable models, convert logits to OneHot for EC as well
64b132e

Joschka Strueber committed on

[Fix] typo in default model name
cb7e104

Joschka Strueber committed on

[Debug] EC error
1f20712

Joschka Strueber committed on

[Ref, Add] change default models, remove sorting in plot
8be99c0

Joschka Strueber committed on

[Add] only load cached models
ec5f717

Joschka Strueber committed on

[Fix] wrong API calls
9f3c166

Joschka Strueber committed on

[Ref] check number of saved and loaded models
e604b65

Joschka Strueber committed on

[Ref] check number of saved models
715aed5

Joschka Strueber committed on

[Fix] add check if cached files have been saved
81438ca

Joschka Strueber committed on

[Fix, Add] check for #api calls, bug in warning
047f32f

Joschka Strueber committed on

[Add, Fix] add loading mechanism for cached models, change error to warning when computing heatmap
93d753c

Joschka Strueber committed on

[Add] saving unblocked models as file to read from
1e010df

Joschka Strueber committed on

[Fix, Add] fix bug with metric names
d2471f2

Joschka Strueber committed on

[Fix] catch all errors from API access
1072829

Joschka Strueber committed on

[Add] cache loading data from hf
e64ca4e

Joschka Strueber committed on

[Add] list of default models
5815cf9

Joschka Strueber committed on

[Add, Fix] change to CAPA, fix error in dataloading
ce6be70

Joschka Strueber committed on

[Add] filter gated models
5d4059c

Joschka Strueber committed on

[Add] links to project information
238bffb

Joschka Strueber committed on

[Add, Fix] better warnings for missing models, better description
35404bc

Joschka Strueber committed on

[Fix] wrong import
3eeaa4c

Joschka Strueber committed on

[Add] error messages
b776365

Joschka Strueber committed on

[Add] error messages
75b9622

Joschka Strueber committed on

[Fix] convert logits to softmax for kappa_p
00b5438

Joschka Strueber committed on

[Ref] change name of the space
8851661

Joschka Strueber committed on

[Ref] change name of the space
8c73413

Joschka Strueber committed on

[Add] emoji
57b85c4

Joschka Strueber committed on

[Ref] remove duplicate login
d97ea57

Joschka Strueber committed on

[Fix] sim check for gts values
65ef274

Joschka Strueber committed on

[Fix] check version, fix dataset update
ea91c80

Joschka Strueber committed on

[Fix] import error
8fd0ade

Joschka Strueber committed on