Joschka Strueber committed
Commit 69fd3ae · Parent: 274c92e

[Fix] MathJax in metric explanation

Files changed (1)
  1. app.py +1 -1
app.py CHANGED
@@ -78,7 +78,7 @@ with gr.Blocks(title="LLM Similarity Analyzer", css=app_util.custom_css) as demo
     )

     gr.Markdown("## Information")
-    gr.Markdown("""We propose Chance Adjusted Probabilistic Agreement (\(\operatorname{CAPA}\), or \(\kappa_p\)), a novel metric \
+    gr.Markdown(r"""We propose Chance Adjusted Probabilistic Agreement (\(\operatorname{CAPA}\), or \(\kappa_p\)), a novel metric \
     for model similarity which adjusts for chance agreement due to accuracy. Using CAPA, we find: (1) LLM-as-a-judge scores are \
     biased towards more similar models controlling for the model's capability. (2) Gain from training strong models on annotations \
     of weak supervisors (weak-to-strong generalization) is higher when the two models are more different. (3) Concerningly, model \
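Why the r-prefix fixes this: in a plain triple-quoted string, Python expands backslash escapes before gr.Markdown ever sees the text, so any LaTeX command that starts with a valid escape (e.g. the \t in \text becomes a tab) is mangled, and delimiters like \( additionally trigger invalid-escape warnings on recent Python versions. A raw string keeps every backslash literal. Below is a minimal standalone sketch of the idea, not the repo's actual app.py; the latex_delimiters argument is an assumption about how inline \( ... \) rendering is enabled:

import gradio as gr

# Plain string: Python parses "\t" in "\text" as a tab character,
# mangling the LaTeX command before Gradio/MathJax can render it.
broken = "Inline math: \(\kappa_p = \text{agreement}\)"
print(repr(broken))  # note the literal tab where \text should be

# Raw string: every backslash stays literal, so the markdown component
# receives \(\kappa_p = \text{agreement}\) exactly as written.
fixed = r"Inline math: \(\kappa_p = \text{agreement}\)"

with gr.Blocks() as demo:
    # latex_delimiters opts in to inline \( ... \) math rendering
    # (hypothetical configuration; the real app may rely on defaults).
    gr.Markdown(fixed, latex_delimiters=[{"left": r"\(", "right": r"\)", "display": False}])

if __name__ == "__main__":
    demo.launch()

Raw strings are the standard way to embed LaTeX in Python source without doubling every backslash, which is why the one-character r-prefix is the whole fix here.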