Implemented L3Score from SPIQA dataset paper
Browse files- L3Score.py +1 -3
- README.md +1 -1
L3Score.py
CHANGED
|
@@ -120,8 +120,6 @@ class L3Score(evaluate.Metric):
|
|
| 120 |
)
|
| 121 |
)
|
| 122 |
|
| 123 |
-
if api_key == "":
|
| 124 |
-
raise ValueError("api_key is required")
|
| 125 |
|
| 126 |
def _get_llm(self, model, api_key):
|
| 127 |
"""Get the LLM"""
|
|
@@ -134,7 +132,7 @@ class L3Score(evaluate.Metric):
|
|
| 134 |
questions,
|
| 135 |
predictions,
|
| 136 |
references,
|
| 137 |
-
api_key
|
| 138 |
provider="openai",
|
| 139 |
model="gpt-4o-mini",
|
| 140 |
):
|
|
|
|
| 120 |
)
|
| 121 |
)
|
| 122 |
|
|
|
|
|
|
|
| 123 |
|
| 124 |
def _get_llm(self, model, api_key):
|
| 125 |
"""Get the LLM"""
|
|
|
|
| 132 |
questions,
|
| 133 |
predictions,
|
| 134 |
references,
|
| 135 |
+
api_key,
|
| 136 |
provider="openai",
|
| 137 |
model="gpt-4o-mini",
|
| 138 |
):
|
README.md
CHANGED
|
@@ -13,7 +13,7 @@ description: >
|
|
| 13 |
It uses log-probabilities of "Yes"/"No" tokens from a language model acting as a judge.
|
| 14 |
Based on the SPIQA benchmark: https://arxiv.org/pdf/2407.09413
|
| 15 |
sdk: gradio
|
| 16 |
-
sdk_version:
|
| 17 |
app_file: app.py
|
| 18 |
pinned: false
|
| 19 |
---
|
|
|
|
| 13 |
It uses log-probabilities of "Yes"/"No" tokens from a language model acting as a judge.
|
| 14 |
Based on the SPIQA benchmark: https://arxiv.org/pdf/2407.09413
|
| 15 |
sdk: gradio
|
| 16 |
+
sdk_version: 4.44.1
|
| 17 |
app_file: app.py
|
| 18 |
pinned: false
|
| 19 |
---
|