Add CHANGELOG
- app.py +6 -0
- content.py +24 -0
app.py
CHANGED
@@ -404,6 +404,12 @@ We chose these benchmarks as they test a variety of reasoning and general knowle
 ],
 submission_result,
 )
+
+with gr.Row():
+    changelog = gr.Markdown(CHANGELOG_TEXT)
+
+
+
 
 block.load(
 refresh,
content.py
ADDED
@@ -0,0 +1,24 @@
+CHANGELOG_TEXT = f"""
+# Changelog
+
+## [2023-05-24]
+- Add a baseline (all 25.0).
+
+## [2023-05-23]
+- Fixed a CSS issue that made the leaderboard hard to read in dark mode.
+
+## [2023-05-22]
+- Display a success/error message after submitting evaluation requests.
+- Reject duplicate submission.
+- Do not display results that have incomplete results.
+- Display different queues for jobs that are RUNNING, PENDING, FINISHED status.
+
+## [2023-05-15]
+- Fixed a typo: from "TruthQA" to "TruthfulQA"
+
+## [2023-05-10]
+- Fixed a bug that prevented auto-refresh.
+
+## [2023-05-10]
+- Released the leaderboard to public.
+"""