Upload index.html
index.html (+2 -2)
@@ -140,9 +140,9 @@
 Similar to the <a href="https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard" target="_blank">🤗 Open LLM Leaderboard</a>,
 we selected two common benchmarks for evaluating Code LLMs on multiple programming languages:</p> -->
 <ul>
-  <li><a href="https://huggingface.co/datasets/openai_humaneval" target="_blank">HumanEval</a>
+  <li><a href="https://huggingface.co/datasets/openai_humaneval" target="_blank">HumanEval</a>: Used to measure the functional correctness of programs generated from docstrings. It includes 164 Python programming problems.
   </li>
-  <li><a href="https://github.com/YihongDong/CodeGenEvaluation" target="_blank">HumanEval-ET</a>
+  <li><a href="https://github.com/YihongDong/CodeGenEvaluation" target="_blank">HumanEval-ET</a>: The extended version of the HumanEval benchmark, where each task includes more than 100 test cases.
   </li>
 </ul>
 <p>
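For context on the benchmark the new description points to, here is a minimal sketch of how the HumanEval tasks can be inspected, assuming the Hugging Face `datasets` library is installed (field names follow the openai_humaneval dataset card):

# Minimal sketch: load the openai_humaneval dataset linked above and
# look at one of its 164 Python problems.
from datasets import load_dataset

humaneval = load_dataset("openai_humaneval", split="test")
print(len(humaneval))            # 164 problems
task = humaneval[0]
print(task["task_id"])           # e.g. "HumanEval/0"
print(task["entry_point"])       # name of the function to implement
print(task["prompt"])            # function signature plus docstring
print(task["test"])              # unit tests used to check functional correctness

Each problem is graded by running the generated completion against the provided tests, which is what "functional correctness" means in the description above; HumanEval-ET keeps the same problems but attaches a larger test suite (more than 100 cases) per task.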