galtimur committed
Commit 96a7ff5 · verified
1 Parent(s): 4b54f71

Update src/tasks_content.py

  1. src/tasks_content.py +3 -1
src/tasks_content.py CHANGED
@@ -27,7 +27,9 @@ TASKS_DESCRIPTIONS = {
  Our CI builds repair benchmark 🤗 [JetBrains-Research/lca-ci-builds-repair](https://huggingface.co/datasets/JetBrains-Research/lca-ci-builds-repair)
  includes 77 manually curated and assessed data points coming from 32 Python repositories, which are used to make a model fix a failed build.
  
- We use the `Pass@1` metric for CI repair.
+ The benchmark clones the repo to the local folder. The baseline model fixes the issue according to logs and the local repo state,
+ and then the benchmark pushes the repo to GitHub and requests the result of the GitHub CI.
+ We use the `Pass@1` rate metric for CI repair.
  Models can be evaluated in three types of tasks:
  * `full` – **no** ground truth diffs are used for model evaluation;
  * `oracle: files` – ground truth diffs are used to select files that should be corrected to fix the issue;
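
The added description says each fix is pushed and judged by the result of the GitHub CI, and that the score is a `Pass@1` rate. The benchmark's actual scoring code is not part of this commit, so the following is only a minimal sketch. It assumes one repair attempt per data point, in which case the rate reduces to the fraction of data points whose CI run succeeded. The `RepairAttempt` class, its fields, and the sample IDs are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepairAttempt:
    """Outcome of a single repair attempt for one benchmark data point."""
    datapoint_id: str  # hypothetical identifier, not a field of the real dataset
    ci_passed: bool    # True if the CI run on the pushed fix succeeded


def pass_at_1(attempts: list[RepairAttempt]) -> float:
    """Return the fraction of data points whose single attempt passed CI."""
    if not attempts:
        raise ValueError("no attempts to score")
    return sum(a.ci_passed for a in attempts) / len(attempts)


# Illustrative data only: two of the four attempts pass CI.
attempts = [
    RepairAttempt("dp-1", True),
    RepairAttempt("dp-2", False),
    RepairAttempt("dp-3", True),
    RepairAttempt("dp-4", False),
]
print(f"Pass@1 = {pass_at_1(attempts):.2f}")
```

Raising on an empty list rather than returning 0 is a design choice in this sketch: it keeps a run that produced no results from being mistaken for a run where every fix failed.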