Update README.md
README.md (CHANGED)
@@ -28,7 +28,7 @@ We introduce AceCoder, the first work to propose a fully automated pipeline for
 
 
 ## Note
-- **This model
+- **This model is trained on the hard version of [TIGER-Lab/AceCode-89K](https://huggingface.co/datasets/TIGER-Lab/AceCode-89K) with about 22k examples, using the binary pass rate (rule-based reward) as the reward**
 <!-- - **This model official is trained on the hard version of [TIGER-Lab/AceCode-89K](https://huggingface.co/datasets/TIGER-Lab/AceCode-89K) with about 22k examples, using the [TIGER-Lab/AceCodeRM-7B](https://huggingface.co/TIGER-Lab/AceCodeRM-7B) as the reward** -->
 - You can reproduce the hard version of [TIGER-Lab/AceCode-89K](https://huggingface.co/datasets/TIGER-Lab/AceCode-89K) using the [script in our GitHub](#)
 - The training takes about 6 hours on 8 x H100 GPUs, finishing in around 80 optimization steps.
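For readers unfamiliar with the term, the "binary pass rate (rule-based reward)" in the note above means each generated program is scored by executing its test cases: full reward if every test passes, zero otherwise. Below is a minimal sketch of such a reward function, assuming each test case is an executable Python assertion; `binary_pass_rate_reward` and its signature are illustrative, not the repository's actual API.

```python
import os
import subprocess
import sys
import tempfile

def binary_pass_rate_reward(program: str, tests: list[str], timeout: float = 10.0) -> float:
    """Return 1.0 only if `program` passes every assertion-style test, else 0.0."""
    # Run the candidate program and its tests in a fresh interpreter, so
    # crashes or infinite loops in generated code cannot affect the trainer.
    script = program + "\n\n" + "\n".join(tests)
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        # Any failed assertion, uncaught exception, or timeout yields zero reward.
        result = subprocess.run([sys.executable, path], capture_output=True, timeout=timeout)
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0
    finally:
        os.unlink(path)

# Example: a correct solution earns the full binary reward (1.0).
reward = binary_pass_rate_reward(
    "def add(a, b):\n    return a + b",
    ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"],
)
```

Because the reward is binary rather than a fractional pass rate, a program that passes only some of its tests receives no credit, matching the "binary pass rate" wording in the note.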