Make the README consistent for the model over iterations
README.md CHANGED
@@ -24,9 +24,7 @@ This model was developed using [Bootstrapping Language Models with DPO Implicit
 - License: MIT
 - Fine-tuned from model: princeton-nlp/Llama-3-Base-8B-SFT-DPO
 
-## AlpacaEval Leaderboard Evaluation Results
-
-The following table shows the AlpacaEval leaderboard evaluation results for this model and related models:
+## [AlpacaEval Leaderboard Evaluation Results](https://tatsu-lab.github.io/alpaca_eval/)
 
 | Model | LC. Win Rate | Win Rate |
 |-------------------------------------------|:------------:|:--------:|
@@ -34,7 +32,8 @@ The following table shows the AlpacaEval leaderboard evaluation results for this
 |[Llama-3-Base-8B-DICE-Iter1](https://huggingface.co/sail/Llama-3-Base-8B-DICE-Iter1) |25.08 |25.77
 |[Llama-3-Base-8B-DICE-Iter2](https://huggingface.co/sail/Llama-3-Base-8B-DICE-Iter2) |**27.55** |**30.99**
 
-
+## Code
+https://github.com/sail-sg/dice
 
 ## Citation
 
@@ -45,6 +44,4 @@ The following table shows the AlpacaEval leaderboard evaluation results for this
   journal={arXiv preprint arXiv:2406.09760},
   year={2024}
 }
-```
-
-Code: https://github.com/sail-sg/dice
+```
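The README changed above points to the `sail/Llama-3-Base-8B-DICE-Iter1` and `sail/Llama-3-Base-8B-DICE-Iter2` checkpoints. As a minimal sketch of how such repos are typically consumed, the snippet below loads a checkpoint and runs a short generation, assuming the repos ship standard Llama-3 causal-LM weights and tokenizer files compatible with Hugging Face `transformers`; the prompt and generation settings are illustrative only, not part of the model card.

```python
# Sketch: load a DICE checkpoint and generate a short completion.
# Assumes standard causal-LM weights/tokenizer usable via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sail/Llama-3-Base-8B-DICE-Iter2"  # or ...-Iter1

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B model; bf16 keeps memory manageable
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain DPO implicit rewards in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Print only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```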