Update README.md
README.md
CHANGED
@@ -29,7 +29,6 @@ We used a mixture of the following datasets
 
 ### luxia-21.4b-alignment model
 We utilize state-of-the-art instruction fine-tuning methods including direct preference optimization (DPO).
-After DPO training, we linearly merged models to boost performance.
 
 We used a mixture of the following datasets
 - jondurbin/truthy-dpo-v0.1
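
For context, DPO trains the policy to rank chosen responses above rejected ones relative to a frozen reference model. Below is a minimal PyTorch sketch of the standard DPO objective the README refers to; it is an illustration, not the authors' training code, and the log-probability inputs and the `beta` default are assumptions:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over per-example summed log-probs.

    beta=0.1 is a common default, assumed here; the README
    does not state the value actually used.
    """
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) == softplus(-x), numerically stable
    return F.softplus(-logits).mean()
```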
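The removed line mentioned linear model merging. As a generic sketch of what linear weight merging usually means (uniform parameter averaging across checkpoints; the actual checkpoints and mixing weights are not stated in the README):

```python
import torch

def linear_merge(state_dicts, weights=None):
    """Linearly interpolate parameters across checkpoints.

    A generic sketch under assumed uniform weights; the
    README does not specify which models were merged.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float()
                          for w, sd in zip(weights, state_dicts))
    return merged
```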