Update README.md
base_model:
- Qwen/Qwen2-72B-Instruct
---

# dfurman/Qwen2-72B-Orpo-v0.1

## Introduction
Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 72B Qwen2 model.
Compared with state-of-the-art open-source language models, including the previously released Qwen1.5, Qwen2 generally surpasses most open-source models and demonstrates competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, reasoning, and more.
Qwen2-72B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2 for handling long texts.
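The upstream Qwen2-72B-Instruct card enables contexts beyond the base window via YaRN rope scaling. A minimal sketch of patching that into the config at load time; the scaling values below are assumptions carried over from the upstream card, and `load_long_context_model` is an illustrative helper, not part of this repo:

```python
# YaRN rope-scaling values assumed from the upstream Qwen2-72B-Instruct card;
# verify against this finetune's config before relying on them.
ROPE_SCALING = {
    "type": "yarn",
    "factor": 4.0,  # 4 x 32,768 base positions ~= 131,072-token context
    "original_max_position_embeddings": 32768,
}

def load_long_context_model(model_id: str = "dfurman/Qwen2-72B-Orpo-v0.1"):
    """Load the model with YaRN scaling patched into its config."""
    # Heavy imports kept local so the constant above is importable on its own.
    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained(model_id)
    config.rope_scaling = ROPE_SCALING
    return AutoModelForCausalLM.from_pretrained(
        model_id, config=config, torch_dtype="auto", device_map="auto"
    )
```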
For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/).
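A minimal generation sketch using the standard transformers chat-template API; the helper names and settings are illustrative assumptions, not part of this repo:

```python
MODEL_ID = "dfurman/Qwen2-72B-Orpo-v0.1"

def build_messages(user_prompt: str) -> list:
    """Single-turn chat in the format apply_chat_template expects."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]

def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    # Heavy imports kept local so build_messages is importable on its own.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    text = tokenizer.apply_chat_template(
        build_messages(user_prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    completion = output_ids[0][inputs.input_ids.shape[-1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)
```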
<br>
## This finetune
Qwen2-72B-Orpo-v0.1 is a QLoRA finetune of `Qwen/Qwen2-72B-Instruct` on 1.5k rows of `mlabonne/orpo-dpo-mix-40k`.
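For context on the dataset choice: ORPO augments the usual supervised fine-tuning loss with an odds-ratio penalty that rewards the model for preferring the chosen completion over the rejected one. A toy numeric sketch of that penalty term; the function names are ours, and the trained model's actual loss lives in the training code, not here:

```python
import math

def odds(p: float) -> float:
    """Odds of a (length-normalized) sequence probability p."""
    return p / (1.0 - p)

def orpo_penalty(p_chosen: float, p_rejected: float) -> float:
    """-log sigmoid of the log odds ratio: small when chosen >> rejected."""
    log_odds_ratio = math.log(odds(p_chosen) / odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
```

When `p_chosen == p_rejected` the penalty is `log 2`, and it shrinks toward zero as the model increasingly prefers the chosen completion.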