add readme
README.md
<p align="left">
<a href="README_CN.md">中文</a>  |  English
</p>
<br><br>

</p>
</div>

## Introduction

**Llama3-Chinese** is a large language model built on the **Meta-Llama-3-8B** base and trained with [DORA](https://arxiv.org/pdf/2402.09353.pdf) and [LORA+](https://arxiv.org/pdf/2402.12354.pdf) on 500k high-quality Chinese multi-turn SFT samples, 100k English multi-turn SFT samples, and 2k single-turn self-cognition samples.

**Github:** [https://github.com/seanzhang-zhichen/llama3-chinese](https://github.com/seanzhang-zhichen/llama3-chinese)



## Download Model

| Model | Download |
|:-------------------:|:-----------:|
| Meta-Llama-3-8B |[ 🤗 HuggingFace](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [ 🤖 ModelScope](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B)|
| Llama3-Chinese-Lora |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese-Lora) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese-Lora)|
| Llama3-Chinese (merged model) |[ 🤗 HuggingFace](https://huggingface.co/zhichen/Llama3-Chinese) [ 🤖 ModelScope](https://modelscope.cn/models/seanzhang/Llama3-Chinese)|

## Merge LoRA Model (Optional)

1. Download [Meta-Llama-3-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3-8B):

```bash
git clone https://www.modelscope.cn/LLM-Research/Meta-Llama-3-8B.git
```

2. Download [Llama3-Chinese-Lora](https://www.modelscope.cn/models/seanzhang/Llama3-Chinese-Lora):

**From ModelScope**
```bash
git lfs install
git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese-Lora.git
```

**From HuggingFace**
```bash
git lfs install
git clone https://huggingface.co/zhichen/Llama3-Chinese-Lora
```

3. Merge the model:

```bash
python merge_lora.py \
    --output_dir ./Llama3-Chinese
```
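For reference, a minimal sketch of what a LoRA-merge script like `merge_lora.py` typically does, assuming the adapter was trained with the `peft` library. Only `--output_dir` appears in the command above; `--base_model` and `--lora_model` are assumed flag names, not the repository's confirmed interface:

```python
# Hypothetical sketch of a LoRA merge script; flag names other than
# --output_dir are assumptions.
import argparse


def build_parser():
    p = argparse.ArgumentParser(description="Merge a LoRA adapter into its base model")
    p.add_argument("--base_model", default="./Meta-Llama-3-8B")
    p.add_argument("--lora_model", default="./Llama3-Chinese-Lora")
    p.add_argument("--output_dir", default="./Llama3-Chinese")
    return p


def merge(base_model: str, lora_model: str, output_dir: str) -> None:
    # Imports kept local so the parser can be used without torch installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, lora_model)
    merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model).save_pretrained(output_dir)
```

After merging, `output_dir` holds a standalone checkpoint that loads like any ordinary Llama-3 model, with no `peft` dependency at inference time.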

## Download Llama3-Chinese (Merged Model)

**From ModelScope**
```bash
git lfs install
git clone https://www.modelscope.cn/seanzhang/Llama3-Chinese.git
```

**From HuggingFace**
```bash
git lfs install
git clone https://huggingface.co/zhichen/Llama3-Chinese
```

## vLLM Web Demo

1. Deploy the model with [vllm](https://github.com/vllm-project/vllm):

```bash
python -m vllm.entrypoints.openai.api_server --served-model-name Llama3-Chinese --model ./Llama3-Chinese  # replace ./Llama3-Chinese with your own merged model path
```
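The command above exposes an OpenAI-compatible HTTP API (port 8000 by default). A minimal standard-library client sketch; the host, port, and sampling settings are illustrative and can be adjusted:

```python
# Query the OpenAI-compatible endpoint served by vllm above.
import json
from urllib import request


def build_chat_request(prompt, model="Llama3-Chinese"):
    # Payload shape follows the OpenAI chat-completions schema.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def chat(prompt, host="http://localhost:8000"):
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = request.Request(
        host + "/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```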

2. Launch the web demo from the command line:

```bash
python vllm_web_demo.py --model Llama3-Chinese
```

## Train Dataset

[deepctrl-sft-data](https://modelscope.cn/datasets/deepctrl/deepctrl-sft-data)

## LICENSE

This project may only be used for research purposes, and the project developers bear no liability for any harm or loss caused by the use of this project (including but not limited to data, models, and code). For details, please refer to the [DISCLAIMER](https://github.com/seanzhang-zhichen/Llama3-Chinese/blob/main/DISCLAIMER).

The Llama3-Chinese project code is licensed under the [Apache License 2.0](./LICENSE). The code is free for commercial use, but the model weights and data may only be used for research purposes. Please include a link to Llama3-Chinese and the license agreement in your product description.

## Citation

If you use Llama3-Chinese in your research, please cite it in the following format:

```latex
@misc{Llama3-Chinese,
}
```

## Acknowledgement

[meta-llama/llama3](https://github.com/meta-llama/llama3)

[hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)

## Star History

[](https://star-history.com/#seanzhang-zhichen/Llama3-Chinese&Date)