Commit 602f82f (parent: ba61d13): adding model card

README.md
---
model_type: Fine-tuned 7B model for chat.
license: apache-2.0
base_model: openchat/openchat_3.5
demo: [Hugging Face Spaces](https://huggingface.co/spaces/tenyx/TenyxChat-7B-v1)
---

# TenyxChat: Language Model Alignment using Tenyx Fine-tuning

Introducing TenyxChat, a series of ChatGPT-like models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our first chat model in the series, TenyxChat-7B-v1, is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
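For intuition, the per-example DPO objective can be sketched as below. This is a generic, minimal scalar version for illustration (the function name and inputs are ours, not Tenyx's training code): the loss rewards the policy for widening the chosen-vs-rejected log-probability margin relative to a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (Rafailov et al., 2023), scalar sketch.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and under the frozen
    reference model; beta controls how far the policy may drift
    from the reference.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(logits)) written stably as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))

# When policy and reference agree, the margin difference is zero and
# the loss sits at log(2); preferring the chosen response lowers it.
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```

In practice the log-probabilities come from full sequence scores of a causal LM, and the loss is averaged over a batch of preference pairs.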

We fine-tune [Openchat-3.5](https://arxiv.org/pdf/2309.11235.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)), which yields an improvement on [MT-Bench](https://arxiv.org/abs/2306.05685) without a drop in the model's performance on other benchmarks. Our approach mitigates forgetting in LLMs in a computationally efficient manner, enabling continual fine-tuning without altering the pre-trained output distribution. TenyxChat-7B-v1 was trained on eight A100 (80GB) GPUs for two hours, using a training setup adapted from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).

## Usage
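The usage details are not included in this commit. As a minimal sketch, and assuming TenyxChat-7B-v1 keeps the "GPT4 Correct User" turn format of its OpenChat-3.5 base (an assumption, not confirmed by this commit), a prompt string can be built like this:

```python
END_OF_TURN = "<|end_of_turn|>"

def build_prompt(turns):
    """Render (role, message) turns in the OpenChat-style format that
    TenyxChat-7B-v1 is assumed to inherit from its OpenChat-3.5 base.
    Roles are "user" or "assistant"; assumption for illustration only.
    """
    rendered = ""
    for role, message in turns:
        speaker = "GPT4 Correct User" if role == "user" else "GPT4 Correct Assistant"
        rendered += f"{speaker}: {message}{END_OF_TURN}"
    # A trailing assistant tag asks the model to generate the reply.
    return rendered + "GPT4 Correct Assistant:"

prompt = build_prompt([("user", "Hi. I would like to make a hotel booking.")])
```

In practice, `tokenizer.apply_chat_template` from the `transformers` library renders this format automatically from the tokenizer's bundled chat template, so manual prompt assembly is rarely needed.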