---
title: README
emoji: π
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---
Online Demo | GitHub | Paper | Discord
OpenChat is dedicated to advancing and releasing open-source language models, fine-tuned with our C-RLFT technique, which is inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels and deliver performance on par with ChatGPT, even with a 7B model that can run on a consumer GPU (e.g. an RTX 3090).
## 📰 News
- [2024/03/15] Nexusflow released Starling-Beta, an RLHF tune of openchat-3.5-1106 and currently the highest-ranking open-source LLM on LMSys Arena not originating from a company, beating all others at only 7B.
- [2024/03/08] Released OpenChat-3.5-0106-Gemma, the highest-performing Gemma fine-tune.
- [2024/01/07] Released OpenChat-3.5-0106, trained with a new data pipeline - the strongest 7B LLM in the world.
  - Ranked as the top 7B LLM on LMSys Arena.
  - Ranked on LMSys Arena as the top open-source LLM not originating from a company.
- [2023/12/10] Released OpenChat-3.5-1210, with a 15-point improvement in coding.
- [2023/11/01] Released OpenChat-3.5-7B, surpassing ChatGPT on various benchmarks 🔥.
- [2023/09/21] Released our paper *OpenChat: Advancing Open-source Language Models with Mixed-Quality Data*.
## Benchmarks
Model | # Params | Average | MT-Bench | HumanEval | BBH MC | AGIEval | TruthfulQA | MMLU | GSM8K | BBH CoT |
---|---|---|---|---|---|---|---|---|---|---|
OpenChat-3.5-0106 | 7B | 64.5 | 7.8 | 71.3 | 51.5 | 49.1 | 61.0 | 65.8 | 77.4 | 62.2 |
OpenChat-3.5-0106-Gemma | 7B | 64.4 | 7.83 | 67.7 | 52.7 | 50.2 | 55.4 | 65.7 | 81.5 | 63.7 |
OpenChat-3.5-1210 | 7B | 63.8 | 7.76 | 68.9 | 49.5 | 48.0 | 61.8 | 65.3 | 77.3 | 61.8 |
OpenChat-3.5 | 7B | 61.6 | 7.81 | 55.5 | 47.6 | 47.4 | 59.1 | 64.3 | 77.3 | 63.5 |
ChatGPT (March)* | ? | 61.5 | 7.94 | 48.1 | 47.6 | 47.1 | 57.7 | 67.3 | 74.9 | 70.1 |
OpenHermes 2.5 | 7B | 59.3 | 7.54 | 48.2 | 49.4 | 46.5 | 57.5 | 63.8 | 73.5 | 59.9 |
OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
Zephyr-β^ | 7B | 34.6 | 7.34 | 22.0 | 40.6 | 39.0 | 40.8 | 39.8 | 5.1 | 16.0 |
Mistral | 7B | - | 6.84 | 30.5 | 39.0 | 38.0 | - | 60.1 | 52.2 | - |
## Comparison with X.AI Grok
Model | License | # Params | Average | MMLU | HumanEval | MATH | GSM8K |
---|---|---|---|---|---|---|---|
OpenChat-3.5-0106 | Apache-2.0 | 7B | 61.0 | 65.8 | 71.3 | 29.3 | 77.4 |
OpenChat-3.5-1210 | Apache-2.0 | 7B | 60.1 | 65.3 | 68.9 | 28.9 | 77.3 |
OpenChat-3.5 | Apache-2.0 | 7B | 56.4 | 64.3 | 55.5 | 28.6 | 77.3 |
Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
Grok-1 | Proprietary | ???B | 55.8 | 73.0 | 63.2 | 23.9 | 62.9 |
## Contact
Project Leads:
- Guan Wang [imonenext at gmail dot com]
- Alpay Ariyak [aariyak at wpi dot edu]
Sponsors:
- Sen Song (Tsinghua University)
- Yang Liu (Tsinghua University)
- 01.AI Company
- RunPod