---
title: README
emoji: π
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---
Online Demo | GitHub | Paper | Discord
OpenChat is dedicated to advancing and releasing open-source language models, fine-tuned with our C-RLFT technique, which is inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels and deliver performance on par with ChatGPT, even with a 7B model that can run on a consumer GPU (e.g. an RTX 3090).
## 📰 News
- [2024/03/15] Nexusflow released Starling-Beta, an RLHF tune of openchat-3.5-1106 and currently the highest-ranking open-source LLM on LMSys Arena not originating from a company, beating all others at only 7B.
- [2024/03/08] Released OpenChat-3.5-0106-Gemma, the highest-performing Gemma fine-tune.
- [2024/01/07] Released OpenChat-3.5-0106, trained with a new data pipeline - the strongest 7B LLM in the world.
  - Ranked as the top 7B LLM on LMSys Arena.
  - Ranked on LMSys Arena as the top open-source LLM not originating from a company.
- [2023/12/10] Released OpenChat-3.5-1210, with a 15-point improvement in coding.
- [2023/11/01] Released OpenChat-3.5-7B, surpassing ChatGPT on various benchmarks 🔥.
- [2023/09/21] Released our paper *OpenChat: Advancing Open-source Language Models with Mixed-Quality Data*.
## Benchmarks
Model | # Params | Average | MT-Bench | HumanEval | BBH MC | AGIEval | TruthfulQA | MMLU | GSM8K | BBH CoT |
---|---|---|---|---|---|---|---|---|---|---|
OpenChat-3.5-0106 | 7B | 64.5 | 7.8 | 71.3 | 51.5 | 49.1 | 61.0 | 65.8 | 77.4 | 62.2 |
OpenChat-3.5-0106-Gemma | 7B | 64.4 | 7.83 | 67.7 | 52.7 | 50.2 | 55.4 | 65.7 | 81.5 | 63.7 |
OpenChat-3.5-1210 | 7B | 63.8 | 7.76 | 68.9 | 49.5 | 48.0 | 61.8 | 65.3 | 77.3 | 61.8 |
OpenChat-3.5 | 7B | 61.6 | 7.81 | 55.5 | 47.6 | 47.4 | 59.1 | 64.3 | 77.3 | 63.5 |
ChatGPT (March)* | ? | 61.5 | 7.94 | 48.1 | 47.6 | 47.1 | 57.7 | 67.3 | 74.9 | 70.1 |
OpenHermes 2.5 | 7B | 59.3 | 7.54 | 48.2 | 49.4 | 46.5 | 57.5 | 63.8 | 73.5 | 59.9 |
OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
Zephyr-β^ | 7B | 34.6 | 7.34 | 22.0 | 40.6 | 39.0 | 40.8 | 39.8 | 5.1 | 16.0 |
Mistral | 7B | - | 6.84 | 30.5 | 39.0 | 38.0 | - | 60.1 | 52.2 | - |
## Comparison with X.AI Grok
Model | License | # Params | Average | MMLU | HumanEval | MATH | GSM8K |
---|---|---|---|---|---|---|---|
OpenChat-3.5-0106 | Apache-2.0 | 7B | 61.0 | 65.8 | 71.3 | 29.3 | 77.4 |
OpenChat-3.5-1210 | Apache-2.0 | 7B | 60.1 | 65.3 | 68.9 | 28.9 | 77.3 |
OpenChat-3.5 | Apache-2.0 | 7B | 56.4 | 64.3 | 55.5 | 28.6 | 77.3 |
Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
Grok-1 | Proprietary | ???B | 55.8 | 73.0 | 63.2 | 23.9 | 62.9 |
## Contact
Project Leads:
- Guan Wang [imonenext at gmail dot com]
- Alpay Ariyak [aariyak at wpi dot edu]
Sponsors:
- Sen Song (Tsinghua University)
- Yang Liu (Tsinghua University)
- 01.AI Company
- RunPod