Commit 49945e6
1 Parent(s): 1c1b159
Update README.md
README.md CHANGED
@@ -57,22 +57,21 @@ This shows the updated table, based on Zephyr-7b-β original table for [MT-Bench
 
 | Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
 |-------------|-----|----|---------------|--------------|
-| StableLM-Tuned-α | 7B| dSFT |2.75| -|
 | MPT-Chat | 7B |dSFT |5.42| -|
 | Xwin-LMv0.1 | 7B| dPPO| 6.19| 87.83|
 | Mistral-Instructv0.1 | 7B| - | 6.84 |-|
-| Zephyr-7b-α |7B| dDPO| 6.88| -|
 | Zephyr-7b-β 🪁 | 7B | **dDPO** | **7.34** | 90.60 |
 | **Notus-7b-v1** | 7B | **dDPO** | 7.30 | **91.42** |
+| GPT-3.5-turbo | - |RLHF |7.94 |89.37|
+| Claude 2 | - |RLHF |8.06| 91.36|
+| Cohere Command 2 | - |RLHF |-| 90.62|
+| GPT-4 | -| RLHF |8.99| 95.28|
 | Falcon-Instruct | 40B |dSFT |5.17 |45.71|
 | Guanaco | 65B | SFT |6.41| 71.80|
 | Llama2-Chat | 70B |RLHF |6.86| 92.66|
 | Vicuna v1.3 | 33B |dSFT |7.12 |88.99|
 | WizardLM v1.0 | 70B |dSFT |7.71 |-|
 | Xwin-LM v0.1 | 70B |dPPO |- |95.57|
-| GPT-3.5-turbo | - |RLHF |7.94 |89.37|
-| Claude 2 | - |RLHF |8.06| 91.36|
-| GPT-4 | -| RLHF |8.99| 95.28|
 
 ## Academic benchmarks
 