Update README.md
README.md
CHANGED
@@ -101,6 +101,9 @@ See the Falcon 180B model card for an example of this.
 
 ## Performance
 
+*Note, see the updated version of the paper for the latest, fixed evaluations that improve scores for models such as Qwen 2.5 Instruct.*
+
+
 | Benchmark (eval) | Tülu 3 SFT 8B | Tülu 3 DPO 8B | Tülu 3 8B | Llama 3.1 8B Instruct | Qwen 2.5 7B Instruct | Magpie 8B | Gemma 2 9B Instruct | Ministral 8B Instruct |
 |---------------------------------|----------------|----------------|------------|------------------------|----------------------|-----------|---------------------|-----------------------|
 | **Avg.** | 60.4 | 64.4 | **64.8** | 62.2 | 57.8 | 44.7 | 55.2 | 58.3 |