Update README.md
README.md
CHANGED
@@ -186,7 +186,7 @@ The threshold is obtained by converting the original score of each step using si
 To better leverage the capability of **EurusPRM**, we add ``Step K`` (where K is the actual index of the step) in front of each step in **ProcessBench**.
 
 | Reward Model | GSM8k | MATH | OlympiadBench | Omni-Math | Avg |
-| --- | --- | --- | --- | --- | --- | --- |
+| --- | --- | --- | --- | --- | --- | --- |
 | Math-Shepherd-PRM-7B | 47.9 | 29.5 | 24.8 | 23.8 | 31.5 |
 | RLHFlow-PRM-Mistral-8B | 50.4 | 33.4 | 13.8 | 15.8 | 28.4 |
 | RLHFlow-PRM-Deepseek-8B | 38.8 | 33.8 | 16.9 | 16.9 | 26.6 |
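
Note on the README context in this hunk: the ``Step K`` prefixing it describes could look roughly like the sketch below. This is only an illustration, not the authors' preprocessing code. The function name `add_step_prefix`, the 1-based indexing, and the `": "` separator after the label are all assumptions, since the README only says that ``Step K`` is placed in front of each step.

```python
from typing import List


def add_step_prefix(steps: List[str]) -> List[str]:
    """Return a copy of `steps` with a 'Step K' label in front of each one.

    Assumptions (not confirmed by the README): K starts at 1, and the
    label is joined to the step text with ': '.
    """
    return [f"Step {k}: {step}" for k, step in enumerate(steps, start=1)]


if __name__ == "__main__":
    # Hypothetical two-step solution, used only to show the call.
    example = ["Compute 2 + 3 = 5.", "Multiply by 4 to get 20."]
    for line in add_step_prefix(example):
        print(line)
```

If the model expects a different label format (for example, no colon or 0-based indices), only the f-string and the `start` argument would need to change.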