Commit a12227a (verified), committed by TensorTemplar
1 Parent(s): f562cf1

remove size from name column

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -56,7 +56,7 @@ Local Costs based on lambdalabs instances at January 2025 prices
  
  Rank | Model | VRAM (GB) | GSM8K (%) | IFEval (%) | MUSR-Murder (%) | MUSR-Object (%) | MUSR-Team (%) | Avg Score | Relative to Root Judge (%) |
  | ---|--------------|------------|--------|---------|--------------|--------------|------------|------------|--------------------|
- **1** | **Root Judge (FP8)** | 70 | **94.6 ± 0.6** | **93.9** | 52.8 ± 3.2 | 24.6 ± 2.7 | **56.8 ± 3.1** | **64.5** | 100 |
+ **1** | **Root Judge** | 70 | **94.6 ± 0.6** | **93.9** | 52.8 ± 3.2 | 24.6 ± 2.7 | **56.8 ± 3.1** | **64.5** | 100 |
  2 | Llama-3.3-70B | 140 | 94.4 ± 0.6 | 93.4 | 54.0 ± 3.2 | 23.4 ± 2.7 | 56.0 ± 3.2 | 64.3 | 99.5 |
  3 | Patronus-70B | 140 | 91.7 ± 0.8 | 83.7 | 54.4 ± 3.2 | 24.6 ± 2.7 | 48.8 ± 3.2 | 60.6 | 93.9 |
  4 | Nemotron-70B | 70 | 80.1 ± 1.1 | 85.0 | 53.6 ± 3.2 | 23.8 ± 2.7 | 55.6 ± 3.1 | 59.6 | 92.4 |
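
The README does not say how the last two columns are derived. As a rough sanity check, the sketch below assumes that "Avg Score" is the unweighted mean of the five benchmark columns and that "Relative to Root Judge (%)" is each model's average divided by Root Judge's average. Both definitions are assumptions, and the function names are hypothetical. The scores are copied from the table rows above with the ± margins dropped.

```python
# Benchmark scores copied from the table rows in the diff (± margins dropped).
# Column order: GSM8K, IFEval, MUSR-Murder, MUSR-Object, MUSR-Team.
SCORES = {
    "Root Judge": [94.6, 93.9, 52.8, 24.6, 56.8],
    "Llama-3.3-70B": [94.4, 93.4, 54.0, 23.4, 56.0],
    "Patronus-70B": [91.7, 83.7, 54.4, 24.6, 48.8],
    "Nemotron-70B": [80.1, 85.0, 53.6, 23.8, 55.6],
}


def average_score(scores):
    """Unweighted mean of the benchmark columns (assumed definition)."""
    return sum(scores) / len(scores)


def relative_to_baseline(scores, baseline_scores):
    """Average score as a percentage of the baseline's average (assumed)."""
    return 100.0 * average_score(scores) / average_score(baseline_scores)


def summarize(table, baseline="Root Judge"):
    """Map each model name to (average score, percent of baseline average)."""
    base = table[baseline]
    return {
        name: (average_score(s), relative_to_baseline(s, base))
        for name, s in table.items()
    }


if __name__ == "__main__":
    for name, (avg, rel) in summarize(SCORES).items():
        print(f"{name}: avg={avg:.2f}, relative={rel:.2f}")
```

Under these assumptions the recomputed values land within about 0.1 of the published columns. The small gaps (for example, Llama-3.3-70B's listed 64.3) may come from the table being computed on unrounded per-benchmark scores, which the diff does not confirm.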