Update Readme
README.md
```diff
@@ -11,9 +11,9 @@ tags:
 ---
 # Model Card for RootSignals-Judge-Llama-70B
 
-Root Judge is a powerful mid-sized model that enables reliable and customizable LLM system evaluations.
-Root Judge was post-trained from Llama-3.3-70B-Instruct on a high quality, human-annotated dataset mix for pairwise preference choice judgments and multi-turn instruction following with source citing.
-The model weights are freely
+**Root Judge** is a powerful mid-sized model that enables reliable and customizable LLM system evaluations.
+Root Judge was post-trained from *Llama-3.3-70B-Instruct* on a high-quality, human-annotated dataset mix for pairwise preference judgments and multi-turn instruction following with source citing.
+The model weights are freely available in FP8 to facilitate cost-effective research as well as commercial use.
 
 Root Judge’s performance surpasses the Llama-3.3-Instruct model and similar-sized open models on instruction following, and
 achieves SOTA on hallucination detection compared to leading closed models, at a fraction of the cost.
@@ -107,9 +107,9 @@ while also slightly outperforming it on public instruction following benchmarks
 - **Language(s) (NLP):** Primarily English
 - **Finetuned from model:** meta-llama/Llama-3.3-70B-Instruct
 
-##
+## Getting Started
 
-We recommend using SGLang for production use together with xml tags for important sections in your prompt. At least 96GB of VRAM is recommended.
+We recommend using [SGLang](https://github.com/sgl-project/sglang) for production use together with *XML tags* for important sections in your prompt. At least 96 GB of VRAM is recommended.
 While the model runs on 80 GB of VRAM, the effective context size (around 7k total tokens) will be too low for evaluating most RAG inputs.
 
 SGLang example for a single Nvidia H100 (80GB):
```
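The SGLang launch example referenced above is cut off in this excerpt. A minimal sketch of what such a command could look like (the model repo id and the `--context-length` value are assumptions, not from the original; tune them to your hardware):

```shell
# Hypothetical SGLang launch for a single 80GB H100.
# The model path and context length below are assumed values:
# on 80GB the card warns the usable context is only ~7k tokens.
python -m sglang.launch_server \
  --model-path root-signals/RootSignals-Judge-Llama-70B \
  --host 0.0.0.0 \
  --port 30000 \
  --context-length 7000
```

This starts an OpenAI-compatible server on port 30000; with 96 GB or more of VRAM, `--context-length` can be raised to accommodate typical RAG evaluation inputs.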