avemio/German-RAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI

@@ -132,7 +132,7 @@ Four evaluation metrics were employed across all subsets: language quality, over
 -   **Instruction following:** This metric assessed the model's ability to follow specific instructions provided for each task.
 -   **Overall score:** This metric combined the results from the previous three metrics, offering a comprehensive evaluation of the model's capabilities across all subsets.
-| Metric                                    | [Vanila-Phi-3.5-Mini-4B](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [GRAG-SFT](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI) | [GRAG-ORPO](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-ORPO-HESSIAN-AI) | [GRAG-MERGED]() | GPT-3.5-TURBO |
+| Metric                                    | [Vanila-Phi-3.5-Mini-4B](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [GRAG-SFT](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI) | [GRAG-ORPO](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-ORPO-HESSIAN-AI) | [GRAG-MERGED]() |
 |------------------------------------------|---------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|-----------------------------|
 | **Average_language_quality**             | 85.88                                                                          | 89.61                                                                                      |        89.1                                                                                        |                             |
 | **extraction_recall_weighted_overall_score**       | 35.2                                                                          | 52.3                                                                                      |       48.8                                                                                         |                             |