Commit 63e4c6e (verified) by avemio-digital, 1 parent: c6bdb42

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -132,7 +132,7 @@ Four evaluation metrics were employed across all subsets: language quality, over
  - **Instruction following:** This metric assessed the model's ability to follow specific instructions provided for each task.
  - **Overall score:** This metric combined the results from the previous three metrics, offering a comprehensive evaluation of the model's capabilities across all subsets.
 
- | Metric | [Vanila-Phi-3.5-Mini-4B](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [GRAG-SFT](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI) | [GRAG-ORPO](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-ORPO-HESSIAN-AI) | [GRAG-MERGED]() | GPT-3.5-TURBO |
+ | Metric | [Vanila-Phi-3.5-Mini-4B](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [GRAG-SFT](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI) | [GRAG-ORPO](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-ORPO-HESSIAN-AI) | [GRAG-MERGED]() |
  |------------------------------------------|---------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|-----------------------------|
  | **Average_language_quality** | 85.88 | 89.61 | 89.1 | |
  | **extraction_recall_weighted_overall_score** | 35.2 | 52.3 | 48.8 | |
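
The one-line change drops the `GPT-3.5-TURBO` header cell, which leaves the header with five cells, the same count as the separator row and the data rows. A minimal sketch of how such a mismatch could be detected is shown here. The helper names `cell_count` and `mismatched_rows` are hypothetical, the table rows are abbreviated stand-ins with the link targets omitted, and the splitting logic assumes no escaped pipes inside cells.

```python
def cell_count(row: str) -> int:
    """Count the cells in one pipe-delimited Markdown table row."""
    stripped = row.strip()
    # Drop the optional leading and trailing pipes before splitting.
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|"):
        stripped = stripped[:-1]
    return len(stripped.split("|"))


def mismatched_rows(table_lines: list[str]) -> list[int]:
    """Return indices of rows whose cell count differs from the separator row (index 1)."""
    expected = cell_count(table_lines[1])
    return [i for i, line in enumerate(table_lines) if cell_count(line) != expected]


# Abbreviated stand-ins for the rows in the diff (link targets omitted).
old_table = [
    "| Metric | Vanila-Phi-3.5-Mini-4B | GRAG-SFT | GRAG-ORPO | GRAG-MERGED | GPT-3.5-TURBO |",
    "|---|---|---|---|---|",
    "| **Average_language_quality** | 85.88 | 89.61 | 89.1 | |",
]
new_table = [
    "| Metric | Vanila-Phi-3.5-Mini-4B | GRAG-SFT | GRAG-ORPO | GRAG-MERGED |"
] + old_table[1:]

print(mismatched_rows(old_table))  # [0]  (the old header has six cells)
print(mismatched_rows(new_table))  # []   (all rows have five cells)
```

A check like this only compares cell counts; it does not validate the table's contents or the empty `[GRAG-MERGED]()` link target.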