Update README.md
README.md
CHANGED
@@ -142,6 +142,19 @@ Four evaluation metrics were employed across all subsets: language quality, over
 | **reasoning_weighted_overall_score** | 69.4 | 71.5 | 73.4 | |
 | **relevant_context_weighted_overall_score** | 71.3 | 69.1 | 65.5 | |
 | **summarizations_weighted_overall_score** | 73.8 | 81.6 | 80.3 | |
+
+
+| Metric | [Vanila-Phi-3.5-Mini-4B](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [GRAG-SFT](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-SFT-HESSIAN-AI) | [GRAG-ORPO](https://huggingface.co/avemio/GRAG-PHI-3.5-MINI-4B-ORPO-HESSIAN-AI) | [GRAG-MERGED]() | GPT-3.5-TURBO |
+|------------------------------------------|---------------------------------------------------------------------------------|--------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|-----------------------------|----------------|
+| **Average_language_quality** | 85.88 | 89.61 | 89.1 | | |
+| **extraction_recall_weighted_overall_score** | 35.2 | 52.3 | 48.8 | | |
+| **qa_multiple_references_weighted_overall_score** | 65.3 | 71.0 | 74.0 | | |
+| **qa_without_time_difference_weighted_overall_score** | 71.5 | 85.6 | 85.6 | | |
+| **qa_with_time_difference_weighted_overall_score** | 65.3 | 87.9 | 85.4 | | |
+| **reasoning_weighted_overall_score** | 69.4 | 71.5 | 73.4 | | |
+| **relevant_context_weighted_overall_score** | 71.3 | 69.1 | 65.5 | | |
+| **summarizations_weighted_overall_score** | 73.8 | 81.6 | 80.3 | | |
+
 ## Model Details
 
 ### Data