pinned: false
---

# Metric Card for relation_extraction evaluation

This metric evaluates the quality of relation extraction output by calculating the micro and macro F1 scores of the extracted relations.

## Metric Description

This metric compares predicted relations against reference (ground-truth) relations and reports precision, recall, and F1 for each relation type, together with micro- and macro-averaged scores across all types. A minimal sketch of the two averaging schemes follows below.

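The sketch below only illustrates how micro and macro averaging differ; it reuses the per-type counts from the example output under "How to Use" and is not the module's actual implementation.

```
# Illustrative only: aggregate per-type tp/fp/fn counts in two ways.
# The "sell" counts mirror the example output shown under "How to Use".
counts = {"sell": {"tp": 1, "fp": 1, "fn": 1}}

def prf(tp, fp, fn):
    # Precision, recall, and F1 as percentages, guarding against division by zero.
    p = 100 * tp / (tp + fp) if tp + fp else 0.0
    r = 100 * tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Micro: pool the counts over all relation types, then score once.
tp = sum(c["tp"] for c in counts.values())
fp = sum(c["fp"] for c in counts.values())
fn = sum(c["fn"] for c in counts.values())
micro_p, micro_r, micro_f1 = prf(tp, fp, fn)

# Macro: score each relation type separately, then average the scores.
per_type = [prf(c["tp"], c["fp"], c["fn"]) for c in counts.values()]
macro_f1 = sum(f1 for _, _, f1 in per_type) / len(per_type)

print(micro_f1, macro_f1)  # 50.0 50.0 for this single-type example
```
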
## How to Use

This metric takes two inputs, predictions and references (ground truth). Both are lists of lists of dictionaries, where each dictionary describes one relation by its entity names and entity types:

```
import evaluate

# load metric
metric_path = "Ikala-allen/relation_extraction"
module = evaluate.load(metric_path)

# Define your references (ground truth)
references = [
    [
        {"head": "phip igments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]

# Define your predictions
predictions = [
    [
        {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]

# Calculate evaluation scores using the loaded metric
evaluation_scores = module.compute(predictions=predictions, references=references)

print(evaluation_scores)
>>> {'sell': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0}, 'ALL': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0, 'Macro_f1': 50.0, 'Macro_p': 50.0, 'Macro_r': 50.0}}
```
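
The returned dictionary has one entry per relation type (here `sell`) plus an `ALL` entry with the aggregated scores. Based on the output shown above, individual scores can be read out like this:

```
# Read scores from the result dictionary printed above.
print(evaluation_scores["sell"]["f1"])       # per-type F1: 50.0
print(evaluation_scores["ALL"]["Macro_f1"])  # macro-averaged F1: 50.0
```
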
### Inputs

- **predictions** (`list` of `list`s of `dictionary`s): predicted relations, where each dictionary gives the relation type together with the head and tail entities and their types (see the single-entry example below).
- **references** (`list` of `list`s of `dictionary`s): reference (ground-truth) relations for each example, in the same format.

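For reference, a single relation dictionary taken from the example above looks like this; every entry in the example uses these five keys:

```
# One relation entry (copied from the example under "How to Use").
{"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"}
```
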
### Output Values
*Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}*