Update README.md
README.md CHANGED
@@ -58,13 +58,12 @@ The second dataset is the human-annotated dataset that is used for training part
 
 #### Training Hyperparameters
 
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 [More Information Needed]
 
 ## Evaluation
 
-The following figure below displays the performance and compares it to two benchmarks (
+The figure below displays the performance and compares it to two benchmarks ([scores as csv](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv)). The first benchmark is the coder agreement of the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1, and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes; one notable exception is Foreign Trade, and to a lesser extent Defence and Law and Crime. The second benchmark is the results of partypress/partypress-multilingual, referred to as Party Press F1, and the difference to the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par with or stronger than the other Party Press model. In total, it achieves an F1 score that is .06 higher.
 
 The figure below displays the confusion matrix of the individual classes on the test set.
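For readers who want to recompute the Coder Difference and Party Press Difference metrics described in the added paragraph, here is a minimal sketch. The column names and the sample numbers are assumptions for illustration only, not the actual layout or contents of the linked scores.csv:

```python
import csv
import io

# Hypothetical excerpt standing in for scores.csv -- the real file and
# its column names may differ; the numbers are made up for illustration.
sample_csv = """class,model_f1,coder_f1,partypress_f1
Defence,0.70,0.80,0.68
Foreign Trade,0.55,0.75,0.60
"""

# Compute the two per-class difference metrics defined above.
diffs = {}
for row in csv.DictReader(io.StringIO(sample_csv)):
    model_f1 = float(row["model_f1"])
    diffs[row["class"]] = {
        # Coder Difference: model F1 minus human coder agreement F1
        "coder_diff": round(model_f1 - float(row["coder_f1"]), 2),
        # Party Press Difference: model F1 minus partypress-multilingual F1
        "partypress_diff": round(model_f1 - float(row["partypress_f1"]), 2),
    }

print(diffs)
```

A negative Coder Difference means the model still falls short of human coder agreement for that class, which is the pattern the paragraph notes for Foreign Trade, Defence, and Law and Crime.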