Sami92 committed
Commit 8b9d06f · verified · 1 Parent(s): af5e10c

Update README.md

Files changed (1): README.md (+1, -2)
README.md CHANGED
@@ -58,13 +58,12 @@ The second dataset is the human-annotated dataset that is used for training part
 
 #### Training Hyperparameters
 
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
 [More Information Needed]
 
 ## Evaluation
-The following figure below displays the performance and compares it to two benchmarks ((scores as csv)[https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv]). The first benchmark is the coder agreement of the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1 and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes. One notable exception is Foreign Trade and to a lesser extent Defence and Law and Crime. The second benchmark are the results of partypress/partypress-multilingual, referred to as Party Press F1 and the difference to the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par or stronger than the other Party Press Model. In total it achieves an F1 score that is .06 higher.
+The following figure below displays the performance and compares it to two benchmarks ([scores as csv](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv)). The first benchmark is the coder agreement of the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1 and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes. One notable exception is Foreign Trade and to a lesser extent Defence and Law and Crime. The second benchmark are the results of partypress/partypress-multilingual, referred to as Party Press F1 and the difference to the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par or stronger than the other Party Press Model. In total it achieves an F1 score that is .06 higher.
 `![](./scores.png)
 
 The figure below displays the confusion matrix of the individual classes on the test set.
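The changed paragraph defines two per-class metrics: Coder Difference (model F1 minus human coder agreement) and Party Press Difference (model F1 minus the partypress-multilingual F1). A minimal sketch of that computation, assuming a scores table with hypothetical column names and illustrative values (the actual headers in the linked scores.csv may differ):

```python
# Hedged sketch: compute "Coder Difference" and "Party Press Difference" as
# described in the evaluation paragraph. Column names and the numbers below
# are assumptions for illustration, NOT values from the real scores.csv.
import pandas as pd

scores = pd.DataFrame(
    {
        "class": ["Foreign Trade", "Defence", "Law and Crime"],
        "model_f1": [0.55, 0.70, 0.65],       # illustrative model scores
        "coder_f1": [0.80, 0.78, 0.75],       # human coder agreement (benchmark 1)
        "partypress_f1": [0.60, 0.68, 0.66],  # partypress-multilingual (benchmark 2)
    }
)

# Per-class difference between the model and each benchmark
scores["coder_difference"] = scores["model_f1"] - scores["coder_f1"]
scores["partypress_difference"] = scores["model_f1"] - scores["partypress_f1"]

print(scores[["class", "coder_difference", "partypress_difference"]])
```

A negative Coder Difference means the model falls short of human agreement for that class, matching the pattern the paragraph describes for Foreign Trade, Defence, and Law and Crime.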