stefan-it committed
Commit db2171b · verified · 1 Parent(s): 62fceb1

readme: add benchmarks on NER datasets

Files changed (1): README.md (+39 -1)
README.md CHANGED
@@ -26,9 +26,47 @@ Version 1 of the Zeitungs-LM was pretrained on the following publicly available
 
 In total, the pretraining corpus has a size of 133GB.
 
+ ## Benchmarks (Named Entity Recognition)
+
+ We compare our Zeitungs-LM directly to the German Europeana BERT model (Zeitungs-LM is intended as its successor) on various downstream tasks from the [hmBench](https://github.com/stefan-it/hmBench) repository, which is focused on Named Entity Recognition.
+
+ Additionally, we use two datasets (ONB and LFT) from the ["A Named Entity Recognition Shootout for German"](https://aclanthology.org/P18-2020/) paper.
+
+ We report the micro F1-score averaged over 5 runs with different seeds, and we use the best hyper-parameter configuration on the development set of each dataset to report the final test score.
+
+ ### Development Set
+
+ The results on the development set are shown in the following table:
+
+ | Model \ Dataset     | [LFT][1] | [ONB][2] | [HisGermaNER][3] | [HIPE-2020][4] | [NewsEye][5] | [AjMC][6] | Avg.      |
+ |:--------------------|:---------|:---------|:-----------------|:---------------|:-------------|:----------|:----------|
+ | [Europeana BERT][7] | 79.22    | 88.20    | 81.41            | 80.92          | 41.65        | 87.91     | 76.55     |
+ | Zeitungs-LM v1      | 79.39    | 88.53    | 83.10            | 81.55          | 44.53        | 89.71     | **77.80** |
+
+ Our Zeitungs-LM improves over the German Europeana BERT model by 1.25 percentage points on average.
+
+ ### Test Set
+
+ The final results on the test set are shown here:
+
+ | Model \ Dataset     | [LFT][1] | [ONB][2] | [HisGermaNER][3] | [HIPE-2020][4] | [NewsEye][5] | [AjMC][6] | Avg.      |
+ |:--------------------|:---------|:---------|:-----------------|:---------------|:-------------|:----------|:----------|
+ | [Europeana BERT][7] | 80.43    | 84.39    | 83.21            | 77.49          | 42.96        | 90.52     | 76.50     |
+ | Zeitungs-LM v1      | 80.35    | 87.28    | 84.92            | 79.91          | 47.16        | 92.76     | **78.73** |
+
+ Our Zeitungs-LM outperforms the German Europeana BERT model by a large margin (2.23 percentage points on average).
+
+ [1]: https://aclanthology.org/P18-2020/
+ [2]: https://aclanthology.org/P18-2020/
+ [3]: https://huggingface.co/datasets/stefan-it/HisGermaNER
+ [4]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-hipe2020.md
+ [5]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-newseye.md
+ [6]: https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-ajmc.md
+ [7]: https://huggingface.co/dbmdz/bert-base-german-europeana-cased
+
 # Changelog
 
- * 02.10.2024: Initial version of the model. More details about pretraining or benchmarks on downstream tasks are coming very soon!
+ * 02.10.2024: Initial version of the model. More details are coming very soon!
 
 # Acknowledgements
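The averaged scores and margins quoted in the diff above can be sanity-checked with a short script. This is a minimal sketch, not part of the repository: the scores are copied from the two benchmark tables, and the `average` helper simply mirrors the "mean over six datasets, rounded to two decimals" convention the tables appear to use.

```python
# Sanity-check the averaged micro F1-scores and margins from the benchmark
# tables. Dataset order: LFT, ONB, HisGermaNER, HIPE-2020, NewsEye, AjMC.

dev = {
    "Europeana BERT": [79.22, 88.20, 81.41, 80.92, 41.65, 87.91],
    "Zeitungs-LM v1": [79.39, 88.53, 83.10, 81.55, 44.53, 89.71],
}
test = {
    "Europeana BERT": [80.43, 84.39, 83.21, 77.49, 42.96, 90.52],
    "Zeitungs-LM v1": [80.35, 87.28, 84.92, 79.91, 47.16, 92.76],
}

def average(scores):
    """Mean over the six datasets, rounded to two decimals as in the tables."""
    return round(sum(scores) / len(scores), 2)

for split_name, split in [("dev", dev), ("test", test)]:
    avgs = {model: average(scores) for model, scores in split.items()}
    margin = round(avgs["Zeitungs-LM v1"] - avgs["Europeana BERT"], 2)
    print(split_name, avgs, f"margin={margin}")
```

Running it reproduces the Avg. columns (76.55 vs. 77.80 on dev, 76.50 vs. 78.73 on test) and the stated margins of 1.25 and 2.23 percentage points.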