<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Bootstrap Online Editor</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css">
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js"></script>
</head>
<body>

<div class="container">
  <h2 style="text-align: center;">NLPre-PL Dataset</h2>
  <p>The official NLPre-PL dataset is a version of the NKJP1M corpus, the 1-million-token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Języka Polskiego), divided uniformly at the paragraph level.
  </p>
  <p>
    The NLPre-PL dataset divides paragraphs fairly, by length and by topic, into train, development, and test sets. This keeps the distribution of segments per paragraph similar across the splits and avoids the situation in which paragraphs with a small (or large) number of segments appear in only one split, e.g. only at test time.
  </p>
    <p>
  <a style="text-align: center;" href="http://huggingface.co/datasets/ipipan/nlprepl" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">&#129303; NLPre-PL Dataset</a>

  <a style="text-align: center;" href="http://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">&#129303; PDB-UD Dataset</a>
  </p>
</div>
  
<div><p></p></div>

<div class="container">
  <h2 style="text-align: center;">NLPre-PL Trained models</h2>
    <p>All available models trained for the NLPre-PL benchmark are listed below.</p>
    
<div class="alert alert-primary" role="alert">
  COMBO
</div>

<p><b>UD TAGSET</b></p>
<ul class="list-group list-group-light list-group-small">
  <li class="list-group-item"><a href="https://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-seconday btn-lg active" target="_blank">alina</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_pdb" class="btn btn-seconday btn-lg active">COMBO + HerBERT + PDB-UD</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_pdb" class="btn btn-seconday btn-lg active">COMBO + fasttext + PDB-UD</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li>
</ul>

<p><b>NKJP TAGSET</b></p>
<ul class="list-group list-group-light list-group-small">
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li>
  <li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li>
</ul>

<div class="alert alert-primary" role="alert">
  spaCy
</div>

<div class="alert alert-primary" role="alert">
  Stanza
</div>

<div class="alert alert-primary" role="alert">
  Trankit
</div>


</div>

</body>
</html>