<!DOCTYPE html>
<html>
<head>
<title>NLPre-PL</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js"></script>
</head>
<body>
<div class="container">
<h2 style="text-align: center;">NLPre-PL Dataset</h2>
<p>The official NLPre-PL dataset is a version of the NKJP1M corpus, the 1-million-token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Języka Polskiego), divided uniformly at the paragraph level.
</p>
<p>
The NLPre-PL dataset aims to divide the paragraphs fairly, by length and by topic, into train, development, and test sets. This keeps the distribution of segments per paragraph similar across the splits and avoids the situation where paragraphs with a small (or large) number of segments appear in only one split, e.g. only at test time.
</p>
<p>
<a style="text-align: center;" href="http://huggingface.co/datasets/ipipan/nlprepl" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">🤗 NLPre-PL Dataset</a>
<a style="text-align: center;" href="http://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">🤗 PDB-UD Dataset</a>
</p>
<div><p></p></div>
<div class="container">
<h2 style="text-align: center;">NLPre-PL Trained models</h2>
<p>Listed below are all available models trained for the purpose of creating the NLPre-PL benchmark.</p>
<div class="alert alert-primary" role="alert">
COMBO
</div>
<p><b>UD TAGSET</b></p>
<ul class="list-group list-group-light list-group-small">
<li class="list-group-item"><a href="https://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-seconday btn-lg active" target="_blank">alina</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_pdb" class="btn btn-seconday btn-lg active">COMBO + HerBERT + PDB-UD</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_pdb" class="btn btn-seconday btn-lg active">COMBO + fasttext + PDB-UD</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li>
</ul>
<p><b>NKJP TAGSET</b></p>
<ul class="list-group list-group-light list-group-small">
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li>
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li>
</ul>
<div class="alert alert-primary" role="alert">
spaCy
</div>
<div class="alert alert-primary" role="alert">
Stanza
</div>
<div class="alert alert-primary" role="alert">
Trankit
</div>
</div>
</div>
</body>
</html>