|
<!DOCTYPE html> |
|
<html> |
|
<head> |
|
<title>NLPre-PL Dataset</title> |
|
<meta name="viewport" content="width=device-width, initial-scale=1"> |
|
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css"> |
|
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script> |
|
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js"></script> |
|
</head> |
|
<body> |
|
|
|
<div class="container"> |
|
<h2 style="text-align: center;">NLPre-PL Dataset</h2> |
|
<p>The official NLPre-PL dataset is a version of the NKJP1M corpus, the 1-million-token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Języka Polskiego), uniformly divided at the paragraph level. |
|
</p> |
|
<p> |
|
The NLPre-PL dataset aims to divide paragraphs fairly, both length-wise and topic-wise, into train, development, and test sets. This keeps the distribution of the number of segments per paragraph similar across the splits and avoids the situation where paragraphs with a small (or large) number of segments appear in only one split, e.g. only at test time. |
|
</p> |
|
<p> |
|
<a style="text-align: center;" href="http://huggingface.co/datasets/ipipan/nlprepl" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">🤗 NLPre-PL Dataset</a> |
|
|
|
<a style="text-align: center;" href="http://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">🤗 PDB-UD Dataset</a> |
|
</p> |
|
|
|
<div><p></p></div> |
|
|
|
<div class="container"> |
|
<h2 style="text-align: center;">NLPre-PL Trained models</h2> |
|
<p>All available models trained for the purpose of creating the NLPre-PL benchmark are listed below.</p> |
|
|
|
<div class="alert alert-primary" role="alert"> |
|
COMBO |
|
</div> |
|
|
|
<p><b>UD TAGSET</b></p> |
|
<ul class="list-group list-group-light list-group-small"> |
|
<li class="list-group-item"><a href="https://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-seconday btn-lg active" target="_blank">alina</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_pdb" class="btn btn-seconday btn-lg active">COMBO + HerBERT + PDB-UD</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_pdb" class="btn btn-seconday btn-lg active">COMBO + fasttext + PDB-UD</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li> |
|
</ul> |
|
|
|
<p><b>NKJP TAGSET</b></p> |
|
<ul class="list-group list-group-light list-group-small"> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-name" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a></li> |
|
<li class="list-group-item"><a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-type" class="btn btn-seconday btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a></li> |
|
</ul> |
|
|
|
<div class="alert alert-primary" role="alert"> |
|
spaCy |
|
</div> |
|
|
|
<div class="alert alert-primary" role="alert"> |
|
Stanza |
|
</div> |
|
|
|
<div class="alert alert-primary" role="alert"> |
|
Trankit |
|
</div> |
|
|
|
|
|
</div> |
|
</div> |
|
|
|
</body> |
|
</html> |