Merge-Effect
Collection
17 items
Tokenisers trained on the MiniPile. The _raw_tokenisers folder contains the original tokenisers, trained with a vocabulary size of 320k. Each of the other folders is a transformers-compatible tokeniser with a smaller vocabulary.
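As a rough illustration of the layout described above, the sketch below lists the reduced-size tokeniser folders in a local copy of one of these repositories and loads one of them. It assumes the repository has already been downloaded to disk and that each non-raw folder can be read by `AutoTokenizer.from_pretrained`; the helper names `list_tokeniser_folders` and `load_tokeniser` are hypothetical and not part of the collection.

```python
from pathlib import Path

# Folder name taken from the description; it holds the original 320k tokenisers.
RAW_FOLDER = "_raw_tokenisers"


def list_tokeniser_folders(root):
    """Return the sorted names of the reduced-size tokeniser folders under `root`.

    Skips the raw folder, hidden folders (such as .git), and plain files.
    """
    return sorted(
        p.name
        for p in Path(root).iterdir()
        if p.is_dir() and p.name != RAW_FOLDER and not p.name.startswith(".")
    )


def load_tokeniser(root, folder):
    """Load one reduced-size tokeniser from a local folder (hypothetical helper).

    transformers is imported lazily so that listing folders does not require it.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(str(Path(root) / folder))
```

Given a local copy, `list_tokeniser_folders(root)` returns the available folder names, and any one of them can be passed to `load_tokeniser(root, name)`. Whether the loaded vocabulary size matches the folder name depends on how the folders are named, which the description does not specify.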