
Utilities for Tokenizers

This page lists the utility functions used by the tokenizers, mainly the class [~tokenization_utils_base.PreTrainedTokenizerBase], which implements the methods shared by [PreTrainedTokenizer] and [PreTrainedTokenizerFast], and the mixin [~tokenization_utils_base.SpecialTokensMixin].

Most of these are useful only if you are studying the tokenizer code in the library.

PreTrainedTokenizerBase

[[autodoc]] tokenization_utils_base.PreTrainedTokenizerBase
    - __call__
    - all

SpecialTokensMixin

[[autodoc]] tokenization_utils_base.SpecialTokensMixin

Enums and namedtuples

[[autodoc]] tokenization_utils_base.TruncationStrategy

[[autodoc]] tokenization_utils_base.CharSpan

[[autodoc]] tokenization_utils_base.TokenSpan
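
To give a feel for how these small types are typically used, here is a minimal sketch built only on the standard library. The three classes below are local stand-ins, not the library's own definitions: they assume that `CharSpan` and `TokenSpan` are named tuples with `start` and `end` fields describing half-open ranges, and that `TruncationStrategy` is a string-valued enum with the member values shown. The sample text and its whitespace tokenization are hypothetical.

```python
from enum import Enum
from typing import NamedTuple


class CharSpan(NamedTuple):
    """Local stand-in: half-open character range [start, end) in the original text."""
    start: int
    end: int


class TokenSpan(NamedTuple):
    """Local stand-in: half-open token range [start, end) in the encoded sequence."""
    start: int
    end: int


class TruncationStrategy(Enum):
    """Local stand-in for the truncation modes; the string values are assumed."""
    ONLY_FIRST = "only_first"
    ONLY_SECOND = "only_second"
    LONGEST_FIRST = "longest_first"
    DO_NOT_TRUNCATE = "do_not_truncate"


text = "Hello world"
tokens = ["Hello", "world"]  # hypothetical whitespace tokenization

# Spans covering the second word, in characters and in tokens.
word_chars = CharSpan(start=6, end=11)
word_tokens = TokenSpan(start=1, end=2)

# Because the ranges are half-open, they can be used directly as slice bounds.
char_slice = text[word_chars.start:word_chars.end]
token_slice = tokens[word_tokens.start:word_tokens.end]

# Named tuples unpack like ordinary tuples.
start, end = word_chars

# A string-valued enum can be looked up from its value.
strategy = TruncationStrategy("longest_first")
```

Using named tuples keeps spans lightweight and immutable while still allowing field access by name, and a string-valued enum lets a plain string argument be validated against a fixed set of modes.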