A byte-pair encoding (BPE) tokenizer based on tiktoken's cl100k_base encoding.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/arcade100k", trust_remote_code=True)
tokenizer("hello, world!", return_tensors='pt')
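For readers unfamiliar with BPE, the following toy sketch may help show what the merge step does. It is not the arcade100k or tiktoken implementation: `bpe_encode` and `TOY_MERGES` are hypothetical names, the merge table is made up for illustration, and it works on characters, whereas tiktoken-style tokenizers work on bytes with a learned vocabulary.

```python
def bpe_encode(word, merges):
    """Toy BPE: repeatedly merge the adjacent pair with the lowest rank."""
    tokens = list(word)  # start from single characters (illustrative only)
    ranks = {pair: rank for rank, pair in enumerate(merges)}

    while len(tokens) > 1:
        # Collect all adjacent pairs and pick the one learned earliest.
        pairs = [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more

        # Merge every occurrence of the chosen pair, left to right.
        merged = []
        i = 0
        while i < len(tokens):
            if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged

    return tokens


# Hypothetical merge table, ordered by priority (earlier = merged first).
TOY_MERGES = [("l", "l"), ("h", "e"), ("he", "ll"), ("hell", "o")]

print(bpe_encode("hello", TOY_MERGES))
print(bpe_encode("help", TOY_MERGES))
```

With this table, "hello" collapses into a single token, while "help" stops at the pieces for which no merge is defined. A real vocabulary such as cl100k_base contains on the order of a hundred thousand entries rather than four.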