---
library_name: transformers
license: apache-2.0
---

See [Handling multiple tokenizers with a single processor in transformers](https://zenn.dev/platina/articles/732feb7c3e9852) (article in Japanese).

## Example usage

```py
from transformers import AutoProcessor

# trust_remote_code=True executes the processor code shipped in the repo,
# so pin the exact commit with `revision`.
processor = AutoProcessor.from_pretrained(
    "p1atdev/multi-tokenizers-processor-sample",
    trust_remote_code=True,
    revision="111e8a30609fb5bc13e16d08f7c49196b23d5056",
)

# Each text argument is encoded by its own tokenizer.
print(processor(
    text_1="テキスト1",
    text_2="テキスト2",
))
# {'input_ids': tensor([[ 1, 43412, 28745]]), 'attention_mask': tensor([[1, 1, 1]]), 'input_ids_2': tensor([[56833, 61803, 70534, 17]]), 'attention_mask_2': tensor([[1, 1, 1, 1]])}
```
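
To illustrate the output layout above, where the second tokenizer's results appear under `_2`-suffixed keys, here is a minimal pure-Python sketch. It is not the code shipped in this repository: `ToyTokenizer` and `ToyMultiTokenizerProcessor` are hypothetical names, the vocabularies are made up, and the suffixing scheme is only inferred from the printed output.

```py
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer: maps text pieces to integer IDs."""

    split: Callable[[str], list[str]]
    vocab: dict[str, int]
    unk_id: int = 0

    def __call__(self, text: str) -> dict[str, list[int]]:
        ids = [self.vocab.get(piece, self.unk_id) for piece in self.split(text)]
        return {"input_ids": ids, "attention_mask": [1] * len(ids)}


class ToyMultiTokenizerProcessor:
    """Hypothetical processor: routes each text to its own tokenizer."""

    def __init__(self, tokenizer_1: ToyTokenizer, tokenizer_2: ToyTokenizer):
        self.tokenizer_1 = tokenizer_1
        self.tokenizer_2 = tokenizer_2

    def __call__(self, text_1: str, text_2: str) -> dict[str, list[int]]:
        first = self.tokenizer_1(text_1)
        second = self.tokenizer_2(text_2)
        # Suffix the second tokenizer's keys so both outputs fit in one flat dict.
        merged = dict(first)
        merged.update({f"{key}_2": value for key, value in second.items()})
        return merged


# Two toy tokenizers with different splitting rules and vocabularies.
word_tokenizer = ToyTokenizer(split=str.split, vocab={"hello": 1, "world": 2})
char_tokenizer = ToyTokenizer(split=list, vocab={"a": 1, "b": 2, "c": 3})
toy_processor = ToyMultiTokenizerProcessor(word_tokenizer, char_tokenizer)

result = toy_processor(text_1="hello world", text_2="abc")
print(result)
# {'input_ids': [1, 2], 'attention_mask': [1, 1], 'input_ids_2': [1, 2, 3], 'attention_mask_2': [1, 1, 1]}
```

Keeping everything in one flat dictionary means the result can be unpacked directly into a function that accepts those keyword names. A real processor would additionally handle padding, truncation, and tensor conversion.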