<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
โš ๏ธ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Translation[[translation]]
[[open-in-colab]]
<Youtube id="1JvfrvZgi6c"/>
๋ฒˆ์—ญ์€ ํ•œ ์–ธ์–ด๋กœ ๋œ ์‹œํ€€์Šค๋ฅผ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค. ๋ฒˆ์—ญ์ด๋‚˜ ์š”์•ฝ์€ ์ž…๋ ฅ์„ ๋ฐ›์•„ ์ผ๋ จ์˜ ์ถœ๋ ฅ์„ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ฐ•๋ ฅํ•œ ํ”„๋ ˆ์ž„์›Œํฌ์ธ ์‹œํ€€์Šค-ํˆฌ-์‹œํ€€์Šค ๋ฌธ์ œ๋กœ ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ๋Š” ๋Œ€ํ‘œ์ ์ธ ํƒœ์Šคํฌ์ž…๋‹ˆ๋‹ค. ๋ฒˆ์—ญ ์‹œ์Šคํ…œ์€ ์ผ๋ฐ˜์ ์œผ๋กœ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋œ ํ…์ŠคํŠธ ๊ฐ„์˜ ๋ฒˆ์—ญ์— ์‚ฌ์šฉ๋˜์ง€๋งŒ, ์Œ์„ฑ ๊ฐ„์˜ ํ†ต์—ญ์ด๋‚˜ ํ…์ŠคํŠธ-์Œ์„ฑ ๋˜๋Š” ์Œ์„ฑ-ํ…์ŠคํŠธ์™€ ๊ฐ™์€ ์กฐํ•ฉ์—๋„ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์ด ๊ฐ€์ด๋“œ์—์„œ ํ•™์Šตํ•  ๋‚ด์šฉ์€:
1. ์˜์–ด ํ…์ŠคํŠธ๋ฅผ ํ”„๋ž‘์Šค์–ด๋กœ ๋ฒˆ์—ญํ•˜๊ธฐ ์œ„ํ•ด [T5](https://huggingface.co/t5-small) ๋ชจ๋ธ์„ OPUS Books ๋ฐ์ดํ„ฐ์„ธํŠธ์˜ ์˜์–ด-ํ”„๋ž‘์Šค์–ด ํ•˜์œ„ ์ง‘ํ•ฉ์œผ๋กœ ํŒŒ์ธํŠœ๋‹ํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ
2. ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ์„ ์ถ”๋ก ์— ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค.
<Tip>
์ด ํƒœ์Šคํฌ ๊ฐ€์ด๋“œ๋Š” ์•„๋ž˜ ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜์—๋„ ์‘์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
<!--This tip is automatically generated by `make fix-copies`, do not fill manually!-->
[BART](../model_doc/bart), [BigBird-Pegasus](../model_doc/bigbird_pegasus), [Blenderbot](../model_doc/blenderbot), [BlenderbotSmall](../model_doc/blenderbot-small), [Encoder decoder](../model_doc/encoder-decoder), [FairSeq Machine-Translation](../model_doc/fsmt), [GPTSAN-japanese](../model_doc/gptsan-japanese), [LED](../model_doc/led), [LongT5](../model_doc/longt5), [M2M100](../model_doc/m2m_100), [Marian](../model_doc/marian), [mBART](../model_doc/mbart), [MT5](../model_doc/mt5), [MVP](../model_doc/mvp), [NLLB](../model_doc/nllb), [NLLB-MOE](../model_doc/nllb-moe), [Pegasus](../model_doc/pegasus), [PEGASUS-X](../model_doc/pegasus_x), [PLBart](../model_doc/plbart), [ProphetNet](../model_doc/prophetnet), [SwitchTransformers](../model_doc/switch_transformers), [T5](../model_doc/t5), [XLM-ProphetNet](../model_doc/xlm-prophetnet)
<!--End of the generated tip-->
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate sacrebleu
```
๋ชจ๋ธ์„ ์—…๋กœ๋“œํ•˜๊ณ  ์ปค๋ฎค๋‹ˆํ‹ฐ์™€ ๊ณต์œ ํ•  ์ˆ˜ ์žˆ๋„๋ก Hugging Face ๊ณ„์ •์— ๋กœ๊ทธ์ธํ•˜๋Š” ๊ฒƒ์ด ์ข‹์Šต๋‹ˆ๋‹ค. ์ƒˆ๋กœ์šด ์ฐฝ์ด ํ‘œ์‹œ๋˜๋ฉด ํ† ํฐ์„ ์ž…๋ ฅํ•˜์—ฌ ๋กœ๊ทธ์ธํ•˜์„ธ์š”.
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
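If you are working in a terminal rather than a notebook, you can log in with the Hugging Face CLI instead; this is an alternative to the cell above, not an additional step:
```bash
huggingface-cli login
```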
## Load OPUS Books dataset[[load-opus-books-dataset]]
Start by loading the English-French subset of the [OPUS Books](https://huggingface.co/datasets/opus_books) dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> books = load_dataset("opus_books", "en-fr")
```
๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ [`~datasets.Dataset.train_test_split`] ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ›ˆ๋ จ ๋ฐ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ๋กœ ๋ถ„ํ• ํ•˜์„ธ์š”.
```py
>>> books = books["train"].train_test_split(test_size=0.2)
```
Then take a look at an example from the training data:
```py
>>> books["train"][0]
{'id': '90560',
 'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered our element.',
  'fr': 'Mais ce plateau élevé ne mesurait que quelques toises, et bientôt nous fûmes rentrés dans notre élément.'}}
```
๋ฐ˜ํ™˜๋œ ๋”•์…”๋„ˆ๋ฆฌ์˜ `translation` ํ‚ค๊ฐ€ ํ…์ŠคํŠธ์˜ ์˜์–ด, ํ”„๋ž‘์Šค์–ด ๋ฒ„์ „์„ ํฌํ•จํ•˜๊ณ  ์žˆ๋Š” ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
## Preprocess[[preprocess]]
<Youtube id="XAR8jnZZuUs"/>
๋‹ค์Œ ๋‹จ๊ณ„๋กœ ์˜์–ด-ํ”„๋ž‘์Šค์–ด ์Œ์„ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด T5 ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๊ฐ€์ ธ์˜ค์„ธ์š”.
```py
>>> from transformers import AutoTokenizer
>>> checkpoint = "t5-small"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
```
The preprocessing function you create needs to:
1. Prefix the input with a prompt so T5 knows this is a translation task. Some models capable of multiple NLP tasks require prompting for specific tasks like this.
2. Tokenize the source language (English) and the target language (French) separately, because you can't tokenize French text with a tokenizer pretrained on an English vocabulary.
3. Truncate sequences so they are no longer than the maximum length set by the `max_length` parameter.
```py
>>> source_lang = "en"
>>> target_lang = "fr"
>>> prefix = "translate English to French: "

>>> def preprocess_function(examples):
...     inputs = [prefix + example[source_lang] for example in examples["translation"]]
...     targets = [example[target_lang] for example in examples["translation"]]
...     # text_target tokenizes the targets so they end up in the `labels` field
...     model_inputs = tokenizer(inputs, text_target=targets, max_length=128, truncation=True)
...     return model_inputs
```
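Before mapping the function over the whole dataset, you can sanity-check it on a small slice. The `input_ids` and `attention_mask` come from the prefixed English inputs, while `labels` holds the tokenized French targets:
```py
>>> batch = preprocess_function(books["train"][:2])
>>> list(batch.keys())
['input_ids', 'attention_mask', 'labels']
```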
To apply the preprocessing function over the entire dataset, use 🤗 Datasets' [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_books = books.map(preprocess_function, batched=True)
```
Now create a batch of examples using [`DataCollatorForSeq2Seq`]. Instead of padding the whole dataset to the maximum length, it's more efficient to *dynamically pad* the sentences to the longest length in each batch during collation:
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint, return_tensors="tf")
```
</tf>
</frameworkcontent>
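To see the dynamic padding at work, you can run the collator on a couple of tokenized examples yourself. This is only a quick check rather than a training step, and the `keep` list below is just there to drop the raw text columns, which the collator doesn't expect:
```py
>>> keep = ["input_ids", "attention_mask", "labels"]
>>> features = [{k: tokenized_books["train"][i][k] for k in keep} for i in range(2)]
>>> batch = data_collator(features)
>>> batch["input_ids"].shape  # both examples are padded to the longer of the two
```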
## Evaluate[[evaluate]]
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the SacreBLEU metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> metric = evaluate.load("sacrebleu")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the SacreBLEU score:
```py
>>> import numpy as np


>>> def postprocess_text(preds, labels):
...     preds = [pred.strip() for pred in preds]
...     labels = [[label.strip()] for label in labels]
...     return preds, labels


>>> def compute_metrics(eval_preds):
...     preds, labels = eval_preds
...     if isinstance(preds, tuple):
...         preds = preds[0]
...     decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
...     # replace -100 (positions ignored by the loss) with the pad token id so they can be decoded
...     labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
...     decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
...     decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)
...     result = metric.compute(predictions=decoded_preds, references=decoded_labels)
...     result = {"bleu": result["score"]}
...     prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
...     result["gen_len"] = np.mean(prediction_lens)
...     result = {k: round(v, 4) for k, v in result.items()}
...     return result
```
Your `compute_metrics` function is ready to go now; you'll return to it when you set up your training.
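To get a feel for what the metric returns, you can call it directly on a toy prediction (the sentence here is made up purely for illustration); an exact match scores 100:
```py
>>> metric.compute(predictions=["the cat sat on the mat"], references=[["the cat sat on the mat"]])["score"]
100.0
```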
## Train[[train]]
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ฌ ์ค€๋น„๊ฐ€ ๋˜์—ˆ๊ตฐ์š”! [`AutoModelForSeq2SeqLM`]์œผ๋กœ T5๋ฅผ ๋กœ๋“œํ•˜์„ธ์š”:
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. Set `push_to_hub=True` to push the model to the Hub (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the SacreBLEU metric and save the training checkpoint.
2. Pass the training arguments to [`Seq2SeqTrainer`], along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = Seq2SeqTrainingArguments(
...     output_dir="my_awesome_opus_books_model",
...     evaluation_strategy="epoch",
...     learning_rate=2e-5,
...     per_device_train_batch_size=16,
...     per_device_eval_batch_size=16,
...     weight_decay=0.01,
...     save_total_limit=3,
...     num_train_epochs=2,
...     predict_with_generate=True,
...     fp16=True,
...     push_to_hub=True,
... )

>>> trainer = Seq2SeqTrainer(
...     model=model,
...     args=training_args,
...     train_dataset=tokenized_books["train"],
...     eval_dataset=tokenized_books["test"],
...     tokenizer=tokenizer,
...     data_collator=data_collator,
...     compute_metrics=compute_metrics,
... )

>>> trainer.train()
```
ํ•™์Šต์ด ์™„๋ฃŒ๋˜๋ฉด [`~transformers.Trainer.push_to_hub`] ๋ฉ”์„œ๋“œ๋กœ ๋ชจ๋ธ์„ Hub์— ๊ณต์œ ํ•˜์„ธ์š”. ์ด๋Ÿฌ๋ฉด ๋ˆ„๊ตฌ๋‚˜ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, a learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
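If you'd rather have a learning rate that decays over training than a fixed one, 🤗 Transformers also provides a `create_optimizer` helper that pairs AdamW with a linear schedule. This is an optional alternative to the optimizer above; the step count assumes the batch size of 16 and the 3 epochs used later in this guide:
```py
>>> from transformers import create_optimizer

>>> # total optimization steps = batches per epoch * number of epochs
>>> num_train_steps = (len(tokenized_books["train"]) // 16) * 3
>>> optimizer, schedule = create_optimizer(
...     init_lr=2e-5, num_warmup_steps=0, num_train_steps=num_train_steps, weight_decay_rate=0.01
... )
```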
Then load T5 with [`TFAutoModelForSeq2SeqLM`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
...     tokenized_books["train"],
...     shuffle=True,
...     batch_size=16,
...     collate_fn=data_collator,
... )

>>> tf_test_set = model.prepare_tf_dataset(
...     tokenized_books["test"],
...     shuffle=False,
...     batch_size=16,
...     collate_fn=data_collator,
... )
```
Configure the model for training with the [`compile`](https://keras.io/api/models/model_training_apis/#compile-method) method. Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)
```
ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•˜๊ธฐ ์ „์— ์˜ˆ์ธก๊ฐ’์œผ๋กœ๋ถ€ํ„ฐ SacreBLEU ๋ฉ”ํŠธ๋ฆญ์„ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ ๋ชจ๋ธ์„ Hub์— ์—…๋กœ๋“œํ•˜๋Š” ๋ฐฉ๋ฒ• ๋‘ ๊ฐ€์ง€๋ฅผ ๋ฏธ๋ฆฌ ์„ค์ •ํ•ด๋‘ฌ์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋‘˜ ๋‹ค [Keras callbacks](../main_classes/keras_callbacks)๋กœ ๊ตฌํ˜„ํ•˜์„ธ์š”.
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_test_set)
```
๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์—…๋กœ๋“œํ•  ์œ„์น˜๋ฅผ [`~transformers.PushToHubCallback`]์—์„œ ์ง€์ •ํ•˜์„ธ์š”:
```py
>>> from transformers.keras_callbacks import PushToHubCallback

>>> push_to_hub_callback = PushToHubCallback(
...     output_dir="my_awesome_opus_books_model",
...     tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```
ํ•™์Šต์ด ์™„๋ฃŒ๋˜๋ฉด ๋ชจ๋ธ์ด ์ž๋™์œผ๋กœ Hub์— ์—…๋กœ๋“œ๋˜๊ณ , ๋ˆ„๊ตฌ๋‚˜ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค!
</tf>
</frameworkcontent>
<Tip>
๋ฒˆ์—ญ์„ ์œ„ํ•ด ๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ๋ณด๋‹ค ์ž์„ธํ•œ ์˜ˆ์ œ๋Š” ํ•ด๋‹น [PyTorch ๋…ธํŠธ๋ถ](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb) ๋˜๋Š” [TensorFlow ๋…ธํŠธ๋ถ](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb)์„ ์ฐธ์กฐํ•˜์„ธ์š”.
</Tip>
## Inference[[inference]]
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text you'd like to translate into another language. For T5, you need to prefix the input with the task you want to perform. For translation from English to French, the input gets the prefix shown below:
```py
>>> text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
```
ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ๋กœ ์ถ”๋ก ํ•˜๊ธฐ์— ์ œ์ผ ๊ฐ„๋‹จํ•œ ๋ฐฉ๋ฒ•์€ [`pipeline`]์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ํ•ด๋‹น ๋ชจ๋ธ๋กœ ๋ฒˆ์—ญ `pipeline`์„ ๋งŒ๋“  ๋’ค, ํ…์ŠคํŠธ๋ฅผ ์ „๋‹ฌํ•˜์„ธ์š”:
```py
>>> from transformers import pipeline
>>> translator = pipeline("translation", model="my_awesome_opus_books_model")
>>> translator(text)
[{'translation_text': 'Legumes partagent des ressources avec des bactéries azotantes.'}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```
Use the [`~transformers.generation_utils.GenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](../main_classes/text_generation) API.
```py
>>> from transformers import AutoModelForSeq2SeqLM
>>> model = AutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
์ƒ์„ฑ๋œ ํ† ํฐ ID๋“ค์„ ๋‹ค์‹œ ํ…์ŠคํŠธ๋กœ ๋””์ฝ”๋”ฉํ•˜์„ธ์š”:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lignées partagent des ressources avec des bactéries enfixant l'azote.'
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```
Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](../main_classes/text_generation) API.
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
์ƒ์„ฑ๋œ ํ† ํฐ ID๋“ค์„ ๋‹ค์‹œ ํ…์ŠคํŠธ๋กœ ๋””์ฝ”๋”ฉํ•˜์„ธ์š”:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lugumes partagent les ressources avec des bactéries fixatrices d'azote.'
```
</tf>
</frameworkcontent>