Transformers Agent [[transformers-agent]]

Transformers Agent is an experimental API that may change at any time. Because the API and the underlying models are prone to change, the results returned by agents may vary as well.

Transformers version 4.29.0 introduced the concepts of tools and agents. You can try them out in this colab.

In short, Agent provides a natural language API on top of transformers. We defined a curated set of tools and designed an agent that interprets natural language and uses these tools. The API is extensible by design: we curated the most relevant tools, but we will also show how the system can easily be extended to use any tool developed by the community.

Let's look at a few examples of what the new API can do. It is particularly powerful for multimodal tasks, so let's generate images and read text out loud.

agent.run("Caption the following image", image=image)
Input: (image, not shown)
Output: A beaver is swimming in the water

agent.run("Read the following text out loud", text=text)
Input: A beaver is swimming in the water
Output: (audio, not shown)

agent.run(
    "In the following `document`, where will the TRRF Scientific Advisory Council Meeting take place?",
    document=document,
)
Input: (document image, not shown)
Output: ballroom foyer

Quickstart [[quickstart]]

Before you can use agent.run, you need to instantiate an agent, which is a large language model (LLM). We support OpenAI models as well as open-source alternatives from BigCode and OpenAssistant. The OpenAI models perform better (but require an OpenAI API key, so they cannot be used for free), while Hugging Face provides free access to endpoints for the BigCode and OpenAssistant models.

To begin, install the agents extra so that all default dependencies are installed.

pip install transformers[agents]

To use OpenAI models, install the openai dependency and then instantiate an [OpenAiAgent]:

pip install openai
from transformers import OpenAiAgent

agent = OpenAiAgent(model="text-davinci-003", api_key="<your_api_key>")

To use BigCode or OpenAssistant, first log in to get access to the Inference API:

from huggingface_hub import login

login("<YOUR_TOKEN>")

Then instantiate the agent.

from transformers import HfAgent

# Starcoder
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
# StarcoderBase
# agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoderbase")
# OpenAssistant
# agent = HfAgent(url_endpoint="https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")

This uses the Inference API that Hugging Face currently provides for free. If you have your own inference endpoint for this model (or for another one), you can replace the URL above with your endpoint's URL.

StarCoder and OpenAssistant are free to use and work remarkably well on simple tasks. However, these checkpoints do not hold up when handling more complex prompts. If you run into this problem, we recommend trying the OpenAI models; they are unfortunately not open source, but they currently perform better.

You are now ready to go! Let's take a closer look at the two APIs now at your disposal.

Single execution (run) [[single-execution-(run)]]

The single execution method uses the agent's [~Agent.run] method:

agent.run("Draw me a picture of rivers and lakes.")

It automatically selects the tool (or tools) suited to the task and runs them appropriately. A single instruction can perform one or several tasks (though the more complex the instruction, the more likely the agent is to fail).

agent.run("Draw me a picture of the sea then transform the picture to add an island")

Every [~Agent.run] operation is independent, so you can run it several times in a row with different tasks.

Keep in mind that your agent is just a large language model, so small variations in the prompt can produce completely different results. It is important to describe the task you want to perform as clearly as possible. How to write good prompts is covered in more detail here.

If you want to keep state across executions or pass non-text objects to the agent, you can specify variables for the agent to use. For example, you can generate a first image of rivers and lakes and then ask the model to add an island to that picture, as follows:

picture = agent.run("Generate a picture of rivers and lakes.")
updated_picture = agent.run("Transform the image in `picture` to add an island to it.", picture=picture)

This can be useful when the model fails to understand your request and mixes up tools. For example:

agent.run("Draw me the picture of a capybara swimming in the sea")

Here the model could interpret the request in two ways:

  • Have text-to-image generate a capybara swimming in the sea.
  • Or have text-to-image generate a capybara, then use the image-transformation tool to make it swim in the sea.

To force the first scenario, you can pass the prompt as an argument:

agent.run("Draw me a picture of the `prompt`", prompt="a capybara swimming in the sea")

Chat-based execution (chat) [[chat-based-execution-(chat)]]

The agent also has a chat-based approach, which uses the [~Agent.chat] method:

agent.chat("Generate a picture of rivers and lakes")
agent.chat("Transform the picture so that there is a rock in there")

This is an interesting approach when you want to keep state across instructions. It is better for experimentation, but it tends to work much better on single instructions than on complex ones (which the [~Agent.run] method handles better).

This method can also take arguments if you want to pass non-text types or specific prompts.
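The difference between the two entry points can be pictured with a toy model. This is only a sketch: ToyAgent and its chat_state attribute are hypothetical names, no model or tool is called, and the return value simply lists which variables would be visible to the generated code. It assumes that run starts fresh on every call while chat carries variables forward.

```python
class ToyAgent:
    """Toy model of the two entry points; it does not call any model or tool."""

    def __init__(self):
        # Hypothetical store for variables that survive between chat calls.
        self.chat_state = {}

    def run(self, task, **variables):
        # Each run sees only the variables passed in this call.
        state = dict(variables)
        return sorted(state)

    def chat(self, task, **variables):
        # Chat merges new variables into state kept from earlier calls.
        self.chat_state.update(variables)
        return sorted(self.chat_state)


agent = ToyAgent()
print(agent.run("Add an island to `picture`", picture="<image>"))
print(agent.run("Add a rock to the picture"))  # nothing carried over
print(agent.chat("Add an island to `picture`", picture="<image>"))
print(agent.chat("Add a rock to the picture"))  # `picture` is still available
```

With run, the second call would need picture passed again; with chat, it stays available until a new conversation is started.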

⚠️ Remote execution [[remote-execution]]

For demonstration purposes, and so that it can be used in any setup, we created remote executors for several of the default tools the agent has access to. These are built with inference endpoints. To see how to set up remote executor tools yourself, we recommend reading the custom tools guide.

To run with remote tools, simply specify remote=True in either [~Agent.run] or [~Agent.chat].

For example, the following command can run efficiently on any device, without needing a lot of RAM or a GPU:

agent.run("Draw me a picture of rivers and lakes", remote=True)

The same goes for [~Agent.chat]:

agent.chat("Draw me a picture of rivers and lakes", remote=True)

What's happening here? What are tools, and what are agents? [[whats-happening-here-what-are-tools-and-what-are-agents]]

Agents [[agents]]

The "agent" here is a large language model, and we prompt it so that it has access to a specific set of tools.

LLMs are quite good at generating small code samples, so the API takes advantage of this by prompting the LLM to provide a small code sample that performs a task with a set of tools. The prompt is then completed with the task you give the agent and the descriptions of the tools you give it. This way the agent has access to the documentation of the tools in use, in particular their expected inputs and outputs, and can generate the relevant code.
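As a rough picture of how such a prompt could be assembled, here is a minimal sketch. The template wording, the build_prompt helper, and the two tool names are all assumptions made for illustration; this is not the prompt that transformers actually ships.

```python
# Illustrative sketch only: the template wording and helper names are assumptions,
# not the prompt that transformers actually uses.
PROMPT_TEMPLATE = """I will ask you to perform a task. Write Python code that uses only these tools:

{tool_descriptions}

Task: {task}
Code:
"""


def build_prompt(task, tools):
    """Fill the template with one '- name: description' line per tool, then the task."""
    tool_descriptions = "\n".join(
        f"- {name}: {description}" for name, description in tools.items()
    )
    return PROMPT_TEMPLATE.format(tool_descriptions=tool_descriptions, task=task)


# Hypothetical tools, described by their expected inputs and outputs.
tools = {
    "image_captioner": "Takes an `image` and returns a caption as text.",
    "text_reader": "Takes a `text` and returns audio of it being read aloud.",
}
prompt = build_prompt("Caption the following image", tools)
print(prompt)
```

The LLM is then asked to continue the text after "Code:", which is where the small code sample comes from.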

Tools [[tools]]

Tools are very simple: each is a single function with a name and a description. We then use these tool descriptions to prompt the agent. Through the prompt, we show the agent how to leverage the tools to perform what the query requests.

We use brand-new tools rather than pipelines because the agent writes better code with very atomic tools. Pipelines are more refactored and often combine several tasks in one. Tools are meant to focus on one very simple task only.
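To make the "single function with a name and a description" idea concrete, here is a minimal sketch. SimpleTool and upper_caser are hypothetical stand-ins written for this example; they are not the tool class provided by transformers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: one function plus a name and a description."""

    name: str
    description: str
    func: Callable[[str], str]

    def __call__(self, text: str) -> str:
        # Calling the tool just calls the single wrapped function.
        return self.func(text)


# An atomic tool does exactly one thing.
upper_caser = SimpleTool(
    name="upper_caser",
    description="Takes a `text` and returns it in upper case.",
    func=str.upper,
)

print(f"{upper_caser.name}: {upper_caser.description}")
print(upper_caser("rivers and lakes"))
```

The name and description are what end up in the prompt, so a tool that does one narrow thing is easier to describe accurately in a single line.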

Code execution?! [[code-execution]]

This code is then executed with our small Python interpreter on the set of inputs passed along with your tools. We can hear you screaming "Arbitrary code execution!", but let us explain why that is not the case.

The only functions that can be called are the tools you provided and the print function, so what can be executed is already limited. You should be safe as long as it is limited to Hugging Face tools.

In addition, we do not allow attribute lookups or imports (which should not be needed anyway to pass inputs and outputs to a small set of functions), so the most obvious attacks (which would require prompting the LLM to output them in the first place) should not be an issue. If you want to be extra safe, you can execute the run() method with the additional argument return_code=True. In that case the agent simply returns the code to execute, and you can decide whether to run it.

Execution stops at any line that attempts an illegal operation, or if the code generated by the agent contains a regular Python error.
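The restrictions described above (only the provided tools and print may be called, no attribute lookups, no imports) can be pictured with a small static check built on Python's ast module. This is a sketch under our own assumptions, not the interpreter that transformers uses: check_generated_code is a hypothetical helper, and it only inspects the code rather than evaluating it.

```python
import ast


def check_generated_code(code, allowed_functions):
    """Raise ValueError if `code` imports, looks up attributes, or calls unknown functions."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed")
        if isinstance(node, ast.Attribute):
            raise ValueError("attribute lookup is not allowed")
        if isinstance(node, ast.Call):
            # Only plain names on the allowlist may be called.
            if not isinstance(node.func, ast.Name) or node.func.id not in allowed_functions:
                raise ValueError("call to a function that is not an allowed tool")


# Hypothetical allowlist: one tool plus print.
allowed = {"image_generator", "print"}

# Passes: only calls an allowed tool and print.
check_generated_code('image = image_generator(prompt="rivers and lakes")\nprint(image)', allowed)

# Rejected: each snippet breaks at least one of the rules.
for bad in ("import os", "x = ''.join([])", "open('f')"):
    try:
        check_generated_code(bad, allowed)
    except ValueError as err:
        print(f"{bad!r} rejected: {err}")
```

Code with a syntax error would fail even earlier, since ast.parse raises SyntaxError before any rule is checked.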

A curated set of tools [[a-curated-set-of-tools]]

We have identified a set of tools that can empower such agents. Here is the up-to-date list of the tools we have integrated:

  • Document question answering: given a document in image format (such as a PDF), answers a question about the document. (Donut)
  • Text question answering: given a long text and a question, answers the question from the text. (Flan-T5)
  • Unconditional image captioning: captions the image! (BLIP)
  • Image question answering: given an image, answers a question about the image. (VILT)
  • Image segmentation: given an image and a prompt, outputs the segmentation mask for that prompt. (CLIPSeg)
  • Speech to text: given an audio recording of a person talking, transcribes the speech into text. (Whisper)
  • Text to speech: converts text to speech. (SpeechT5)
  • Zero-shot text classification: given a text and a list of labels, identifies the label most relevant to the text. (BART)
  • Text summarization: summarizes a long text in one or a few sentences. (BART)
  • Translation: translates the text into a given language. (NLLB)

These tools are integrated in transformers and can also be used manually, for example:

from transformers import load_tool

tool = load_tool("text-to-speech")
audio = tool("This is a text to speech tool")

Custom tools [[custom-tools]]

While there is a curated set of tools, the greatest value of this implementation is the ability to quickly create and share custom tools.

By pushing a tool's code to a Hugging Face Space or a model repository, you can use the tool directly with the agent. We have added a few transformers-agnostic tools to the huggingface-tools organization:

  • Text downloader: downloads text from a web URL.
  • Text to image: generates an image from a prompt, leveraging Stable Diffusion.
  • Image transformation: modifies an image given an initial image and a prompt, leveraging InstructPix2Pix, which builds on Stable Diffusion.
  • Text to video: generates a short video from a prompt, leveraging damo-vilab.

The text-to-image tool we have been using from the start is a remote tool that lives in huggingface-tools/text-to-image! We will keep releasing such tools in this and other organizations to further strengthen this implementation.

By default, agents have access to the tools in huggingface-tools. The following guide explains how to write and share your own tools and how to leverage custom tools that live on the Hub.

Code generation [[code-generation]]

So far we have shown how to use agents to perform tasks. However, the agent only generates code, which we then execute with a very restricted Python interpreter. If you want to use the generated code in a different setting, you can prompt the agent to return the code along with the tool definitions and accurate imports.

For example, the following instruction:

agent.run("Draw me a picture of rivers and lakes", return_code=True)

returns the following code:

from transformers import load_tool

image_generator = load_tool("huggingface-tools/text-to-image")

image = image_generator(prompt="rivers and lakes")

You can modify and run this code yourself.