๋‰ด์Šค ๋ถ„์„ ๋ชจ๋ธ

์ด ์ €์žฅ์†Œ์—๋Š” ์ฃผ์–ด์ง„ ๋‰ด์Šค ๋ณธ๋ฌธ์„ ๋ถ„์„ํ•˜์—ฌ ๋‹ค์Œ์˜ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ชจ๋ธ์ด ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค:

  • Summarization: summarizes the main content of a news article in 1-3 lines.
  • Sentiment Analysis: rates the sentiment of the article as positive, negative, or neutral.
  • Stock Code Identification: extracts the stock codes of the companies mentioned in the article.
  • Advertisement Detection: determines whether the article body is an advertisement.

๋ชจ๋ธ ์ •๋ณด

๋ชจ๋ธ์€ meta-llama์˜ Llama-3.2-3B๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•™์Šต ํ•˜์˜€์œผ๋ฉฐ, Hugging Face์˜ transformers ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

  • ๋ชจ๋ธ: irene93/Llama3-news-analysis
  • ํ† ํฌ๋‚˜์ด์ €: AutoTokenizer
  • ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜: AutoModelForCausalLM

Installation

๋จผ์ € ํ™˜๊ฒฝ์„ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค:

pip install torch transformers

Usage

๋‹ค์Œ์€ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋‰ด์Šค ๊ธฐ์‚ฌ๋ฅผ ๋ถ„์„ํ•˜๋Š” ์˜ˆ์‹œ ์ฝ”๋“œ์ž…๋‹ˆ๋‹ค:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda:0"

# Load the model and tokenizer (the weights are published in FP16)
tokenizer = AutoTokenizer.from_pretrained('irene93/Llama3-news-analysis')
model = AutoModelForCausalLM.from_pretrained(
    'irene93/Llama3-news-analysis',
    torch_dtype=torch.float16,
).to(device)

user_content = """ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค๊ฐ€ โ€˜๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šคโ€™์™€ ์„ธ๊ณ„ ์ตœ๊ณ ์˜ ๋ฌด์ธ์ฐจ๋Ÿ‰ ๊ฐœ๋ฐœ์— ๋‚˜์„ ๋‹ค.
ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค๋Š” 19์ผ ์œ ๋Ÿฝ ์ตœ๋Œ€์˜ ๋ฌด์ธ์ฐจ๋Ÿ‰(UGV) ๊ธฐ์—…์ธ ๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šค์™€ โ€˜IDEX 2025โ€™์—์„œ ์ตœ์‹  ๊ถค๋„ํ˜• UGV์ธ T-RCV(Tracked-Robotic Combat Vehicle)์˜ ๊ณต๋™๊ฐœ๋ฐœ ๋ฐ ๊ธ€๋กœ๋ฒŒ์‹œ์žฅ ๊ณต๋žต์„ ์œ„ํ•œ ์ „๋žต์  ํŒŒํŠธ๋„ˆ์‹ญ์„ ํ™•๋Œ€ํ•œ๋‹ค๋Š” ๋‚ด์šฉ์˜ ์–‘ํ•ด๊ฐ์„œ๋ฅผ ์ฒด๊ฒฐํ–ˆ๋‹ค๊ณ  ๋ฐํ˜”๋‹ค.
์—์Šคํ† ๋‹ˆ์•„์˜ โ€˜๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šคโ€™๋Š” ๋ฏธ๊ตญ, ์˜๊ตญ, ํ”„๋ž‘์Šค ๋“ฑ ๋ถ๋Œ€์„œ์–‘์กฐ์•ฝ๊ธฐ๊ตฌ(NATO) 8๊ฐœ๊ตญ์„ ํฌํ•จํ•œ ์ด 16๊ฐœ๊ตญ์— ๊ถค๋„ํ˜• UGV๋ฅผ ๊ณต๊ธ‰ํ•˜๋Š” ๋“ฑ ๊ธ€๋กœ๋ฒŒ UGV์˜ ํ‘œ์ค€ํ™”๋ฅผ ์ฃผ๋„ํ•˜๋Š” ์„ธ๊ณ„ ์ตœ๊ณ  ์ˆ˜์ค€์˜ ๊ธฐ์ˆ ์„ ๋ณด์œ ํ•˜๊ณ  ์žˆ๋‹ค.

ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค๋Š” ์ฐจ๋ฅœํ˜• UGV โ€˜์•„๋ฆฌ์˜จ์Šค๋ฉงโ€™์„ ํ†ตํ•ด ๋ฏธ๊ตฐ์˜ ํ•ด์™ธ๋น„๊ต์„ฑ๋Šฅ์‹œํ—˜(FCT)์„ ์„ฑ๊ณต์ ์œผ๋กœ ์ˆ˜ํ–‰ํ•˜๊ณ , ์ฐจ์„ธ๋Œ€ UGV์ธ โ€˜๊ทธ๋ŸฐํŠธ(GRUNT)โ€™๋ฅผ ์ž์ฒด ๊ฐœ๋ฐœํ•˜๋Š” ๋“ฑ ๊ธ€๋กœ๋ฒŒ ์‹œ์žฅ์—์„œ ๊ธฐ์ˆ ๋ ฅ์„ ์ธ์ •๋ฐ›์œผ๋ฉด์„œ ์˜ฌํ•ด ํ•œ๊ตญ ์œก๊ตฐ์˜ ๋‹ค๋ชฉ์ ๋ฌด์ธ์ฐจ๋Ÿ‰ ๊ตฌ๋งค์‚ฌ์—…์ž ์„ ์ •์„ ์•ž๋‘๊ณ  ์žˆ๋‹ค.
ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค ์ธก์€ โ€œ์–‘์‚ฌ ํ˜‘๋ ฅ์„ ๋ฐ”ํƒ•์œผ๋กœ ๊ตญ๋‚ด์™ธ ๊ณ ๊ฐ๋“ค์—๊ฒŒ ๋น ๋ฅด๊ฒŒ ๋ณ€ํ™”ํ•˜๋Š” ํ˜„๋Œ€ ์ „ํˆฌ ํ™˜๊ฒฝ์— ๋Œ€์‘ํ•  ์ƒˆ๋กœ์šด ๋Œ€์•ˆ์„ ์ œ์‹œํ•˜๊ฒ ๋‹คโ€๊ณ  ํ–ˆ๋‹ค.

๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šค ์ธก๋„ โ€œ์–‘์‚ฌ์˜ ํ˜์‹ ์ ์ธ ๊ธฐ์ˆ ๊ณผ ํ’๋ถ€ํ•œ ๊ธ€๋กœ๋ฒŒ ์‹œ์žฅ ๊ฒฝํ—˜์„ ๋ฐ”ํƒ•์œผ๋กœ ์ตœ์ฒจ๋‹จ ๋ฌด์ธํ™” ์†”๋ฃจ์…˜ ๊ฐœ๋ฐœ์— ์ตœ์„ ์„ ๋‹คํ•˜๊ฒ ๋‹คโ€๊ณ  ๋งํ–ˆ๋‹ค."""

messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ ์ฃผ์–ด์ง„ ๋‰ด์Šค๋ฅผ ๋ถ„์„ํ•˜๋Š” ์ฑ—๋ด‡์ž…๋‹ˆ๋‹ค. **์ง€์‹œ์‚ฌํ•ญ**:- ์ฃผ์–ด์ง„ ๋‰ด์Šค์— ๋Œ€ํ•˜์—ฌ summary, advr, stk_code, sent_score ๋ถ„์„ํ•˜๊ณ  json ํ˜•ํƒœ๋กœ ์ถœ๋ ฅํ•˜์„ธ์š”. - summary๋Š” 1~3์ค„ ์‚ฌ์ด๋กœ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค.- advr๋Š” ํ•ด๋‹น ๋ณธ๋ฌธ์ด ๊ด‘๊ณ ๋ฉด 1 ๊ด‘๊ณ ๊ฐ€ ์•„๋‹๊ฒฝ์šฐ์— 0 ์œผ๋กœ ์ •์ˆ˜ 1๊ฐœ์˜ ๊ฐ’์œผ๋กœ ์ถœ๋ ฅํ•˜์„ธ์š”.- stk_code๋Š” ํ•ด๋‹น ๋ณธ๋ฌธ์—์„œ ์–ธ๊ธ‰๋œ ์ข…๋ชฉ๋ช…์„ ์ฐพ๊ณ , ๊ทธ ์ข…๋ชฉ๋ช…์˜ ์ข…๋ชฉ ์ฝ”๋“œ๋ฅผ ์ฐพ์•„ ํŒŒ์ด์ฌ ๋ฆฌ์ŠคํŠธ ํ˜•ํƒœ๋กœ ์ž‘์„ฑํ•˜์„ธ์š”. - sent_score๋Š” ํ•ด๋‹น ๋ณธ๋ฌธ์ด ๊ธ์ •์ ์ผ๊ฒฝ์šฐ 1 ๋ถ€์ •์ ์ผ๊ฒฝ์šฐ -1 , ๊ธ์ •์ ์ด์ง€๋„ ๋ถ€์ •์ ์ด์ง€๋„ ์•Š์„๊ฒฝ์šฐ 0 ์œผ๋กœ ์ •์ˆ˜ 1๊ฐœ์˜ ๊ฐ’์„ ์ถœ๋ ฅํ•˜์„ธ์š” - ๋ณธ๋ฌธ: ์ด ์ฃผ์–ด์ง€๋ฉด ๊ฒฐ๊ณผ: ๋‹ค์Œ์— json ํ˜•ํƒœ๋กœ ์ž‘์„ฑํ•˜์„ธ์š”"},
    {"role": "user", "content": user_content}
]

# Build the prompt with the chat template and move it to the GPU
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(device)

# Stop generation at the EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=False,
)

# Decode only the newly generated tokens, excluding the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

Example Output

{
  'summary': 'ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค๊ฐ€ ๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šค์™€ ํ˜‘๋ ฅํ•ด ๋ฌด์ธ์ฐจ๋Ÿ‰ ๊ฐœ๋ฐœ์— ๋‚˜์„ฐ์Šต๋‹ˆ๋‹ค.',
  'advr_tp': '0',
  'stk_code': ['012450'],
  'sent_score': 1
}
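Note that the example output above is formatted like a Python dict literal (single quotes) rather than strict JSON, so json.loads may fail on it. A minimal parsing sketch, assuming the model reliably emits this dict-like shape:

```python
import ast

# Raw text as returned by the model (single-quoted, dict-literal style)
raw = ("{'summary': 'ํ•œํ™”์—์–ด๋กœ์ŠคํŽ˜์ด์Šค๊ฐ€ ๋ฐ€๋ ˜ ๋กœ๋ณดํ‹ฑ์Šค์™€ ํ˜‘๋ ฅํ•ด ๋ฌด์ธ์ฐจ๋Ÿ‰ ๊ฐœ๋ฐœ์— ๋‚˜์„ฐ์Šต๋‹ˆ๋‹ค.', "
       "'advr_tp': '0', 'stk_code': ['012450'], 'sent_score': 1}")

# ast.literal_eval safely parses Python literals without executing code
result = ast.literal_eval(raw)
print(result["stk_code"])    # ['012450']
print(result["sent_score"])  # 1
```

If the model's output can deviate from this shape, wrapping the parse in a try/except for ValueError and SyntaxError is advisable.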

Requirements

  • torch
  • transformers

๋ผ์ด์„ ์Šค

์ด ํ”„๋กœ์ ํŠธ๋Š” MIT ๋ผ์ด์„ ์Šค๋ฅผ ๋”ฐ๋ฆ…๋‹ˆ๋‹ค.
