- Developed by: lwef
- License: apache-2.0
- Finetuned from model: beomi/Llama-3-Open-Ko-8B
A Korean dialogue summarization fine-tuned model.
## How to use
```python
prompt_template = '''
아래 대화를 요약해 주세요. 대화 형식은 '#대화 참여자#: 대화 내용'입니다.
### 대화 >>>{dialogue}
### 요약 >>>'''
# The Korean instruction reads: "Please summarize the dialogue below.
# The dialogue format is '#participant#: utterance'."
# The generated summary follows the '### 요약 >>>' ("summary") marker.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lwef/llama3-8B-ko-dialogue-summary-finetuned",
    max_seq_length = 2048,
    dtype = None,          # auto-detect (bfloat16 on Ampere+ GPUs, float16 otherwise)
    load_in_4bit = True,   # load 4-bit quantized weights to reduce GPU memory
)
FastLanguageModel.for_inference(model)  # enable Unsloth's native 2x faster inference
# Example input: a multi-turn Korean chat, one '#speaker#: utterance' turn per line.
# Gist: P01 has to pad an assignment to five pages, is stuck at one page, and is annoyed
# that it is a plain assignment rather than a midterm replacement.
dialogue = '''#P01#: 아 과제 너무 어려워... 5쪽 쓸 게 없는데 ㅡㅡ
#P01#: 5쪽 대충 의식의 흐름대로 쭉 써야지.. 이제 1쪽임;;
#P02#: 이거 중간대체여??
#P01#: ㄴㄴ 그냥 과제임 그래서 더 짜증남'''
formatted_prompt = prompt_template.format(dialogue=dialogue)
# Tokenize the formatted prompt and move it to the GPU
inputs = tokenizer(formatted_prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens = 128,
    eos_token_id = tokenizer.eos_token_id,  # use the EOS token to explicitly mark the end of the output
    use_cache = True,
)
decoded_outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
result = decoded_outputs[0]
print(result)
result = result.split('### 요약 >>>')[-1].strip()  # keep only the text after the summary marker
print(result)
```
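For repeated calls, the prompt formatting, generation, and post-processing above can be wrapped in a small helper. This is a convenience sketch that reuses the already loaded `model`, `tokenizer`, and `prompt_template`; the `summarize` function name and its defaults are illustrative, not part of the repository.

```python
def summarize(dialogue: str, max_new_tokens: int = 128) -> str:
    """Format a '#speaker#: utterance' dialogue, generate, and return only the summary text."""
    prompt = prompt_template.format(dialogue=dialogue)
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        eos_token_id=tokenizer.eos_token_id,
        use_cache=True,
    )
    text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    # Everything after the '### 요약 >>>' marker is the generated summary.
    return text.split('### 요약 >>>')[-1].strip()

print(summarize(dialogue))
```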
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
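For reference, a typical Unsloth + TRL supervised fine-tuning setup for this kind of model looks roughly like the sketch below. It is an illustration under assumptions, not the actual training script: the LoRA settings, hyperparameters, toy dataset, and the older `SFTTrainer` keyword interface (as used in Unsloth notebooks) are all placeholders; only the base model, the prompt template, and the Unsloth/TRL toolchain come from this card.

```python
# Minimal sketch of an Unsloth + TRL SFT run (illustrative values, not this card's real config).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

base, tok = FastLanguageModel.from_pretrained(
    model_name="beomi/Llama-3-Open-Ko-8B",  # base model listed on this card
    max_seq_length=2048,
    load_in_4bit=True,
)
base = FastLanguageModel.get_peft_model(
    base,
    r=16,  # placeholder LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Toy dataset: each training example is the inference prompt plus a reference summary and EOS.
pairs = [{"dialogue": "#P01#: 내일 회의 몇 시야? #P02#: 오후 2시!", "summary": "내일 회의는 오후 2시이다."}]
train_ds = Dataset.from_list([
    {"text": prompt_template.format(dialogue=p["dialogue"]) + p["summary"] + tok.eos_token}
    for p in pairs
])

trainer = SFTTrainer(
    model=base,
    tokenizer=tok,
    train_dataset=train_ds,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,    # placeholder
        learning_rate=2e-4,    # placeholder
        logging_steps=10,
    ),
)
trainer.train()
```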