|
--- |
|
datasets: |
|
- PrompTart/PTT_advanced_en_ko |
|
language: |
|
- en |
|
- ko |
|
base_model: |
|
- beomi/Llama-3-KoEn-8B-Instruct-preview |
|
- meta-llama/Meta-Llama-3-8B |
|
library_name: transformers |
|
--- |
|
|
|
# Llama-3-KoEn-8B-Instruct-preview Fine-Tuned on Parenthetical Terminology Translation (PTT) Dataset |
|
|
|
## Model Overview |
|
|
|
This is a **Llama-3-KoEn-8B-Instruct-preview** model fine-tuned on the [**Parenthetical Terminology Translation (PTT)**](https://arxiv.org/abs/2410.00683) dataset. [The PTT dataset](https://huggingface.co/datasets/PrompTart/PTT_advanced_en_ko) focuses on translating technical terms accurately by placing the original English term in parentheses alongside its Korean translation, enhancing clarity and precision in specialized fields. This fine-tuned model is optimized for handling technical terminology in the **Artificial Intelligence (AI)** domain. |
|
|
|
|
|
## Example Usage |
|
|
|
Here's how to use this fine-tuned model with the Hugging Face `transformers` library:
|
|
|
```python |
|
import transformers |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
# Load Model and Tokenizer |
|
model_name = "PrompTartLAB/Llama3ko_8B_inst_PTT_enko" |
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_name, |
|
torch_dtype="auto", |
|
device_map="auto", |
|
) |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
|
# Example sentence |
|
text = "The model was fine-tuned using knowledge distillation techniques. The training dataset was created using a collaborative multi-agent framework powered by large language models." |
|
prompt = f"Translate input sentence to Korean \n### Input: {text} \n### Translated:" |
|
|
|
# Tokenize and generate translation |
|
input_ids = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
outputs = model.generate(**input_ids, max_new_tokens=1024) |
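
# Decode only the newly generated tokens, skipping the prompt portion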
|
out_message = tokenizer.decode(outputs[0][len(input_ids["input_ids"][0]):], skip_special_tokens=True) |
|
|
|
# " μ΄ λͺ¨λΈμ μ§μ μ¦λ₯ κΈ°λ²(knowledge distillation techniques)μ μ¬μ©νμ¬ λ―ΈμΈ μ‘°μ λμμ΅λλ€. νλ ¨ λ°μ΄ν°μ
μ λν μΈμ΄ λͺ¨λΈ(large language models)λ‘ κ΅¬λλλ νλ ₯μ λ€μ€ μμ΄μ νΈ νλ μμν¬(collaborative multi-agent framework)λ₯Ό μ¬μ©νμ¬ μμ±λμμ΅λλ€." |
|
|
|
``` |
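
If you prefer a higher-level interface, the same prompt format also works with the `transformers` text-generation pipeline. The snippet below is a minimal sketch; the example sentence and generation settings are illustrative, not a recommended configuration:

```python
from transformers import pipeline

# Build a text-generation pipeline around the same checkpoint
pipe = pipeline(
    "text-generation",
    model="PrompTartLAB/Llama3ko_8B_inst_PTT_enko",
    torch_dtype="auto",
    device_map="auto",
)

text = "The encoder relies on multi-head attention and positional encoding."
prompt = f"Translate input sentence to Korean \n### Input: {text} \n### Translated:"

# return_full_text=False keeps only the generated translation, not the prompt
result = pipe(prompt, max_new_tokens=256, return_full_text=False)
print(result[0]["generated_text"])
```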
|
|
|
## Limitations |
|
|
|
- **Out-of-Domain Accuracy**: While the model generalizes to some extent, accuracy may vary in domains that were not part of the training set. |
|
- **Incomplete Parenthetical Annotation**: Not all technical terms are consistently annotated in parentheses; in some cases, terms may be omitted or not annotated as expected. A lightweight output check is sketched below.
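
If consistent annotation matters for your application, a simple post-processing heuristic (not part of the released model or dataset) can flag translations whose expected parenthetical terms are missing:

```python
import re

def parenthetical_terms(translation: str) -> set[str]:
    """Extract the English terms that appear in parentheses in a PTT-style translation."""
    return set(re.findall(r"\(([A-Za-z][A-Za-z0-9 \-/]*)\)", translation))

# Hypothetical check: compare the terms found in the output against the terms you expect
output = "이 모델은 지식 증류 기법(knowledge distillation techniques)을 사용하여 미세 조정되었습니다."
expected = {"knowledge distillation techniques", "fine-tuning"}
missing = expected - parenthetical_terms(output)
if missing:
    print(f"Missing parenthetical annotations: {missing}")
```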
|
|
|
## Citation |
|
|
|
If you use this model in your research, please cite the original dataset and paper: |
|
|
|
```tex |
|
@misc{myung2024efficienttechnicaltermtranslation, |
|
title={Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation}, |
|
author={Jiyoon Myung and Jihyeon Park and Jungki Son and Kyungro Lee and Joohyung Han}, |
|
year={2024}, |
|
eprint={2410.00683}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2410.00683}, |
|
} |
|
``` |
|
|
|
## Contact |
|
|
|
For questions or feedback, please contact [[email protected]](mailto:[email protected]). |