|
--- |
|
language: |
|
- en |
|
- ko |
|
metrics: |
|
- bleu |
|
base_model: |
|
- google/gemma-2-9b-it |
|
tags: |
|
- translation |
|
- korean |
|
- colloquial |
|
--- |
|
|
|
# gemma2_colloquial_korean_translator |
|
|
|
## Model Description |
|
|
|
This model is fine-tuned from the gemma-2-9b-it language model to translate English text into natural, fluent colloquial Korean. It improves translation accuracy and naturalness by reflecting the expressions and vocabulary used in everyday conversation. Training uses the PEFT (Parameter-Efficient Fine-Tuning) technique, specifically LoRA (Low-Rank Adaptation), for efficiency.
|
|
|
## Key Features |
|
|
|
- Base Model: google/gemma-2-9b-it
|
- Task: English colloquial → Korean translation |
|
- Training Technique: QLoRA (Quantized Low-Rank Adaptation)
|
- Quantization: 4-bit quantization (nf4) |
|
- LoRA Configuration (see the configuration sketch after this list):
|
- rank (r): 6 |
|
- alpha: 8 |
|
- dropout: 0.05 |
|
- target modules: "q_proj", "o_proj", "k_proj", "v_proj", "gate_proj", "up_proj", "down_proj" |
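
The settings above can be expressed with the `bitsandbytes` and `peft` libraries. The following is a minimal sketch that mirrors the listed values; the exact arguments used during training (e.g. compute dtype, double quantization) are assumptions and may differ.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization used for QLoRA (compute dtype assumed to be bfloat16)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=6,
    lora_alpha=8,
    lora_dropout=0.05,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```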
|
|
|
## Training Data |
|
|
|
The model was trained on a dataset consisting of English colloquial expressions and their corresponding Korean translations. The data was provided in JSON format. |
|
|
|
Specifically, we used the 'Korean-English Translation Parallel Corpus for Daily Life and Colloquial Expressions' from AI Hub. This dataset contains 500,000 English-Korean sentence pairs, which significantly enhances the model's ability to handle everyday expressions and colloquial language.
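
As a rough illustration of the format, a single training pair might look like the record below; the field names are hypothetical and the actual JSON schema of the preprocessed corpus may differ.

```python
# Hypothetical structure of one training pair; real field names may differ.
example_pair = {
    "english": "What's up?",  # source colloquial English
    "korean": "잘 지냈어?",     # target colloquial Korean
}
```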
|
|
|
## Training Settings |
|
|
|
- Epochs: 3 |
|
- Batch Size: 4 |
|
- Gradient Accumulation Steps: 4 |
|
- Learning Rate: 2e-4 |
|
- Weight Decay: 0.01 |
|
- Optimizer: AdamW (8-bit) |
|
- Max Sequence Length: 512 tokens |
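
For reference, these hyperparameters roughly correspond to the following `transformers` `TrainingArguments`; this is an illustrative sketch, not the actual training script, and the output path and exact optimizer flag are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./gemma2_colloquial_korean_translator",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",  # 8-bit AdamW; the exact optimizer flag is assumed
)
# The 512-token maximum sequence length is typically enforced on the trainer or
# tokenizer side (e.g. max_seq_length in trl's SFTTrainer).
```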
|
|
|
## Usage |
|
|
|
You can use this model to translate English colloquial expressions into Korean. Here's an example: |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "Soonchan/gemma2_colloquial_korean_translator"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

def translate(text):
    # Gemma-2 chat format: a user turn followed by the start of the model turn
    prompt = f"""<bos><start_of_turn>user
Please translate the following English colloquial expression into Korean:
{text}<end_of_turn>
<start_of_turn>model
"""
    # The prompt already contains <bos>, so don't add special tokens again
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    outputs = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens (the Korean translation)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Usage example
english_text = "What's up?"
korean_translation = translate(english_text)
print(korean_translation)
```
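
If this repository hosts only the LoRA adapter rather than merged weights, you can also load the base model explicitly and attach (or merge) the adapter with `peft`. This is a sketch under that assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2-9b-it"
adapter_id = "Soonchan/gemma2_colloquial_korean_translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the LoRA adapter; merge_and_unload folds the adapter weights into the
# base model so it can be used (or saved) as a standalone model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model = model.merge_and_unload()
```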
|
|
|
## Limitations |
|
|
|
- This model is specialized for colloquial expressions and may not be suitable for translating formal documents or technical content. |
|
- The model's output should always be reviewed, as it may generate inappropriate or inaccurate translations depending on the context. |
|
|
|
## License |
|
|
|
This model inherits the license of the original Gemma model. Please review the Gemma Terms of Use before using the model.
|
|
|
## References |
|
|
|
- [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) |
|
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
|
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314) |
|
|
|
## Framework Versions |
|
|
|
- PEFT 0.12.0 |