---
language:
- en
- ko
metrics:
- bleu
base_model:
- google/gemma-2-9b-it
tags:
- translation
- korean
- colloquial
---
# gemma2_colloquial_korean_translator
## Model Description
This model, fine-tuned from google/gemma-2-9b-it, translates English text into natural, fluent colloquial Korean. It improves translation accuracy and naturalness by reflecting the expressions and vocabulary of everyday conversation. Training used PEFT (Parameter-Efficient Fine-Tuning), specifically LoRA (Low-Rank Adaptation), for efficiency.
## Key Features
- Base Model: google/gemma-2-9b-it
- Task: English colloquial → Korean translation
- Training Technique: QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit quantization (nf4)
- LoRA Configuration:
  - rank (r): 6
  - alpha: 8
  - dropout: 0.05
  - target modules: `q_proj`, `o_proj`, `k_proj`, `v_proj`, `gate_proj`, `up_proj`, `down_proj`
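The quantization and LoRA settings above can be sketched with the `transformers` and `peft` APIs; the exact training-script details (compute dtype, task type) are assumptions, not the author's verified code:

```python
# Sketch of the 4-bit NF4 quantization and LoRA adapter configuration
# described above. bnb_4bit_compute_dtype is an assumption.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization, as used for QLoRA training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration matching the values listed above
lora_config = LoraConfig(
    r=6,
    lora_alpha=8,
    lora_dropout=0.05,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

The `bnb_config` would be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and the `lora_config` to `peft.get_peft_model`.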
## Training Data
The model was trained on a dataset consisting of English colloquial expressions and their corresponding Korean translations. The data was provided in JSON format.
Specifically, we used the 'Korean-English Translation Parallel Corpus for Daily Life and Colloquial Expressions' from AI Hub. This dataset contains 500,000 English-Korean sentence pairs, significantly enhancing the model's ability to handle everyday expressions and colloquial language.
## Training Settings
- Epochs: 3
- Batch Size: 4
- Gradient Accumulation Steps: 4
- Learning Rate: 2e-4
- Weight Decay: 0.01
- Optimizer: AdamW (8-bit)
- Max Sequence Length: 512 tokens
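The hyperparameters above might be expressed as follows with the `transformers` `TrainingArguments` API; the output directory and the choice of the paged 8-bit AdamW variant are assumptions, and the max sequence length of 512 would be enforced at tokenization time or by the trainer:

```python
# Sketch of the training settings listed above; not the author's exact script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./gemma2_colloquial_korean_translator",  # assumed path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="paged_adamw_8bit",  # 8-bit AdamW via bitsandbytes
)
```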
## Usage
You can use this model to translate English colloquial expressions into Korean. Here's an example:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "Soonchan/gemma2_colloquial_korean_translator"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype=torch.bfloat16
)

def translate(text):
    # Build the prompt in Gemma's chat turn format
    prompt = f"""<bos><start_of_turn>user
Please translate the following English colloquial expression into Korean:
{text}<end_of_turn>
<start_of_turn>model
"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Usage example
english_text = "What's up?"
korean_translation = translate(english_text)
print(korean_translation)
```
## Limitations
- This model is specialized for colloquial expressions and may not be suitable for translating formal documents or technical content.
- The model's output should always be reviewed, as it may generate inappropriate or inaccurate translations depending on the context.
## License
This model follows the license of the original Gemma model. Please check the relevant license before use.
## References
- [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/)
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
## Framework Versions
- PEFT 0.12.0