|
--- |
|
language: |
|
- en |
|
- ko |
|
metrics: |
|
- bleu |
|
base_model: |
|
- google/gemma-2-9b-it |
|
tags: |
|
- translation |
|
- korean |
|
- colloquial |
|
--- |
|
|
|
# gemma2_colloquial_korean_translator |
|
|
|
## Model Description |
|
|
|
This model is fine-tuned from the gemma-2-9b-it language model to translate English text into natural, fluent colloquial Korean. It improves translation accuracy and naturalness by reflecting the expressions and vocabulary used in everyday conversation. Training uses the PEFT (Parameter-Efficient Fine-Tuning) technique, specifically LoRA (Low-Rank Adaptation), for efficiency.
|
|
|
## Key Features |
|
|
|
- Base Model: google/gemma-2-9b-it
|
- Task: English colloquial → Korean translation |
|
- Training Technique: QLoRA (Quantized Low-Rank Adaptation)
|
- Quantization: 4-bit quantization (nf4) |
|
- LoRA Configuration (see the configuration sketch after this list):
|
- rank (r): 6 |
|
- alpha: 8 |
|
- dropout: 0.05 |
|
- target modules: "q_proj", "o_proj", "k_proj", "v_proj", "gate_proj", "up_proj", "down_proj" |
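
The settings above can be expressed with the `bitsandbytes` and `peft` libraries. The following is a minimal sketch that mirrors the listed values; the exact arguments used during training (e.g. compute dtype, double quantization) are assumptions and may differ.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization used for QLoRA (compute dtype assumed to be bfloat16)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=6,
    lora_alpha=8,
    lora_dropout=0.05,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```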
|
|
|
## Training Data |
|
|
|
The model was trained on a dataset consisting of English colloquial expressions and their corresponding Korean translations. The data was provided in JSON format. |
|
|
|
Specifically, we used the 'Korean-English Translation Parallel Corpus for Daily Life and Colloquial Expressions' from AI Hub. This dataset contains 500,000 English-Korean sentence pairs, which significantly enhances the model's ability to handle everyday expressions and colloquial language.
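
As a rough illustration of the format, a single training pair might look like the record below; the field names are hypothetical and the actual JSON schema of the preprocessed corpus may differ.

```python
# Hypothetical structure of one training pair; real field names may differ.
example_pair = {
    "english": "What's up?",  # source colloquial English
    "korean": "잘 지냈어?",     # target colloquial Korean
}
```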
|
|
|
## Training Settings |
|
|
|
- Epochs: 3 |
|
- Batch Size: 4 |
|
- Gradient Accumulation Steps: 4 |
|
- Learning Rate: 2e-4 |
|
- Weight Decay: 0.01 |
|
- Optimizer: AdamW (8-bit) |
|
- Max Sequence Length: 512 tokens |
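
For reference, these hyperparameters roughly correspond to the following `transformers` `TrainingArguments`; this is an illustrative sketch, not the actual training script, and the output path and exact optimizer flag are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./gemma2_colloquial_korean_translator",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",  # 8-bit AdamW; the exact optimizer flag is assumed
)
# The 512-token maximum sequence length is typically enforced on the trainer or
# tokenizer side (e.g. max_seq_length in trl's SFTTrainer).
```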
|
|
|
## Usage |
|
|
|
You can use this model to translate English colloquial expressions into Korean. Here's an example: |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "Soonchan/gemma2_colloquial_korean_translator"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

def translate(text):
    # Gemma-2 chat format: a user turn followed by the start of the model turn
    prompt = f"""<bos><start_of_turn>user
Please translate the following English colloquial expression into Korean:
{text}<end_of_turn>
<start_of_turn>model
"""
    # The prompt already contains <bos>, so don't add special tokens again
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    outputs = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens (the Korean translation)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Usage example
english_text = "What's up?"
korean_translation = translate(english_text)
print(korean_translation)
```
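
If this repository hosts only the LoRA adapter rather than merged weights, you can also load the base model explicitly and attach (or merge) the adapter with `peft`. This is a sketch under that assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2-9b-it"
adapter_id = "Soonchan/gemma2_colloquial_korean_translator"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the LoRA adapter; merge_and_unload folds the adapter weights into the
# base model so it can be used (or saved) as a standalone model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model = model.merge_and_unload()
```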
|
|
|
## Limitations |
|
|
|
- This model is specialized for colloquial expressions and may not be suitable for translating formal documents or technical content. |
|
- The model's output should always be reviewed, as it may generate inappropriate or inaccurate translations depending on the context. |
|
|
|
## License |
|
|
|
This model inherits the license of the original Gemma model. Please review the Gemma Terms of Use before using the model.
|
|
|
## References |
|
|
|
- [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/) |
|
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
|
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314) |
|
|
|
## Framework Versions |
|
|
|
- PEFT 0.12.0 |