	---
	library_name: transformers
	language:
	- hi
	base_model: ar5entum/bart_eng_hin_mt
	tags:
	- generated_from_trainer
	model-index:
	- name: bart_eng_hin_mt
	  results: []
	---

	# bart_eng_hin_mt

	This model is a fine-tuned version of [danasone/bart-small-ru-en](https://huggingface.co/danasone/bart-small-ru-en) on the [cfilt/iitb-english-hindi](https://huggingface.co/datasets/cfilt/iitb-english-hindi) dataset.
	It achieves the following results on the evaluation set:
	- eval_loss: 0.5147
	- eval_model_preparation_time: 0.0051
	- eval_bleu: 11.8141
	- eval_gen_len: 122.6932
	- eval_runtime: 3.6543
	- eval_samples_per_second: 142.3
	- eval_steps_per_second: 1.642
	- step: 0
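
As a rough sanity check, the throughput figures above can be turned back into an approximate evaluation set size and step count. This is only a sketch: the inputs are the rounded values reported in this card, so the results are approximate, and `total_eval_batch_size` is taken from the training hyperparameters listed further down.

```python
import math

# Values copied from the evaluation results above (rounded as reported).
eval_runtime = 3.6543            # seconds
samples_per_second = 142.3
steps_per_second = 1.642
total_eval_batch_size = 88       # from the hyperparameters section of this card

# Samples and steps implied by runtime x throughput.
approx_samples = round(eval_runtime * samples_per_second)
approx_steps = round(eval_runtime * steps_per_second)

# Steps expected if the samples are split into batches of 88.
expected_steps = math.ceil(approx_samples / total_eval_batch_size)

print(approx_samples, approx_steps, expected_steps)  # 520 6 6
```

The two step counts agree, which suggests the evaluation set held about 520 sentence pairs.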

	## Model description

	An English-to-Hindi machine translation model built on a small BART model.

	## Inference and Evaluation

	```python
	import time

	import evaluate
	import torch
	from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


	class BartSmall:
	    """Thin wrapper around the tokenizer and seq2seq model for translation."""

	    def __init__(self, model_path='ar5entum/bart_eng_hin_mt', device=None):
	        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
	        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
	        # Use the GPU when one is available, otherwise fall back to CPU.
	        if device is None:
	            device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
	        self.device = device
	        self.model.to(device)

	    def predict(self, input_text):
	        """Translate a single English sentence to Hindi."""
	        inputs = self.tokenizer(input_text, return_tensors="pt", max_length=512,
	                                truncation=True).to(self.device)
	        pred_ids = self.model.generate(inputs.input_ids,
	                                       attention_mask=inputs.attention_mask,
	                                       max_length=512,
	                                       num_beams=4,
	                                       early_stopping=True)
	        prediction = self.tokenizer.decode(pred_ids[0], skip_special_tokens=True)
	        return prediction

	    def predict_batch(self, input_texts, batch_size=32):
	        """Translate a list of English sentences in batches."""
	        all_predictions = []
	        for i in range(0, len(input_texts), batch_size):
	            batch_texts = input_texts[i:i+batch_size]
	            inputs = self.tokenizer(batch_texts, return_tensors="pt", max_length=512,
	                                    truncation=True, padding=True).to(self.device)

	            with torch.no_grad():
	                # Pass the attention mask explicitly so padded positions are ignored.
	                pred_ids = self.model.generate(inputs.input_ids,
	                                               attention_mask=inputs.attention_mask,
	                                               max_length=512,
	                                               num_beams=4,
	                                               early_stopping=True)

	            predictions = self.tokenizer.batch_decode(pred_ids, skip_special_tokens=True)
	            all_predictions.extend(predictions)

	        return all_predictions


	# Pass device='cuda' to force GPU; the default picks one automatically.
	model = BartSmall()

	input_texts = [
	    "This is a repayable amount.",
	    "Watch this video to find out.",
	    "He was a father of two daughters and a son."
	]
	ground_truths = [
	    "यह शोध्य रकम है।",
	    "जानने के लिए देखें ये वीडियो.",
	    "वह दो बेटियों व एक बेटे का पिता था।"
	]

	# Time a single batched translation call.
	start = time.time()
	predictions = model.predict_batch(input_texts, batch_size=len(input_texts))
	end = time.time()
	print("TIME: ", end-start)

	for i in range(len(input_texts)):
	    print("‾‾‾‾‾‾‾‾‾‾‾‾")
	    print("Input text:\t", input_texts[i])
	    print("Prediction:\t", predictions[i])
	    print("Ground Truth:\t", ground_truths[i])

	# Corpus-level BLEU over the three examples.
	bleu = evaluate.load("bleu")
	results = bleu.compute(predictions=predictions, references=ground_truths)
	print(results)

	# TIME: 3.65848970413208
	# ‾‾‾‾‾‾‾‾‾‾‾‾
	# Input text: This is a repayable amount.
	# Prediction: यह एक चुकौती राशि है।
	# Ground Truth: यह शोध्य रकम है।
	# ‾‾‾‾‾‾‾‾‾‾‾‾
	# Input text: Watch this video to find out.
	# Prediction: इस वीडियो को बाहर ढूंढने के लिए इस वीडियो को देख
	# Ground Truth: जानने के लिए देखें ये वीडियो.
	# ‾‾‾‾‾‾‾‾‾‾‾‾
	# Input text: He was a father of two daughters and a son.
	# Prediction: वह दो बेटियों और एक पुत्र के पिता थे।
	# Ground Truth: वह दो बेटियों व एक बेटे का पिता था।
	# {'bleu': 0.0, 'precisions': [0.4, 0.13636363636363635, 0.05263157894736842, 0.0], 'brevity_penalty': 1.0, 'length_ratio': 1.25, 'translation_length': 25, 'reference_length': 20}
	```
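
The `'bleu': 0.0` in the output above does not mean the translations share nothing with the references. BLEU is the brevity penalty times the geometric mean of the n-gram precisions, and without smoothing a single zero precision (here the 4-gram one) drives the whole score to zero. The sketch below recombines the reported precisions by hand; `bleu_from_precisions` is a hypothetical helper written for illustration, not part of the `evaluate` library.

```python
import math


def bleu_from_precisions(precisions, brevity_penalty=1.0):
    """Combine n-gram precisions into BLEU: BP * exp(mean(log p_n))."""
    # Unsmoothed BLEU: any zero precision makes the geometric mean zero.
    if min(precisions) <= 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)


# Precisions and brevity penalty as reported in the output above.
reported = [0.4, 0.13636363636363635, 0.05263157894736842, 0.0]

print(bleu_from_precisions(reported, brevity_penalty=1.0))  # 0.0
# Dropping the 4-gram term gives a non-zero score over orders 1 to 3.
print(bleu_from_precisions(reported[:3], brevity_penalty=1.0))
```

With only three short sentences, a zero 4-gram precision is common, so this figure says little about overall quality. Note also that `eval_bleu: 11.8141` near the top of this card appears to be on a 0 to 100 scale, whereas the `bleu` metric here returns a value between 0 and 1. That reading is an assumption, since the card does not name the metric implementation used during training.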

	## Training procedure
	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 22
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- total_train_batch_size: 32
	- total_eval_batch_size: 88
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
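
The batch totals and the learning-rate decay implied by these settings can be sketched in plain Python. The sketch assumes zero warmup steps (the card does not list any), and `total_steps = 1000` is a hypothetical placeholder because the actual number of optimizer steps is not reported. `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Per-device batch sizes multiplied by the number of devices.
per_device_train, per_device_eval, num_devices = 8, 22, 4
total_train = per_device_train * num_devices   # 32, matches total_train_batch_size
total_eval = per_device_eval * num_devices     # 88, matches total_eval_batch_size
print(total_train, total_eval)

total_steps = 1000  # hypothetical; the real step count is not given in this card
for step in (0, 500, 1000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the learning rate starts at 5e-05, reaches half that value midway through training, and hits zero at the final step.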

	### Framework versions

	- Transformers 4.45.0.dev0
	- Pytorch 2.3.0+cu121
	- Datasets 2.20.0
	- Tokenizers 0.19.1