Neural Machine Translation with Attention πŸš€

A PyTorch implementation of a Sequence-to-Sequence model with Attention for English-Spanish translation.

Python PyTorch License

🌟 Features

  • Bidirectional GRU Encoder: Captures context from both directions of the input sequence
  • Attention Mechanism: Helps the model focus on relevant parts of the input sequence
  • Teacher Forcing: Implements curriculum learning for better training stability
  • Dynamic Batching: Efficient training with variable sequence lengths
  • Hugging Face Integration: Uses MarianTokenizer for robust text processing

πŸ—οΈ Architecture

The model consists of three main components:

  1. Encoder: Bidirectional GRU network that processes input sequences
  2. Attention: Computes attention weights for each encoder state
  3. Decoder: GRU network that generates translations using attention context
Input β†’ Encoder β†’ Attention β†’ Decoder β†’ Translation
      ↑          ↑          ↑
      Embeddings Context    Attention Weights

πŸš€ Quick Start

  1. Clone the repository:
git clone https://github.com/yourusername/nmt-attention.git
cd nmt-attention
  1. Install dependencies:
pip install torch transformers datasets
  1. Train the model:
python train.py
  1. Translate text:
from translate import translate
text = "How are you?"
translated = translate(model, text, tokenizer)
print(translated)

# Loading a saved model
model = Seq2Seq(encoder, decoder, device)
model.load_state_dict(torch.load('LSTM_text_generator.pth'))
model.eval()

πŸ“Š Model Performance

Training metrics after 10 epochs:

  • Initial Loss: 11.147
  • Final Loss: 3.527
  • Training Time: ~2 hours on NVIDIA V100

πŸ”§ Hyperparameters

BATCH_SIZE = 32
LEARNING_RATE = 1e-3
CLIP = 1.0
N_EPOCHS = 10
ENC_EMB_DIM = 256
DEC_EMB_DIM = 256
ENC_HID_DIM = 512
DEC_HID_DIM = 512

πŸ“š Dataset

Using the loresiensis/corpus-en-es dataset from Hugging Face Hub, which provides English-Spanish sentence pairs for training.

🀝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Attention Is All You Need paper
  • Hugging Face for the transformers library and datasets
  • PyTorch team for the amazing deep learning framework

⭐️ If you found this project helpful, please consider giving it a star!

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and HF Inference API was unable to determine this model's library.