# Neural Machine Translation with Attention
A PyTorch implementation of a Sequence-to-Sequence model with Attention for English-Spanish translation.
## Features
- Bidirectional GRU Encoder: Captures context from both directions of the input sequence
- Attention Mechanism: Helps the model focus on relevant parts of the input sequence
- Teacher Forcing: Feeds ground-truth target tokens to the decoder during training for better stability (see the sketch after this list)
- Dynamic Batching: Efficient training with variable sequence lengths
- Hugging Face Integration: Uses MarianTokenizer for robust text processing
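
The training loop itself isn't listed in this README; below is a minimal sketch of how teacher forcing typically slots into a seq2seq decoding loop. The decoder signature and the `teacher_forcing_ratio` name are illustrative assumptions, not code from this repo.

```python
import random
import torch

def decode_with_teacher_forcing(decoder, encoder_outputs, hidden, trg,
                                teacher_forcing_ratio=0.5):
    """Run the decoder step by step; with probability `teacher_forcing_ratio`
    feed it the ground-truth token, otherwise feed it its own prediction."""
    _, trg_len = trg.shape                     # trg: [batch, trg_len]
    outputs = []
    input_tok = trg[:, 0]                      # <sos> token for every sequence
    for t in range(1, trg_len):
        logits, hidden = decoder(input_tok, hidden, encoder_outputs)
        outputs.append(logits)                 # logits: [batch, vocab]
        use_teacher = random.random() < teacher_forcing_ratio
        input_tok = trg[:, t] if use_teacher else logits.argmax(dim=-1)
    return torch.stack(outputs, dim=1)         # [batch, trg_len - 1, vocab]
```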
## Architecture
The model consists of three main components:
- Encoder: Bidirectional GRU network that processes input sequences
- Attention: Computes attention weights for each encoder state
- Decoder: GRU network that generates translations using attention context
```
Input → Encoder → Attention → Decoder → Translation
            ↓          ↓           ↓
       Embeddings   Context   Attention Weights
```
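
The module code isn't reproduced in this README; the sketch below shows one common way to realize the bidirectional GRU encoder and an additive (Bahdanau-style) attention layer matching the diagram above. All shapes, names, and the batch-first layout are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Bidirectional GRU encoder; projects the two final hidden states
    down to the decoder's hidden size."""
    def __init__(self, vocab_size, emb_dim, enc_hid_dim, dec_hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, enc_hid_dim, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(enc_hid_dim * 2, dec_hid_dim)

    def forward(self, src):
        embedded = self.embedding(src)                       # [B, S, E]
        outputs, hidden = self.rnn(embedded)                 # outputs: [B, S, 2H]
        # Concatenate the final forward and backward hidden states.
        hidden = torch.tanh(self.fc(torch.cat((hidden[-2], hidden[-1]), dim=1)))
        return outputs, hidden                               # hidden: [B, dec_hid]

class Attention(nn.Module):
    """Additive attention: score each encoder state against the decoder state."""
    def __init__(self, enc_hid_dim, dec_hid_dim):
        super().__init__()
        self.attn = nn.Linear(enc_hid_dim * 2 + dec_hid_dim, dec_hid_dim)
        self.v = nn.Linear(dec_hid_dim, 1, bias=False)

    def forward(self, dec_hidden, encoder_outputs):
        src_len = encoder_outputs.shape[1]
        dec_hidden = dec_hidden.unsqueeze(1).repeat(1, src_len, 1)  # [B, S, dec_hid]
        energy = torch.tanh(self.attn(torch.cat((dec_hidden, encoder_outputs), dim=2)))
        scores = self.v(energy).squeeze(2)                   # [B, S]
        return F.softmax(scores, dim=1)                      # attention weights
```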
## Quick Start
- Clone the repository:

```bash
git clone https://github.com/yourusername/nmt-attention.git
cd nmt-attention
```

- Install dependencies:

```bash
pip install torch transformers datasets
```

- Train the model:

```bash
python train.py
```

- Translate text:

```python
from translate import translate

text = "How are you?"
translated = translate(model, text, tokenizer)
print(translated)
```
```python
# Load a saved model for inference
model = Seq2Seq(encoder, decoder, device)
model.load_state_dict(torch.load('LSTM_text_generator.pth'))
model.eval()
```
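
The `translate` helper used in the Quick Start isn't listed in this README; a greedy-decoding sketch is given below. The decoder start token and the model's attribute names (`model.encoder`, `model.decoder`) are assumptions.

```python
import torch

@torch.no_grad()
def translate(model, text, tokenizer, device="cpu", max_len=50):
    """Greedy decoding sketch: encode once, then feed the decoder its own
    best guess at every step until <eos> (or max_len) is reached."""
    model.eval()
    src = tokenizer(text, return_tensors="pt").input_ids.to(device)
    encoder_outputs, hidden = model.encoder(src)
    # Assumed decoder start token; the repo's actual choice may differ.
    input_tok = torch.tensor([tokenizer.pad_token_id], device=device)
    out_ids = []
    for _ in range(max_len):
        logits, hidden = model.decoder(input_tok, hidden, encoder_outputs)
        input_tok = logits.argmax(dim=-1)
        if input_tok.item() == tokenizer.eos_token_id:
            break
        out_ids.append(input_tok.item())
    return tokenizer.decode(out_ids, skip_special_tokens=True)
```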
## Model Performance
Training metrics after 10 epochs:
- Initial Loss: 11.147
- Final Loss: 3.527
- Training Time: ~2 hours on NVIDIA V100
## Hyperparameters

```python
BATCH_SIZE = 32
LEARNING_RATE = 1e-3
CLIP = 1.0
N_EPOCHS = 10
ENC_EMB_DIM = 256
DEC_EMB_DIM = 256
ENC_HID_DIM = 512
DEC_HID_DIM = 512
```
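
These constants plug into training roughly as sketched below, including the gradient clipping that `CLIP` controls; `model`, `tokenizer`, and `train_loader` stand in for the objects built in `train.py`, and the output shape is an assumption.

```python
import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss(ignore_index=tokenizer.pad_token_id)

for epoch in range(N_EPOCHS):
    for src, trg in train_loader:
        optimizer.zero_grad()
        output = model(src, trg)  # assumed shape: [batch, trg_len - 1, vocab]
        loss = criterion(output.reshape(-1, output.size(-1)),
                         trg[:, 1:].reshape(-1))
        loss.backward()
        # CLIP bounds the gradient norm, which keeps GRU training stable
        torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP)
        optimizer.step()
```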
## Dataset
Training uses the `loresiensis/corpus-en-es` dataset from the Hugging Face Hub, which provides English-Spanish sentence pairs.
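
A quick way to pull the corpus, assuming the standard `train` split; check the dataset card for the exact column names:

```python
from datasets import load_dataset

dataset = load_dataset("loresiensis/corpus-en-es")
print(dataset["train"][0])  # inspect one English-Spanish pair
```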
## Contributing
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Attention Is All You Need paper
- Hugging Face for the transformers library and datasets
- PyTorch team for the amazing deep learning framework
⭐️ If you found this project helpful, please consider giving it a star!