Neural Machine Translation with Attention πŸš€

A PyTorch implementation of a Sequence-to-Sequence model with Attention for English-Spanish translation.

Python PyTorch License

🌟 Features

  • Bidirectional GRU Encoder: Captures context from both directions of the input sequence
  • Attention Mechanism: Helps the model focus on relevant parts of the input sequence
  • Teacher Forcing: Implements curriculum learning for better training stability
  • Dynamic Batching: Efficient training with variable sequence lengths
  • Hugging Face Integration: Uses MarianTokenizer for robust text processing

πŸ—οΈ Architecture

The model consists of three main components:

  1. Encoder: Bidirectional GRU network that processes input sequences
  2. Attention: Computes attention weights for each encoder state
  3. Decoder: GRU network that generates translations using attention context
Input β†’ Encoder β†’ Attention β†’ Decoder β†’ Translation
          ↑           ↑           ↑
     Embeddings    Context   Attention Weights
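The attention component can be sketched as additive (Bahdanau-style) attention over the bidirectional encoder states. The module below is an illustration consistent with the hyperparameters listed further down (`ENC_HID_DIM`/`DEC_HID_DIM`), not necessarily the project's exact implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    def __init__(self, enc_hid_dim, dec_hid_dim):
        super().__init__()
        # bidirectional encoder -> 2 * enc_hid_dim features per timestep
        self.attn = nn.Linear(enc_hid_dim * 2 + dec_hid_dim, dec_hid_dim)
        self.v = nn.Linear(dec_hid_dim, 1, bias=False)

    def forward(self, hidden, encoder_outputs):
        # hidden: [batch, dec_hid_dim]
        # encoder_outputs: [batch, src_len, enc_hid_dim * 2]
        src_len = encoder_outputs.shape[1]
        hidden = hidden.unsqueeze(1).repeat(1, src_len, 1)
        energy = torch.tanh(self.attn(torch.cat((hidden, encoder_outputs), dim=2)))
        scores = self.v(energy).squeeze(2)   # [batch, src_len]
        return F.softmax(scores, dim=1)      # weights over source positions sum to 1
```

The decoder multiplies these weights with the encoder outputs to form a context vector for each generated token.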

πŸš€ Quick Start

  1. Clone the repository:
git clone https://github.com/yourusername/nmt-attention.git
cd nmt-attention
  2. Install dependencies:
pip install torch transformers datasets
  3. Train the model:
python train.py
  4. Translate text:
from translate import translate

# assumes a trained `model` and the MarianTokenizer used in training are in scope
text = "How are you?"
translated = translate(model, text, tokenizer)
print(translated)

# Loading a saved model
model = Seq2Seq(encoder, decoder, device)
# map_location lets the checkpoint load on CPU-only machines;
# the path must match the checkpoint written by train.py
model.load_state_dict(torch.load('LSTM_text_generator.pth', map_location=device))
model.eval()
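At inference time the `translate` helper is presumably built around a greedy decoding loop: the decoder's own prediction is fed back in until an end-of-sequence token appears. A hedged sketch, with a generic `decoder(token, hidden)` interface and hypothetical `BOS`/`EOS` ids (the real values come from the tokenizer):

```python
import torch

# Hypothetical special-token ids for illustration only.
BOS, EOS, MAX_LEN = 1, 2, 20

@torch.no_grad()
def greedy_decode(decoder, hidden, bos=BOS, eos=EOS, max_len=MAX_LEN):
    """Greedy decoding sketch: feed the model's own prediction back in."""
    tok = torch.tensor([bos])
    out = []
    for _ in range(max_len):
        logits, hidden = decoder(tok, hidden)  # logits: [1, vocab]
        tok = logits.argmax(dim=1)             # pick the most likely token
        if tok.item() == eos:
            break
        out.append(tok.item())
    return out
```

Beam search would track several candidate prefixes instead of one, at the cost of a more involved loop.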

πŸ“Š Model Performance

Training metrics after 10 epochs:

  • Initial Loss: 11.147
  • Final Loss: 3.527
  • Training Time: ~2 hours on NVIDIA V100

πŸ”§ Hyperparameters

BATCH_SIZE = 32
LEARNING_RATE = 1e-3
CLIP = 1.0
N_EPOCHS = 10
ENC_EMB_DIM = 256
DEC_EMB_DIM = 256
ENC_HID_DIM = 512
DEC_HID_DIM = 512
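`CLIP` and teacher forcing come together inside the training step: the loss is accumulated per target position, gradients are clipped to the `CLIP` norm, and at each step the decoder is fed either the ground-truth token or its own prediction. A toy sketch with a minimal GRU decoder and made-up dimensions, not the project's actual `train.py`:

```python
import random
import torch
import torch.nn as nn

CLIP, TEACHER_FORCING_RATIO = 1.0, 0.5
VOCAB, EMB, HID = 50, 16, 32  # toy sizes for the sketch

class ToyDecoder(nn.Module):
    """Minimal GRU decoder used only to illustrate the training step."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.gru = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, tok, hidden):
        e = self.emb(tok.unsqueeze(1))         # [batch, 1, EMB]
        o, hidden = self.gru(e, hidden)
        return self.out(o.squeeze(1)), hidden  # logits: [batch, VOCAB]

def train_step(dec, tgt, optimizer, criterion):
    """One optimisation step with teacher forcing and gradient clipping."""
    batch, tgt_len = tgt.shape
    hidden = torch.zeros(1, batch, HID)
    tok, loss = tgt[:, 0], 0.0                 # start from the BOS column
    for t in range(1, tgt_len):
        logits, hidden = dec(tok, hidden)
        loss = loss + criterion(logits, tgt[:, t])
        # teacher forcing: sometimes feed the ground truth, sometimes the prediction
        teacher_force = random.random() < TEACHER_FORCING_RATIO
        tok = tgt[:, t] if teacher_force else logits.argmax(1)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(dec.parameters(), CLIP)  # cap gradient norm
    optimizer.step()
    return loss.item() / (tgt_len - 1)
```

Clipping the total gradient norm to 1.0 guards against the exploding gradients RNNs are prone to early in training.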

πŸ“š Dataset

Using the loresiensis/corpus-en-es dataset from Hugging Face Hub, which provides English-Spanish sentence pairs for training.

🀝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Attention Is All You Need paper
  • Hugging Face for the transformers library and datasets
  • PyTorch team for the amazing deep learning framework

⭐️ If you found this project helpful, please consider giving it a star!
