Hanish09 committed
Commit 7772580 · verified · 1 Parent(s): 42f6a7c

Update README.md
Files changed (1): README.md (+111 −3)
---
license: mit
language:
- en
- es
---
# Neural Machine Translation with Attention 🚀

A PyTorch implementation of a Sequence-to-Sequence model with Attention for English-Spanish translation.

![Python](https://img.shields.io/badge/Python-3.8%2B-blue)
![PyTorch](https://img.shields.io/badge/PyTorch-2.0%2B-orange)
![License](https://img.shields.io/badge/license-MIT-green)
## 🌟 Features

- **Bidirectional GRU Encoder**: Captures context from both directions of the input sequence
- **Attention Mechanism**: Lets the decoder focus on the relevant parts of the input sequence at each step
- **Teacher Forcing**: Feeds ground-truth tokens to the decoder during training for better stability (see the sketch after this list)
- **Dynamic Batching**: Efficient training with variable sequence lengths
- **Hugging Face Integration**: Uses MarianTokenizer for robust text processing

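To make the teacher-forcing idea concrete, here is a minimal sketch of a decoding loop, not the repo's actual `train.py`; `decoder`, `hidden`, `encoder_outputs`, and `trg` are placeholder names:

```python
import random
import torch

def decode_with_teacher_forcing(decoder, hidden, encoder_outputs, trg,
                                teacher_forcing_ratio=0.5):
    """At each step, feed the ground-truth token with probability
    `teacher_forcing_ratio`; otherwise feed the model's own prediction."""
    trg_len = trg.shape[0]
    outputs = []
    input_tok = trg[0]  # <sos> tokens for the whole batch
    for t in range(1, trg_len):
        output, hidden = decoder(input_tok, hidden, encoder_outputs)
        outputs.append(output)
        use_teacher = random.random() < teacher_forcing_ratio
        input_tok = trg[t] if use_teacher else output.argmax(1)
    return torch.stack(outputs)  # [trg_len - 1, batch, vocab]
```
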
## 🏗️ Architecture

The model consists of three main components:

1. **Encoder**: Bidirectional GRU network that processes input sequences
2. **Attention**: Computes attention weights for each encoder state
3. **Decoder**: GRU network that generates translations using attention context

```plaintext
Input → Encoder → Attention → Decoder → Translation
           ↑          ↑           ↑
      Embeddings   Context   Attention Weights
```

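The attention component above can be written as a small PyTorch module. This is a hedged sketch of additive (Bahdanau-style) attention, sized to match the bidirectional encoder and the `ENC_HID_DIM`/`DEC_HID_DIM` values listed under Hyperparameters; the module in this repo may differ in detail:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Scores every encoder state against the current decoder hidden state
    and softmax-normalizes the scores into attention weights."""
    def __init__(self, enc_hid_dim, dec_hid_dim):
        super().__init__()
        # the encoder is bidirectional, so its outputs are 2 * enc_hid_dim wide
        self.attn = nn.Linear(enc_hid_dim * 2 + dec_hid_dim, dec_hid_dim)
        self.v = nn.Linear(dec_hid_dim, 1, bias=False)

    def forward(self, hidden, encoder_outputs):
        # hidden: [batch, dec_hid_dim]; encoder_outputs: [src_len, batch, 2 * enc_hid_dim]
        src_len = encoder_outputs.shape[0]
        hidden = hidden.unsqueeze(1).repeat(1, src_len, 1)
        encoder_outputs = encoder_outputs.permute(1, 0, 2)
        energy = torch.tanh(self.attn(torch.cat((hidden, encoder_outputs), dim=2)))
        scores = self.v(energy).squeeze(2)  # [batch, src_len]
        return F.softmax(scores, dim=1)     # one weight per source position
```
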
## 🚀 Quick Start

1. Clone the repository:
```bash
git clone https://github.com/yourusername/nmt-attention.git
cd nmt-attention
```

2. Install dependencies:
```bash
pip install torch transformers datasets
```

3. Train the model:
```bash
python train.py
```

4. Translate text:
```python
import torch

from translate import translate

# Rebuild the architecture and load the saved checkpoint first
# (encoder, decoder, device, and tokenizer are set up as in train.py)
model = Seq2Seq(encoder, decoder, device)
model.load_state_dict(torch.load('LSTM_text_generator.pth', map_location=device))
model.eval()

text = "How are you?"
translated = translate(model, text, tokenizer)
print(translated)
```

## 📊 Model Performance

Training metrics after 10 epochs:
- Initial Loss: 11.147
- Final Loss: 3.527
- Training Time: ~2 hours on an NVIDIA V100

## 🔧 Hyperparameters

```python
BATCH_SIZE = 32
LEARNING_RATE = 1e-3
CLIP = 1.0
N_EPOCHS = 10
ENC_EMB_DIM = 256
DEC_EMB_DIM = 256
ENC_HID_DIM = 512
DEC_HID_DIM = 512
```

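For context, here is a rough sketch of how these constants typically enter the training step (the real loop lives in `train.py`; `model`, `train_loader`, and `pad_token_id` are placeholders):

```python
import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss(ignore_index=pad_token_id)  # skip padding in the loss

for epoch in range(N_EPOCHS):
    for src, trg in train_loader:
        optimizer.zero_grad()
        output = model(src, trg)                    # [trg_len, batch, vocab]
        loss = criterion(output[1:].flatten(0, 1),  # drop the <sos> position
                         trg[1:].flatten())
        loss.backward()
        # CLIP caps the gradient norm, which stabilizes RNN training
        torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP)
        optimizer.step()
```
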
## 📚 Dataset

Training uses the `loresiensis/corpus-en-es` dataset from the Hugging Face Hub, which provides English-Spanish sentence pairs.

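A quick way to pull and inspect the data (the split and column names here are assumptions; check the dataset card on the Hub):

```python
from datasets import load_dataset

dataset = load_dataset("loresiensis/corpus-en-es")
print(dataset)               # shows the available splits and column names
print(dataset["train"][0])   # assumes a 'train' split exists
```
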
## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [Attention Is All You Need](https://arxiv.org/abs/1706.03762) paper
- Hugging Face for the transformers library and datasets
- PyTorch team for the amazing deep learning framework

---
⭐️ If you found this project helpful, please consider giving it a star!