---
license: apache-2.0
language:
- en
---

# Model Card for Xylaria-1.4-smol

## Model Details

### Model Description

**Xylaria-1.4-smol** is a highly compact Recurrent Neural Network (RNN) with just **1 MB of storage** and **2 million parameters**. It is designed for efficiency and optimized for resource-constrained environments.

- **Developed by:** Sk Md Saad Amin
- **Model type:** Recurrent Neural Network (RNN)
- **Parameters:** 2 million
- **Storage Size:** 1 MB
- **Language(s):** English
- **License:** Apache-2.0

### Direct Use

Xylaria-1.4-smol is ideal for:

- Edge computing applications
- Mobile and IoT devices
- Low-resource environment deployments
- Real-time inference with minimal computational overhead

### Downstream Use

The model can be fine-tuned for various tasks such as:

- Lightweight text generation
- Simple sequence prediction
- Embedded system applications
- Educational demonstrations of efficient neural network design
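As a sketch, fine-tuning could follow a standard PyTorch character-level training loop. The tiny stand-in model, dummy batch, and hyperparameters below are illustrative assumptions, not part of the released checkpoint:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the released model: a tiny char-level LSTM.
class TinyCharRNN(nn.Module):
    def __init__(self, vocab_size=108, embedding_dim=50, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.rnn = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        out, _ = self.rnn(self.embedding(x))
        return self.fc(out)

model = TinyCharRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss(ignore_index=0)  # ignore <PAD> positions

# Dummy next-character batch: in practice, targets are inputs shifted by one.
x = torch.randint(1, 108, (4, 16))
targets = torch.randint(1, 108, (4, 16))

for step in range(3):  # a few illustrative optimization steps
    optimizer.zero_grad()
    logits = model(x)  # (batch, seq, vocab)
    loss = criterion(logits.view(-1, 108), targets.view(-1))
    loss.backward()
    optimizer.step()
```

For a real fine-tune you would swap the dummy tensors for encoded text from your own corpus.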

### Out-of-Scope Use

- High-complexity natural language processing tasks
- Applications requiring extensive computational resources
- Tasks demanding state-of-the-art accuracy in complex domains
- Heavy workloads in general, as the model is intended primarily for educational and research purposes

## Bias, Risks, and Limitations

- Limited capacity due to the compact design
- Potential performance trade-offs on complex tasks
- May not perform as well as larger models on nuanced tasks
- Extremely small vocabulary of only 108 characters
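Because the vocabulary is character-level with only 108 entries, any character outside it must fall back to `<UNK>`. A minimal sketch of that encoding step (the mapping shown is a small illustrative subset of the full table):

```python
# Illustrative subset of the 108-entry character vocabulary.
char_to_idx = {"<PAD>": 0, " ": 1, "A": 33, "a": 64, "b": 65, "<UNK>": 107}

def encode(text, char_to_idx):
    """Map each character to its index, falling back to <UNK>."""
    unk = char_to_idx["<UNK>"]
    return [char_to_idx.get(ch, unk) for ch in text]

print(encode("Ab ", char_to_idx))  # → [33, 65, 1]
print(encode("é", char_to_idx))    # unseen character maps to <UNK> → [107]
```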

### Recommendations

- Carefully evaluate performance for specific use cases
- Consider the model's limitations in critical applications
- Consider transfer learning and fine-tuning to adapt the model

### Model Architecture and Objective

- **Architecture:** Compact Recurrent Neural Network
- **Objective:** Efficient sequence processing
- **Key Features:**
  - Minimal parameter count
  - Reduced storage footprint
  - Low computational requirements
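As a back-of-the-envelope sanity check, the storage footprint can be estimated by hand for the configuration used in the example code (vocab 108, embedding 50, hidden 128, 2 LSTM layers); the helper below is illustrative:

```python
def lstm_params(input_size, hidden_size):
    # 4 gates, each with input weights, recurrent weights, and two bias
    # vectors (PyTorch keeps separate b_ih and b_hh).
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)

vocab, embed, hidden = 108, 50, 128
total = (
    vocab * embed                  # embedding table
    + lstm_params(embed, hidden)   # LSTM layer 1
    + lstm_params(hidden, hidden)  # LSTM layer 2
    + hidden * vocab + vocab       # output projection
)
print(total)                                            # 243588 parameters
print(f"{total * 4 / 1024 / 1024:.2f} MB at float32")   # 0.93 MB
```

At 4 bytes per float32 weight, this configuration fits comfortably in the ~1 MB footprint stated above.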

#### Hardware

Suitable for:

- Microcontrollers
- Mobile devices
- Edge computing platforms

#### Software

Compatible with:

- TensorFlow Lite
- PyTorch Mobile
- ONNX Runtime

## Citation

If you find this work helpful, please consider citing it:

**BibTeX:**

```bibtex
@misc{xylaria2024smol,
  title={Xylaria-1.4-smol: A Compact Efficient RNN},
  author={[Your Name]},
  year={2024}
}
```

## Example Code

You can include the Xylaria code like this:

```python
import torch
import torch.nn as nn


class XylariaSmolRNN(nn.Module):
    def __init__(self, config):
        super().__init__()

        self.vocab_size = config['vocab_size']
        self.embedding_dim = config['embedding_dim']
        self.hidden_dim = config['hidden_dim']
        self.num_layers = config['num_layers']
        self.char_to_idx = config['char_to_idx']

        self.embedding = nn.Embedding(
            num_embeddings=self.vocab_size,
            embedding_dim=self.embedding_dim,
            padding_idx=self.char_to_idx['<PAD>']
        )

        self.rnn = nn.LSTM(
            input_size=self.embedding_dim,
            hidden_size=self.hidden_dim,
            num_layers=self.num_layers,
            batch_first=True
        )

        self.fc = nn.Linear(self.hidden_dim, self.vocab_size)
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        embedded = self.embedding(x)
        rnn_out, (hidden, cell) = self.rnn(embedded)
        rnn_out = self.dropout(rnn_out)
        output = self.fc(rnn_out)
        return output, (hidden, cell)


def generate_text(model, start_char, max_length=100):
    # Start from the given character, falling back to <UNK> if unseen.
    current_char = torch.tensor(
        [[model.char_to_idx.get(start_char, model.char_to_idx['<UNK>'])]]
    )

    hidden = None
    cell = None
    generated_text = [start_char]
    idx_to_char = {idx: char for char, idx in model.char_to_idx.items()}

    for _ in range(max_length - 1):
        with torch.no_grad():
            embedded = model.embedding(current_char)
            if hidden is None:
                rnn_out, (hidden, cell) = model.rnn(embedded)
            else:
                rnn_out, (hidden, cell) = model.rnn(embedded, (hidden, cell))

            output = model.fc(rnn_out)

            # Sample the next character from the output distribution.
            probabilities = torch.softmax(output[0, -1], dim=0)
            next_char_idx = torch.multinomial(probabilities, 1).item()
            next_char = idx_to_char.get(next_char_idx, '<UNK>')

            generated_text.append(next_char)
            current_char = torch.tensor([[next_char_idx]])

            if next_char == '<UNK>':
                break

    return ''.join(generated_text)


def demonstrate_xylaria_model():
    model_config = {
        "vocab_size": 108,
        "embedding_dim": 50,
        "hidden_dim": 128,
        "num_layers": 2,
        "char_to_idx": {" ": 1, "!": 2, "\"": 3, "#": 4, "$": 5, "%": 6, "&": 7, "'": 8, "(": 9, ")": 10, "*": 11, "+": 12, ",": 13, "-": 14, ".": 15, "/": 16, "0": 17, "1": 18, "2": 19, "3": 20, "4": 21, "5": 22, "6": 23, "7": 24, "8": 25, "9": 26, ":": 27, ";": 28, "<": 29, "=": 30, ">": 31, "?": 32, "A": 33, "B": 34, "C": 35, "D": 36, "E": 37, "F": 38, "G": 39, "H": 40, "I": 41, "J": 42, "K": 43, "L": 44, "M": 45, "N": 46, "O": 47, "P": 48, "Q": 49, "R": 50, "S": 51, "T": 52, "U": 53, "V": 54, "W": 55, "X": 56, "Y": 57, "Z": 58, "[": 59, "\\": 60, "]": 61, "^": 62, "_": 63, "a": 64, "b": 65, "c": 66, "d": 67, "e": 68, "f": 69, "g": 70, "h": 71, "i": 72, "j": 73, "k": 74, "l": 75, "m": 76, "n": 77, "o": 78, "p": 79, "q": 80, "r": 81, "s": 82, "t": 83, "u": 84, "v": 85, "w": 86, "x": 87, "y": 88, "z": 89, "{": 90, "}": 91, "°": 92, "²": 93, "à": 94, "á": 95, "æ": 96, "é": 97, "í": 98, "ó": 99, "ö": 100, "–": 101, "‘": 102, "’": 103, "“": 104, "”": 105, "…": 106, "<PAD>": 0, "<UNK>": 107}
    }

    model = XylariaSmolRNN(model_config)

    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

    print(f"Total Parameters: {total_params}")
    print(f"Trainable Parameters: {trainable_params}")
    print(f"Model Size Estimate: {total_params * 4 / 1024 / 1024:.2f} MB")

    batch_size = 1
    sequence_length = 20
    x = torch.randint(0, model_config['vocab_size'], (batch_size, sequence_length))

    with torch.no_grad():
        output, (hidden, cell) = model(x)
        print("Model Output Shape:", output.shape)
        print("Hidden State Shape:", hidden.shape)
        print("Cell State Shape:", cell.shape)

    try:
        scripted_model = torch.jit.script(model)
        scripted_model.save("xylaria_smol_model.pt")
        print("Model exported for deployment")
    except Exception as e:
        print(f"Export failed: {e}")

    # Note: without trained weights, the sampled text will be gibberish.
    print("\nText Generation Example:")
    generated = generate_text(model, 'A')
    print(generated)


if __name__ == "__main__":
    demonstrate_xylaria_model()
```

PS: the code above may need minor adjustments for your setup.

## More Information

Xylaria-1.4-smol is an exercise in ultra-efficient neural network design, showing that useful sequence modeling can be achieved with minimal computational resources.