typo fix
README.md
CHANGED
@@ -309,8 +309,8 @@ output = tokenizer.batch_decode(output)
 print(output)
 ```
 
-**Model
-Granite-3.0-2B-Instruct is based on a decoder-only dense transformer architecture. Core components of this architecture are: GQA and RoPE, MLP with SwiGLU, RMSNorm, and shared input/output
+**Model Architecture:**
+Granite-3.0-2B-Instruct is based on a decoder-only dense transformer architecture. Core components of this architecture are: GQA and RoPE, MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
 
 | Model | 2B Dense | 8B Dense | 1B MoE | 3B MoE |
 | :-------- | :-------- | :--------| :--------| :--------|
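
Note on the corrected sentence: two of the components it names, RMSNorm and the SwiGLU MLP, can be illustrated with a minimal pure-Python sketch. This is not taken from the Granite code base. The function names, the `eps` default, and the list-of-lists weight layout are assumptions chosen for illustration only.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then apply a per-dimension scale.

    eps is a small constant for numerical stability (assumed value).
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major list-of-lists matrix by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x)), biases omitted."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)
```

With a unit `weight`, `rms_norm` returns a vector whose mean square is approximately 1, which is the property the normalization is meant to provide. "Shared input/output embeddings" means the same matrix is used to embed tokens and to project hidden states back to vocabulary logits; that part is not shown here.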