Update README.md
README.md
CHANGED
@@ -9,7 +9,7 @@ license: cc-by-nc-4.0
 ---
-**EgyBERT** is a large language model focused exclusively on
+**EgyBERT** is a large language model focused exclusively on Egyptian dialectal texts. The model was pretrained on two large-scale corpora: the Egyptian Tweets Corpus (ETC), which contains +34 million tweets, and the Egyptian Forum Corpus, which includes +44 million sentences collected from various online forums. The datasets comprise **10.4GB of text**. The code files along with the results are available on [repo](https://github.com/FaisalQarah/EgyBERT).