## **Pretraining on Phosphosites and their MSAs with MLM Objective on ESM-1b Architecture**
This repository presents a pretrained ESM-1b architecture whose weights are initialized **from scratch** and trained with the Masked Language Modeling (MLM) objective. The training data consists of labeled phosphosites derived from [DARKIN](https://openreview.net/forum?id=a4x5tbYRYV) and their Multiple Sequence Alignments (MSAs).
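For readers unfamiliar with the MLM objective, the sketch below illustrates the standard BERT-style masking scheme (15% of positions selected; of those, 80% replaced by a mask token, 10% by a random residue, 10% left unchanged). This is only an illustration of the general recipe; the exact masking parameters used for this checkpoint are an assumption, not stated in this README.

```python
import random

# The 20 standard amino acids, used for the "replace with random residue" branch.
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")

def mask_sequence(seq, mask_rate=0.15, mask_token="<mask>", rng=random):
    """BERT-style MLM masking sketch for a protein sequence.

    Returns (corrupted_tokens, labels); labels[i] is None where the
    position is excluded from the MLM loss, else the original residue
    the model must predict.
    """
    tokens = list(seq)
    labels = [None] * len(tokens)
    for i, aa in enumerate(tokens):
        if rng.random() < mask_rate:
            labels[i] = aa  # this position contributes to the loss
            r = rng.random()
            if r < 0.8:
                tokens[i] = mask_token           # 80%: mask token
            elif r < 0.9:
                tokens[i] = rng.choice(AMINO_ACIDS)  # 10%: random residue
            # else: 10%: keep the original token unchanged
    return tokens, labels

tokens, labels = mask_sequence("MKVLSPTRRAS", rng=random.Random(0))
print(tokens)
print(labels)
```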
### **Developed by:**
Zeynep Işık (MSc, Sabanci University)
- **Architecture:** ESM-1b (trained from scratch)
- **Pretraining Objective:** Masked Language Modeling (MLM)
- **Dataset:** Labeled phosphosites from [DARKIN](https://openreview.net/forum?id=a4x5tbYRYV) and their MSAs
- **Total Samples:** 702,468 (10% separated for validation)
- **Sequence Length:** ≤ 128 residues
- **Batch Size:** 64
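The sample count and batch size above imply the following split and per-epoch batch counts, assuming a simple 90/10 split with floor division and that a partial final batch is kept (both assumptions; the README does not specify how the split or batching was done):

```python
import math

total = 702_468          # total samples reported above
val = total // 10        # assumed: 10% held out, floor division
train = total - val
train_batches = math.ceil(train / 64)  # assumed: partial last batch kept

print(f"validation: {val}, training: {train}, batches/epoch: {train_batches}")
```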