Upload README.md with huggingface_hub

README.md CHANGED

@@ -58,7 +58,7 @@ logits, emb = model(inputs)
 
 ### Training Data
 
-- **Pretraining corpus:** Our initial model chrono-gpt-v1-19991231 is pretrained on
+- **Pretraining corpus:** Our initial model chrono-gpt-v1-19991231 is pretrained on 21 billion tokens of pre-2000, diverse, high-quality, open-source text data, ensuring no leakage of post-2000 data.
 - **Incremental updates:** Yearly updates from 2000 to 2024 with an additional 65 billion tokens of timestamped text.
 
 ### Training Procedure