rdiehlmartinez committed on
Commit
8aa4460
·
verified ·
1 Parent(s): 2e0d2c9

Removing redundant text from README

Files changed (1)
  1. README.md +2 -10
README.md CHANGED
@@ -22,7 +22,7 @@ For full documentation and code, visit our two main repositories:

  This HuggingFace organization hosts our pre-trained models and datasets, while the GitHub repository provides the code to train and analyze your own model suites from scratch.

- > Pro Tip 🚀:
+ > Pro Tip 🚀 :
  > To learn more about these libraries and explore detailed tutorials, visit our official website [**picolm.io**](https://www.picolm.io) and get fully acquainted with the Pico ecosystem.

  ---
@@ -37,7 +37,7 @@ Our complete suite of models from 10M to 500M parameters trained with Pico:
  - [**pico-decoder-medium**](https://huggingface.co/pico-lm/pico-decoder-medium) (100M parameters)
  - [**pico-decoder-large**](https://huggingface.co/pico-lm/pico-decoder-large) (500M parameters)

- > 🚧 **Coming Soon!** **pico-decoder-xl** (1B parameters). Watch this space or star our [GitHub repository](https://github.com/rdiehlmartinez/pico) for updates!
+ > 🚧 **Coming Soon!** **pico-decoder-xl** (1B parameters). Watch this space or star our [GitHub repository](https://github.com/pico-lm) for updates!

  All models are trained for 50,000 steps on the [**pretokenized-dolma**](https://huggingface.co/datasets/pico-lm/pretokenized-dolma) dataset. They all see the same training data at each training step, use the same optimization process, and share the same model architecture; the only difference between models is the size of their hidden dimension.
@@ -61,14 +61,6 @@ We visualize the learning process in our **[Wandb](https://wandb.ai/pico-lm/pico
  | **Precision** | Mixed precision training |
  | **Vocabulary Size** | 50,280 |

-
- In each model repository, we version control checkpoints every 1000 steps that contain:
- - Weights and optimizer states (HuggingFace and Lightning Fabric-compatible versions)
- - Model activations and gradients
- - The batch of training data observed at the given training step
-
-
-
  ### **2. Datasets**
  1. **[pretokenized-dolma](https://huggingface.co/datasets/pico-lm/pretokenized-dolma)**
     - 420B tokens of pre-processed, tokenized, and shuffled text extracted from the **[DOLMA](https://allenai.org/dolma)** corpus
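The README sections touched here point readers at the pico-decoder checkpoints and the pretokenized-dolma dataset on the Hub. As a minimal sketch (not the documented loading path), pulling both down with the standard `transformers` and `datasets` APIs might look like the following; the choice of `AutoModelForCausalLM`, the `trust_remote_code` flag, the `train` split name, and streaming mode are assumptions rather than anything stated in the diff:

```python
# Sketch: load one member of the pico-decoder suite and stream the corpus.
# Assumptions: the checkpoints are exported in a transformers-compatible
# format (possibly a custom architecture, hence trust_remote_code), and the
# 420B-token dataset is consumed in streaming mode rather than downloaded.
from transformers import AutoModelForCausalLM
from datasets import load_dataset

model = AutoModelForCausalLM.from_pretrained(
    "pico-lm/pico-decoder-tiny",  # 10M-parameter model; other suite members load the same way
    trust_remote_code=True,       # assumption: custom decoder implementation on the Hub
)

# Stream instead of materializing the full dataset locally; split name assumed.
stream = load_dataset("pico-lm/pretokenized-dolma", split="train", streaming=True)
example = next(iter(stream))
print(example.keys())  # inspect the pretokenized fields (e.g. token ids)
```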
 
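The lines removed in the final hunk note that each model repository also version-controls checkpoints every 1000 steps, including weights, optimizer state, activations, gradients, and the batch of training data seen at that step. If those checkpoints are published as Hub revisions, fetching a specific step might look like the sketch below; the `step-1000` revision name is hypothetical and should be checked against the repository's actual branches or tags:

```python
# Hypothetical sketch: fetch an intermediate training checkpoint by revision.
# The revision name "step-1000" is an assumption, not a documented branch;
# list the repo's refs first to find the real naming scheme.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

refs = list_repo_refs("pico-lm/pico-decoder-tiny")
print([branch.name for branch in refs.branches])  # discover available checkpoint revisions

checkpoint = AutoModelForCausalLM.from_pretrained(
    "pico-lm/pico-decoder-tiny",
    revision="step-1000",    # hypothetical: replace with a real branch/tag from the list above
    trust_remote_code=True,  # assumption, as above
)
```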