Update README.md
README.md CHANGED
@@ -24,7 +24,7 @@ It was trained on **160B tokens** using a mix of 40% FineWeb-Edu and 60% from Fi
 
 ### Intended use
 
-This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English.
+This model was trained on English math data and is not instruction-tuned, making it intended for text completion in English. It is part of the FineMath ablation models we trained for FineMath (https://huggingface.co/HuggingFaceTB/finemath-ablation-4plus-160B), and is not necessarily the best possible outcome achievable with the given dataset.
 
 ### Generation
 
@@ -43,21 +43,6 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-## Intermediate checkpoints
-
-We are releasing intermediate checkpoints for this model at intervals of every 10000 training steps (10B tokens) in separate branches. The naming convention is `10B`.
-
-You can load a specific model revision with `transformers` using the argument `revision`:
-```python
-model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/FineMath-Llama-3B", revision="10B")
-```
-You can access all the revisions for the models via the following code:
-```python
-from huggingface_hub import list_repo_refs
-out = list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")
-print([b.name for b in out.branches])
-```
-
 ## Training
 ### Model
 - **Architecture**: Llama3
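Note on the section removed above: the two snippets it contained are meant to be used together. Below is a minimal sketch combining them, assuming the intermediate-checkpoint branches it describes (e.g. `10B`) are still published on the Hub; any other branch names are assumptions, not taken from the diff.

```python
# Sketch: list the checkpoint branches of the repo, then load one of them.
# Assumes the intermediate-checkpoint branches from the removed section
# (e.g. "10B") still exist on the Hub.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

refs = list_repo_refs("HuggingFaceTB/FineMath-Llama-3B")
print([branch.name for branch in refs.branches])  # e.g. ["main", "10B", ...]

# Load a specific intermediate checkpoint via the `revision` argument.
model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/FineMath-Llama-3B", revision="10B"
)
```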
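The diff shows only the last two lines of the README's Generation section. For context, here is a minimal sketch of the kind of text-completion snippet those lines close out; the checkpoint id is this repo's, but the prompt, device handling, and `max_new_tokens` value are illustrative assumptions, not the README's exact code.

```python
# Sketch of a text-completion call for the non-instruction-tuned model.
# The prompt and generation settings are illustrative, not taken from the README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/FineMath-Llama-3B"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("To solve 3x + 5 = 20, first", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```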