Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ library_name: transformers
 # T5-french-base Model
 
 ## Model Overview
-The T5-French-Base model is a T5 model trained solely on French data from the RedPajama 2 dataset.
+The T5-French-Base model is a ~250M params only T5 model trained solely on French data from the RedPajama 2 dataset.
 This model was trained for 85,000 steps and was only pre-trained without any supervised training.
 Therefore, this model has to be fine-tuned before it is useable on a downstream task.
 It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
@@ -33,7 +33,7 @@ It may be used as a starting point for fine-tuning on tasks such as:
 ## Limitations
 The T5-French-Base model may not be suitable for user-facing, or production applications.
 It is mainly meant for researchers only.
-The training budget was really limited (85k steps only, for a final loss of ~1.1).
+The training budget was really limited (85k steps only, ~250M params only, for a final loss of ~1.1).
 The model is a base model that hasn't been fine-tuned yet. As such, it does NOT follow instructions.
 Additionally, the model was trained solely on French data and won't work for tasks that require cross-lingual understanding or multilingual capabilities.
 