Update README.md
README.md
CHANGED
@@ -1,3 +1,53 @@
|
|
1 |
-
---
|
2 |
-
license: cc-by-4.0
|
3 |
-
|
1 |
+
---
|
2 |
+
license: cc-by-4.0
|
3 |
+
datasets:
|
4 |
+
- togethercomputer/RedPajama-Data-V2
|
5 |
+
language:
|
6 |
+
- fr
|
7 |
+
library_name: transformers
|
8 |
+
---
|
9 |
+
|
10 |
+
# T5-French-Base Model
|
11 |
+
|
12 |
+
## Model Overview
|
13 |
+
The T5-French-Base model is a T5 model trained solely on French data from the RedPajama 2 dataset.
|
14 |
+
This model was pre-trained for 85,000 steps and received no supervised training.
|
15 |
+
Therefore, it must be fine-tuned before it can be used on a downstream task.
|
16 |
+
It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
|
17 |
+
Because the training compute budget was very limited, the model is mainly useful for research.
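
To make "pre-trained only" concrete, here is a minimal sketch of what a denoising training pair looks like, assuming the standard T5 span-corruption objective. This card does not state the exact objective used, and the helper `span_corrupt` and the sample sentence are illustrative only. `<extra_id_N>` is the sentinel token format used by T5 tokenizers.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, length) span with a sentinel token.

    Returns the corrupted input and the target that lists the dropped spans.
    Assumes the spans do not overlap.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, length) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])   # keep text before the span
        inputs.append(sentinel)               # mask the span in the input
        targets.append(sentinel)              # announce the span in the target
        targets.extend(tokens[start:start + length])
        cursor = start + length
    inputs.extend(tokens[cursor:])            # keep text after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return " ".join(inputs), " ".join(targets)


# Illustrative sentence, not taken from the training data.
words = "Le chat dort sur le canapé du salon".split()
source, target = span_corrupt(words, [(1, 2), (5, 1)])
print(source)  # Le <extra_id_0> sur le <extra_id_1> du salon
print(target)  # <extra_id_0> chat dort <extra_id_1> canapé <extra_id_2>
```

A model trained only on pairs like these learns to fill in missing spans, which is why it needs task-specific fine-tuning before it can summarize or answer questions.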
|
18 |
+
|
19 |
+
## Model Details
|
20 |
+
- Model Architecture: T5 Base, version 1.1 (GEGLU activation in feed-forward hidden layer, rather than ReLU)
|
21 |
+
- Training Dataset: RedPajama 2 dataset (French-only)
|
22 |
+
- Training Steps: 85,000
|
23 |
+
- Tokenizer: T5 Tokenizer
|
24 |
+
|
25 |
+
## Intended Use
|
26 |
+
The T5-French-Base model is intended for research use only, as a pre-trained checkpoint for further fine-tuning on specific French-language tasks.
|
27 |
+
It may be used as a starting point for fine-tuning on tasks such as:
|
28 |
+
- French text generation
|
29 |
+
- French question answering
|
30 |
+
- French language understanding
|
31 |
+
- French text summarization
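
As a starting point for any of the tasks above, the following is a minimal sketch of computing a fine-tuning loss on one batch with the `transformers` library. It rests on several assumptions: the repository id is taken from the citation URL, the checkpoint is assumed to load through the `Auto*` classes with PyTorch installed, and the `résumer : ` prefix and the helper names are hypothetical choices, not something the model was trained with.

```python
MODEL_ID = "guillaumephd/t5-french-base"  # repo id assumed from the citation URL


def make_pairs(records, prefix="résumer : "):
    """Turn (document, summary) tuples into prefixed source and target strings.

    The prefix is a hypothetical task marker chosen for this sketch.
    """
    sources = [prefix + doc.strip() for doc, _ in records]
    targets = [summary.strip() for _, summary in records]
    return sources, targets


def training_loss(sources, targets, model_id=MODEL_ID):
    """Download the checkpoint and return the seq2seq loss for one batch."""
    # Imported lazily so make_pairs can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    batch = tokenizer(
        sources,
        text_target=targets,   # tokenized targets are stored under "labels"
        padding=True,
        truncation=True,
        return_tensors="pt",
    )
    # Exclude padding positions from the loss.
    batch["labels"][batch["labels"] == tokenizer.pad_token_id] = -100
    return model(**batch).loss
```

Calling `training_loss(*make_pairs(records))` inside an optimizer loop, followed by `loss.backward()`, gives the basic fine-tuning step. A real run would also need batching, evaluation, and a learning-rate schedule.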
|
32 |
+
|
33 |
+
## Limitations
|
34 |
+
The T5-French-Base model may not be suitable for user-facing or production applications.
|
35 |
+
It is intended mainly for researchers.
|
36 |
+
The training budget was very limited: only 85k steps, for a final loss of ~1.1.
|
37 |
+
The model is a base model that hasn't been fine-tuned yet. As such, it does NOT follow instructions.
|
38 |
+
Additionally, the model was trained solely on French data and won't work for tasks that require cross-lingual understanding or multilingual capabilities.
|
39 |
+
|
40 |
+
## Ethical Considerations
|
41 |
+
The T5-French-Base model was trained on publicly available data, and no specific biases or ethical concerns have been identified in it so far.
|
42 |
+
However, researchers should be aware of potential biases in the RedPajama 2 training data and should carefully evaluate the model's outputs for any unintended consequences.
|
43 |
+
|
44 |
+
## Citation
|
45 |
+
If you use the T5-French-Base model in your work, please cite the original Google T5 model, as well as the following:
|
46 |
+
```
|
47 |
+
@article{guillaumeT5french,
|
48 |
+
title={T5-French-Base model: A T5 model trained on french data only},
|
49 |
+
author={guillaumephd},
|
50 |
+
url={https://huggingface.co/guillaumephd/t5-french-base},
|
51 |
+
year={2024}
|
52 |
+
}
|
53 |
+
```
|