guillaumephd committed on
Commit
0a5a0f5
·
verified ·
1 Parent(s): 5cfa792

Update README.md

Files changed (1)
  1. README.md +53 -3
README.md CHANGED
@@ -1,3 +1,53 @@
1
- ---
2
- license: cc-by-4.0
3
- ---
1
+ ---
2
+ license: cc-by-4.0
3
+ datasets:
4
+ - togethercomputer/RedPajama-Data-V2
5
+ language:
6
+ - fr
7
+ library_name: transformers
8
+ ---
9
+
10
+ # T5-french-base Model
11
+
12
+ ## Model Overview
13
+ The T5-French-Base model is a T5 model trained solely on French data from the RedPajama 2 dataset.
14
+ This model was pre-trained for 85,000 steps, without any supervised training.
15
+ Therefore, it must be fine-tuned before it is usable on a downstream task.
16
+ It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
17
+ Since the training compute budget was very limited, the model is mainly useful for research.
18
+
19
+ ## Model Details
20
+ - Model Architecture: T5 Base, version 1.1 (GEGLU activation in feed-forward hidden layer, rather than ReLU)
21
+ - Training Dataset: RedPajama 2 dataset (French-only)
22
+ - Training Steps: 85,000
23
+ - Tokenizer: T5 Tokenizer
24
+
25
+ ## Intended Use
26
+ The T5-French-Base model is intended for research use only, as a pre-trained model for further fine-tuning on specific French language tasks.
27
+ It may be used as a starting point for fine-tuning on tasks such as:
28
+ - French text generation
29
+ - French question answering
30
+ - French language understanding
31
+ - French text summarization
32
+
33
+ ## Limitations
34
+ The T5-French-Base model may not be suitable for user-facing or production applications.
35
+ It is mainly meant for researchers.
36
+ The training budget was very limited (only 85k steps, for a final loss of ~1.1).
37
+ The model is a base model that hasn't been fine-tuned yet. As such, it does NOT follow instructions.
38
+ Additionally, the model was trained solely on French data and is not suited to tasks that require cross-lingual understanding or multilingual capabilities.
39
+
40
+ ## Ethical Considerations
41
+ The T5-French-Base model was trained on publicly available data, and its author is not aware of any specific biases or ethical concerns.
42
+ However, researchers should be aware of potential biases in the RedPajama 2 training data and should carefully evaluate the model's outputs for any unintended consequences.
43
+
44
+ ## Citation
45
+ If you use the T5-French-Base model in your work, please cite the original Google T5 model, as well as the following:
46
+ ```
47
+ @article{guillaumeT5french,
48
+ title={T5-French-Base model: A T5 model trained on french data only},
49
+ author={guillaumephd},
50
+ url={https://huggingface.co/guillaumephd/t5-french-base},
51
+ year={2024}
52
+ }
53
+ ```
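
The Model Details entry in the README above notes that T5 version 1.1 uses a GEGLU activation in the feed-forward hidden layer rather than ReLU. The following is a minimal sketch of that difference in plain Python, not the model's actual implementation. The function names (`gelu_tanh`, `matvec`, `relu_ffn`, `geglu_ffn`) and the toy weights are hypothetical, and the tanh approximation of GELU is an assumption made for illustration.

```python
import math


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed variant, for illustration)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def relu_ffn(x, w_in, w_out):
    """Original-style feed-forward block: one projection, then ReLU."""
    hidden = [max(0.0, h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)


def geglu_ffn(x, w_gate, w_lin, w_out):
    """GEGLU feed-forward block: a GELU-activated projection gates
    a second, purely linear projection, element by element."""
    gate = [gelu_tanh(h) for h in matvec(w_gate, x)]
    linear = matvec(w_lin, x)
    hidden = [g * l for g, l in zip(gate, linear)]
    return matvec(w_out, hidden)


if __name__ == "__main__":
    # Toy weights chosen only to make the computation easy to follow.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    sum_rows = [[1.0, 1.0]]
    x = [1.0, -2.0]
    print("ReLU FFN :", relu_ffn(x, identity, sum_rows))
    print("GEGLU FFN:", geglu_ffn(x, identity, identity, sum_rows))
```

The GEGLU block needs two input projections (`w_gate` and `w_lin`) where the ReLU block needs one, so the hidden layer carries an extra weight matrix.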