guillaumephd/t5-french-base

text2text-generation

text-generation-inference


guillaumephd committed on May 17, 2024

Commit 0a5a0f5 (verified) · 1 Parent(s): 5cfa792

Update README.md

Files changed (1)

README.md +53 -3


@@ -1,3 +1,53 @@
----
-license: cc-by-4.0
----

+---
+license: cc-by-4.0
+datasets:
+- togethercomputer/RedPajama-Data-V2
+language:
+- fr
+library_name: transformers
+---
+# T5-french-base Model
+## Model Overview
+The T5-French-Base model is a T5 model trained solely on French data from the RedPajama 2 dataset.
+The model was pre-trained for 85,000 steps and received no supervised training.
+It therefore has to be fine-tuned before it can be used on a downstream task.
+It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
+Since the training compute budget was very limited, the model is mainly useful for research.
+## Model Details
+- Model Architecture: T5 Base, version 1.1 (GEGLU activation in feed-forward hidden layer, rather than ReLU)
+- Training Dataset: RedPajama 2 dataset (French-only)
+- Training Steps: 85,000
+- Tokenizer: T5 Tokenizer
+## Intended Use
+The T5-French-Base model is intended for research use only, as a pre-trained model to be fine-tuned on specific French language tasks.
+It may be used as a starting point for fine-tuning on tasks such as:
+- French text generation
+- French question answering
+- French language understanding
+- French text summarization
+## Limitations
+The T5-French-Base model may not be suitable for user-facing or production applications.
+It is meant mainly for researchers.
+The training budget was very limited (only 85k steps, for a final loss of ~1.1).
+The model is a base model that has not been fine-tuned. As such, it does NOT follow instructions.
+Additionally, the model was trained solely on French data and is not expected to work on tasks that require cross-lingual understanding or multilingual capabilities.
+## Ethical Considerations
+The T5-French-Base model was trained on publicly available data, and no specific biases or ethical concerns have been identified in it so far.
+However, researchers should be aware of potential biases in the RedPajama 2 training data and should carefully evaluate the model's outputs for any unintended consequences.
+## Citation
+If you use the T5-French-Base model in your work, please cite the original Google T5 model, as well as the following:
+```
+@article{guillaumeT5french,
+  title={T5-French-Base model: A T5 model trained on french data only},
+  author={guillaumephd},
+  url={https://huggingface.co/guillaumephd/t5-french-base},
+  year={2024}
+}
+```
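
The Limitations section of the new README notes that the checkpoint is pre-trained only and does not follow instructions. As a rough illustration of what that means in practice, the sketch below builds a sentinel-masked input of the kind used by the standard T5 span-corruption objective and passes it to the model. This rests on assumptions the README does not confirm: that the checkpoint was pre-trained with that objective, and that its tokenizer defines `<extra_id_N>` sentinel tokens. The helper names `mask_spans` and `fill_spans` are hypothetical and exist only for this example; the model id is taken from the citation URL.

```python
MODEL_ID = "guillaumephd/t5-french-base"  # repo id from the citation URL in the README


def mask_spans(text, spans):
    """Replace word spans with T5-style sentinel tokens.

    `spans` is a list of non-overlapping (start, end) word indices, end exclusive.
    Returns (masked_input, target) in the usual span-corruption layout.
    Hypothetical helper, for illustration only.
    """
    words = text.split()
    masked, target, cursor = [], [], 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        masked.extend(words[cursor:start])  # keep the words before the span
        masked.append(sentinel)             # the span becomes one sentinel
        target.append(sentinel)             # the target repeats the sentinel...
        target.extend(words[start:end])     # ...followed by the hidden words
        cursor = end
    masked.extend(words[cursor:])
    target.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return " ".join(masked), " ".join(target)


def fill_spans(masked_text, model_id=MODEL_ID, max_new_tokens=20):
    """Ask the pre-trained model to fill the masked spans.

    Assumption: the checkpoint loads with T5ForConditionalGeneration and its
    tokenizer knows the <extra_id_N> sentinels. Downloads the weights on first use.
    """
    # Imported lazily so that mask_spans works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(masked_text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep special tokens so the sentinels remain visible in the output.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)


if __name__ == "__main__":
    masked, target = mask_spans("Le chat dort sur le canapé", [(1, 2), (4, 6)])
    print(masked)
    print(target)
    # Uncomment to query the model (requires transformers, torch, sentencepiece):
    # print(fill_spans(masked))
```

A plain instruction such as "Résume ce texte : ..." is not expected to produce a useful answer from this checkpoint; for that, the model would first need fine-tuning on (input, target) pairs for the task in question, as the README states.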