---
library_name: transformers
license: apache-2.0
datasets:
- bigscience-data/roots_zh-tw_wikipedia
- bigscience-data/roots_en_wikipedia
language:
- zh
---
# Model Card for Chinese-OpenELM-270M
Fine-tuned from [apple/OpenELM-270M](https://huggingface.co/apple/OpenELM-270M):
* Extended the tokenizer with ~30K Chinese vocabulary tokens trained on [bigscience-data/roots_zh-tw_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_zh-tw_wikipedia).
* Continually pre-trained on a mix of [bigscience-data/roots_zh-tw_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_zh-tw_wikipedia) and [bigscience-data/roots_en_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_en_wikipedia).
* Evaluation perplexity (ppl) = 1.6644828403646825, measured on a 3% split of the training data held out as the evaluation set.
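
The evaluation setup above can be illustrated with a minimal sketch. It is not the script used to train this model: the helper names `split_train_eval` and `perplexity`, the shuffling, and the seed are all hypothetical, and it assumes perplexity is the exponential of the mean per-token negative log-likelihood (natural log), which is the common convention but is not confirmed by this card.

```python
import math
import random


def split_train_eval(examples, eval_fraction=0.03, seed=0):
    """Shuffle the examples and hold out a fraction of them for evaluation.

    Hypothetical helper: the card only states that 3% of the training data
    was used as the evaluation set, not how the split was drawn.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token), using natural log."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Toy usage: 1000 placeholder documents, 3% held out for evaluation.
train_set, eval_set = split_train_eval(range(1000))
print(len(train_set), len(eval_set))

# Toy per-token losses; real values would come from the model's cross-entropy.
print(perplexity([0.4, 0.5, 0.6]))
```

Under that assumed convention, the reported perplexity would correspond to a mean evaluation loss of roughly 0.51 nats per token, since `math.log(1.6644828403646825)` is about 0.51.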