# TiLamb-7B (Tibetan Large Language Model Base)
**TiLamb-7B** is a base model for Tibetan large language modeling. Built on LLaMA2-7B, Meta's large model released for commercial use, it was incrementally pre-trained on 26.43 GB of Tibetan corpora using the LoRA method. The vocabulary was extended from the original 32,000 entries to 61,221 by adding Tibetan tokens, and the embedding and lm_head of the original LLaMA2-7B were given mean-expansion initialization for the new entries. For more information, please visit the [TiLamb-7B GitHub page](https://github.com/NLP-Learning/TiLamb).
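
The vocabulary-expansion step above can be pictured with a short sketch. This is a minimal illustration, not the released training code: the extended-tokenizer path is hypothetical, and it uses the standard Transformers embedding-resize API under those assumptions.

```python
# Minimal sketch of mean-expansion initialization for newly added vocabulary rows.
# Assumes an extended tokenizer saved locally; the path below is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("./tibetan-extended-tokenizer")

old_size = model.get_input_embeddings().weight.size(0)  # 32,000 for LLaMA2-7B
model.resize_token_embeddings(len(tokenizer))           # e.g. grow to 61,221 rows

with torch.no_grad():
    emb = model.get_input_embeddings().weight    # input embedding matrix
    head = model.get_output_embeddings().weight  # lm_head weight matrix
    # Initialize every new row with the mean of the original rows.
    emb[old_size:] = emb[:old_size].mean(dim=0, keepdim=True)
    head[old_size:] = head[:old_size].mean(dim=0, keepdim=True)
```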
**Important Notes**:
- TiLamb-7B is a base model that has not undergone supervised fine-tuning and **does not have conversational capabilities** (see the usage sketch below).
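
Because it is a raw base model, prompting it for free-form text continuation is the sensible way to try it. A minimal sketch with Transformers follows; the Hub repo id is an assumption, as this README does not state it.

```python
# Minimal usage sketch: plain text continuation with a base (non-chat) model.
# The repo id below is an assumption, not confirmed by this README.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("YoLo2000/TiLamb-7B")
model = AutoModelForCausalLM.from_pretrained("YoLo2000/TiLamb-7B")

# Give a Tibetan prefix and let the model continue it; do not expect dialogue.
inputs = tokenizer("བོད་ཀྱི་རིག་གནས་", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```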