Update README.md
README.md
CHANGED
@@ -3,14 +3,16 @@ license: llama3.1
 language:
 - el
 - en
-pipeline_tag:
+pipeline_tag: text-generation
 library_name: transformers
+tags:
+- text-generation-inference
 ---
 
 # Llama-Krikri-8B: A large foundation Language Model for the Greek language
 
-Following the release of [Meltemi-7B](https://huggingface.co/ilsp/Meltemi-7B-v1) on the 26th of March 2024 we are happy to welcome Krikri to the family of ILSP open Greek LLMs.
-Krikri is built on top of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts. We present Llama-Krikri-8B-Base, as well as an instruct version [Llama-Krikri-8B-Instruct](https://huggingface.co/ilsp/Llama-Krikri-8B-instruct).
+Following the release of [Meltemi-7B](https://huggingface.co/ilsp/Meltemi-7B-v1) on the 26th of March 2024, we are happy to welcome Krikri to the family of ILSP open Greek LLMs.
+Krikri is built on top of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts. We present Llama-Krikri-8B-Base, as well as an instruct version, [Llama-Krikri-8B-Instruct](https://huggingface.co/ilsp/Llama-Krikri-8B-instruct).
 
 ![image]
 
@@ -18,7 +20,7 @@ Krikri is built on top of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama
 
 - Vocabulary extension of the Llama-3.1 tokenizer with Greek tokens
 - 128k context length
-- We extend the pretraining of Llama-3.1-8B with added proficiency for the Greek language, by utilizing a large corpus
+- We extend the pretraining of Llama-3.1-8B with added proficiency for the Greek language by utilizing a large training corpus.
   * This corpus includes 55 billion monolingual Greek tokens, constructed from publicly available resources.
   * Additionally, to mitigate catastrophic forgetting and ensure that the model has bilingual capabilities, we use additional sub-corpora with monolingual English texts (23.3 billion tokens) and Greek-English parallel data (5.26 billion tokens).
   * The training corpus also contains 6 billion math and code tokens.
@@ -33,7 +35,7 @@ Krikri is built on top of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama
 | Math/Code | 5,951,964,497 | 6.6% |
 | **Total** | **89,653,165,085** | **100%** |
 
-Chosen subsets of the 89.65 billion corpus were upsampled resulting in a size of 110 billion tokens
+Chosen subsets of the 89.65 billion token corpus were upsampled, resulting in a size of **110 billion tokens**.
 
 
 # How to use