|
--- |
|
base_model: tokyotech-llm/Llama-3.1-Swallow-70B-v0.1 |
|
base_model_relation: quantized |
|
license: llama3.1 |
|
language: |
|
- ja |
|
- en |
|
--- |
|
|
|
**[2.2bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/2.2bpw)** (significant quality loss; intended only for testing on 24 GB VRAM)
|
**[4.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/4.0bpw)** |
|
**[6.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/6.0bpw)** |
|
**[8.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/8.0bpw)** |
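
Each quantization above lives on its own branch of this repository, so a specific bpw variant can be fetched by passing the branch name as the revision. A minimal sketch, assuming the `huggingface_hub` Python package is installed; the `local_dir` path is an arbitrary example:

```python
# Sketch: download a single exl2 branch with huggingface_hub.
# The revision values match the branch names listed above;
# local_dir is an example destination, not a required path.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2",
    revision="4.0bpw",                         # pick a bpw branch from the list above
    local_dir="Llama-3.1-Swallow-70B-4.0bpw",  # example local directory
)
```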
|
|
|
# Llama-3.1-Swallow-70B-v0.1-exl2 |
|
- Model creator: [tokyotech-llm](https://huggingface.co/tokyotech-llm) |
|
- Original model: [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1) |
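
exl2 quantizations are loaded with the [ExLlamaV2](https://github.com/turboderp/exllamav2) runtime. The sketch below is illustrative rather than canonical: the exact class and method names vary between exllamav2 releases, and the model path is the example directory from the download sketch above.

```python
# Illustrative sketch: load an exl2 quant and generate with exllamav2.
# API details may differ across exllamav2 versions.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config("Llama-3.1-Swallow-70B-4.0bpw")  # example path
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate the KV cache during loading
model.load_autosplit(cache)               # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

print(generator.generate_simple("東京工業大学の主なキャンパスは、", settings, num_tokens=128))
```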
|
|
|
### License |
|
|
|
[META LLAMA 3.1 COMMUNITY LICENSE](https://www.llama.com/llama3_1/license/) |
|
|
|
### Citations |
|
|
|
```tex
@inproceedings{Fujii:COLM2024,
  title={Continual Pre-Training for Cross-Lingual LLM Adaptation:
         Enhancing Japanese Language Capabilities},
  author={Kazuki Fujii and Taishi Nakamura and Mengsay Loem and Hiroki
          Iida and Masanari Ohi and Kakeru Hattori and Hirai Shota and Sakae
          Mizuki and Rio Yokota and Naoaki Okazaki},
  booktitle={Proceedings of the First Conference on Language Modeling},
  series={COLM},
  pages={(to appear)},
  year={2024},
  month=oct,
  address={University of Pennsylvania, USA},
}

@inproceedings{Okazaki:COLM2024,
  title={Building a Large Japanese Web Corpus for Large Language Models},
  author={Naoaki Okazaki and Kakeru Hattori and Hirai Shota and Hiroki
          Iida and Masanari Ohi and Kazuki Fujii and Taishi Nakamura and Mengsay
          Loem and Rio Yokota and Sakae Mizuki},
  booktitle={Proceedings of the First Conference on Language Modeling},
  series={COLM},
  pages={(to appear)},
  year={2024},
  month=oct,
  address={University of Pennsylvania, USA},
}

@misc{dubey2024llama3herdmodels,
  title={The Llama 3 Herd of Models},
  author={Abhimanyu Dubey and Abhinav Jauhri and Abhinav Pandey and Abhishek Kadian and Ahmad Al-Dahle and Aiesha Letman and Akhil Mathur and Alan Schelten and Amy Yang and Angela Fan et al.},
  year={2024},
  eprint={2407.21783},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2407.21783},
}
```