---
base_model: tokyotech-llm/Llama-3.1-Swallow-70B-v0.1
base_model_relation: quantized
license: llama3.1
language:
- ja
- en
---
- **[2.2bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/2.2bpw)** (high quality loss; intended only for testing on 24 GB VRAM)
- **[4.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/4.0bpw)**
- **[6.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/6.0bpw)**
- **[8.0bpw](https://huggingface.co/rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2/tree/8.0bpw)**

Each quantization level is published on its own branch of this repository; a download sketch is shown below the model links.
# Llama-3.1-Swallow-70B-v0.1-exl2
- Model creator: [tokyotech-llm](https://huggingface.co/tokyotech-llm)
- Original model: [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1)
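
Because each bpw variant lives on a separate branch, a single quantization can be fetched by passing the branch name as the `revision`. This is a minimal sketch, not an official snippet from the model author; the choice of the 4.0bpw branch and the local directory name are only examples.

```python
# Minimal sketch: download one exl2 quantization branch of this repo.
# Assumes `pip install huggingface_hub`; the chosen branch ("4.0bpw") and
# local directory are illustrative, not prescribed by this model card.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="rioshiina/Llama-3.1-Swallow-70B-v0.1-exl2",
    revision="4.0bpw",  # branch name = bits per weight
    local_dir="Llama-3.1-Swallow-70B-v0.1-exl2-4.0bpw",
)
print(f"Model files downloaded to: {local_dir}")
```

The downloaded folder can then be loaded with an ExLlamaV2-compatible runtime (for example ExLlamaV2 itself, TabbyAPI, or text-generation-webui).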
### License
[META LLAMA 3.1 COMMUNITY LICENSE](https://www.llama.com/llama3_1/license/)
### Citations
```tex
@inproceedings{Fujii:COLM2024,
  title={Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities},
  author={Kazuki Fujii and Taishi Nakamura and Mengsay Loem and Hiroki Iida and Masanari Ohi and Kakeru Hattori and Hirai Shota and Sakae Mizuki and Rio Yokota and Naoaki Okazaki},
  booktitle={Proceedings of the First Conference on Language Modeling},
  series={COLM},
  pages={(to appear)},
  year={2024},
  month=oct,
  address={University of Pennsylvania, USA},
}

@inproceedings{Okazaki:COLM2024,
  title={Building a Large Japanese Web Corpus for Large Language Models},
  author={Naoaki Okazaki and Kakeru Hattori and Hirai Shota and Hiroki Iida and Masanari Ohi and Kazuki Fujii and Taishi Nakamura and Mengsay Loem and Rio Yokota and Sakae Mizuki},
  booktitle={Proceedings of the First Conference on Language Modeling},
  series={COLM},
  pages={(to appear)},
  year={2024},
  month=oct,
  address={University of Pennsylvania, USA},
}

@misc{dubey2024llama3herdmodels,
  title={The Llama 3 Herd of Models},
  author={Abhimanyu Dubey and Abhinav Jauhri and Abhinav Pandey and Abhishek Kadian and Ahmad Al-Dahle and Aiesha Letman and Akhil Mathur and Alan Schelten and Amy Yang and Angela Fan et al.},
  year={2024},
  eprint={2407.21783},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2407.21783},
}
```