---
license: cc-by-sa-4.0
datasets:
- izumi-lab/llm-japanese-dataset
language:
- ja
tags:
- llama
- causal-lm
---
This repo contains a low-rank adapter for LLaMA-7B, fine-tuned on the [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset) dataset.

This version of the weights was trained with the following hyperparameters (a configuration sketch follows the list):

- Epochs: 5
- Batch size: 128
- Cutoff length: 256
- Learning rate: 3e-4
- LoRA _r_: 4
- LoRA target modules: q_proj, v_proj
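
The hyperparameters above correspond to a standard PEFT LoRA setup. The snippet below is a minimal sketch of how such a configuration could be written, not the exact training script for this adapter; values the list does not specify (for example `lora_alpha` and `lora_dropout`) are assumptions.

```python
import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

# Sketch only: reconstructs a LoRA configuration from the hyperparameters listed above.
base_model = "decapoda-research/llama-7b-hf"

lora_config = LoraConfig(
    r=4,                                  # LoRA r (from the list above)
    target_modules=["q_proj", "v_proj"],  # LoRA target modules (from the list above)
    lora_alpha=16,                        # assumption: not stated in this card
    lora_dropout=0.05,                    # assumption: not stated in this card
    bias="none",
    task_type="CAUSAL_LM",
)

# Wrap the base model so that only the LoRA parameters are trainable
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# The remaining hyperparameters (5 epochs, batch size 128, cutoff length 256,
# learning rate 3e-4) belong to the training loop / Trainer arguments.
```

The published adapter can be loaded for inference as follows: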
```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = "decapoda-research/llama-7b-hf"
# Please note that the special license of decapoda-research/llama-7b-hf applies.

# Load the base LLaMA-7B weights and tokenizer in float16
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Attach the Japanese LoRA adapter on top of the base model
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/llama-7b-japanese-lora-v0",
    torch_dtype=torch.float16,
)
```
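
Once loaded, the model generates text like any causal LM in Transformers. The snippet below is a minimal sketch continuing from the loading code above; the plain prompt and the sampling settings are illustrative assumptions and may not match the instruction format used during training.

```python
# Minimal generation sketch (continues from the loading code above).
# Note: float16 inference is intended for GPU; on CPU, torch.float32 may be needed.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

prompt = "日本で一番高い山は何ですか？"  # "What is the highest mountain in Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```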
For the latest information, please visit [llm.msuzuki.me](https://llm.msuzuki.me).

## Details

- Japanese Paper: [https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/422](https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/422)
- English Paper:
- GitHub: [https://github.com/retarfi/jallm](https://github.com/retarfi/jallm)
- Website: [llm.msuzuki.me](https://llm.msuzuki.me)

Citation:

If you have any inquiries, such as joint research, data provision, or various types of support, please email [email protected].