Tags: Japanese, llama, causal-lm

This repo contains a low-rank adapter (LoRA) for LLaMA-7b, fine-tuned on the llm-japanese-dataset.

This version of the weights was trained with the following hyperparameters (a corresponding PEFT configuration sketch follows the list):

  • Epochs: 5
  • Batch size: 128
  • Cutoff length: 256
  • Learning rate: 3e-4
  • Lora r: 4
  • Lora target modules: q_proj, v_proj
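
For reference, these settings map onto a PEFT LoraConfig roughly as sketched below. This is not the actual training script: lora_alpha and lora_dropout are not stated in this card, so the values shown for them are illustrative assumptions only.

from peft import LoraConfig

lora_config = LoraConfig(
    r=4,                                  # Lora r
    target_modules=["q_proj", "v_proj"],  # Lora target modules
    lora_alpha=16,                        # assumed; not specified in this card
    lora_dropout=0.05,                    # assumed; not specified in this card
    bias="none",
    task_type="CAUSAL_LM",
)

# Epochs (5), batch size (128), cutoff length (256), and learning rate (3e-4)
# are training-loop settings rather than LoraConfig fields.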
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = "decapoda-research/llama-7b-hf"
# Note that the special license of decapoda-research/llama-7b-hf applies.
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Attach the LoRA adapter weights from this repo to the base model.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/llama-7b-japanese-lora-v0-5ep",
    torch_dtype=torch.float16,
)
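
After the adapter is attached, text can be generated with the standard transformers generate API. The snippet below is a minimal sketch: the example prompt, device placement, and sampling settings are illustrative assumptions, not values prescribed by this card.

# Minimal generation sketch; float16 weights generally require a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

prompt = "日本の首都はどこですか？"  # "Where is the capital of Japan?" (example prompt)
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))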

For the latest information, please visit llm.msuzuki.me.

Details

Citation:

@preprint{Suzuki2023-llmj,
  title={{日本語インストラクションデータを用いた対話可能な日本語大規模言語モデルのLoRAチューニング}},
  author={鈴木 雅弘 and 平野 正徳 and 坂地 泰紀},
  doi={10.51094/jxiv.422},
  archivePrefix={Jxiv},
  year={2023}
}

For inquiries such as joint research, data provision, or other forms of support, please email [email protected].

