|
---
license: llama2
datasets:
- snow_simplified_japanese_corpus
- khalidalt/tydiqa-goldp
- csebuetnlp/xlsum
language:
- ja
---
|
# About |
|
This model is Lightblue's QLoRA fine-tune of OpenOrca's [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) model on Japanese fine-tuning datasets.
|
|
|
We trained on equal samples of the following three datasets: |
|
* [SNOW](https://huggingface.co/datasets/snow_simplified_japanese_corpus) |
|
* [TyDiQA (Ja)](https://huggingface.co/datasets/khalidalt/tydiqa-goldp) |
|
* [XLSUM (Ja)](https://huggingface.co/datasets/csebuetnlp/xlsum) |
|
|
|
which resulted in a dataset of 13,167 samples in total.
|
|
|
These three datasets were chosen as they represent three distinct fine-tuning tasks (text simplification, question answering, and text summarization, respectively), which we hypothesize can help to improve the language model's suitability for dealing with Japanese data.
|
The initial letters of these three datasets (SNOW, TyDiQA, XLSUM) make up the model name: STX.
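For reference, a data mixture along these lines can be assembled with the Hugging Face `datasets` library. The sketch below is a minimal illustration, not the exact preprocessing used for training: the config names for the Japanese subsets and the per-source sample count are assumptions, and the task-specific prompt formatting is omitted.

```python
from datasets import load_dataset

# Hub IDs from the list above; the config names for the Japanese subsets are
# assumptions - check each dataset card before running.
sources = {
    "snow":   ("snow_simplified_japanese_corpus", "snow_t15"),
    "tydiqa": ("khalidalt/tydiqa-goldp", "japanese"),
    "xlsum":  ("csebuetnlp/xlsum", "japanese"),
}

raw = {name: load_dataset(repo, config, split="train")
       for name, (repo, config) in sources.items()}

# Take the same number of examples from each source (13,167 / 3 = 4,389 each).
n_per_source = 13167 // 3
subsets = {name: ds.shuffle(seed=42).select(range(min(n_per_source, len(ds))))
           for name, ds in raw.items()}

for name, ds in subsets.items():
    print(name, len(ds))
```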
|
|
|
# How to use |
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_dir = "lightblue/openorca_stx"  # this model's Hub repo, or a local path

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, device_map='auto',
)
|
|
|
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer) |
|
|
|
def do_closed_qa(context, question):
    # Closed QA prompt: the source article followed by the question.
    return context + "\n\n" + question
|
|
|
test_article = """ใใขใใใใฎใฌใใผใใชใผใซใใชใผใใปใใคใฑใซ้ธๆใใใใใฌใคใถใผใฉใขใณRGใใใๆฌไบบๅ
ฌ่ชใฎใขใใใใงใใใใฉใฐใใผใใกใณใฎๅๅฟใซๅฐใ้ฉใใใใใงใใ |
|
ใใชใผใใปใใคใฑใซ้ธๆใฎใขใใใใฏใไฝใใใฃใใใงใใใ |
|
ใ2015ๅนดใฎใฏใผใซใใซใใ๏ผWๆฏ๏ผใคใณใฐใฉใณใๅคงไผใงๆฅๆฌใๅใขใใชใซใๅใใๆฌกใฎๆฅใใไบฌ้ฝใงใฎ็ช็ตใญใฑใงใใใๅฝๆใฏใใขใใใซใฎๅ
ฑๅๅตๆฅญ่
ในใใฃใผใใปใธใงใใบใฎใขใใใใฐใใใงใใใใไธ็ทใซใญใฑใใใฆใใใธใฃใณใฐใซใใฑใใใใใใชใผใใปใใคใฑใซใซไผผใฆใพใใใใธใงใใบใฎใพใพใใใใใใใใชใใงใใ๏ผใใจ่จใใใใฎใๅงใพใใงใใ |
|
ใใใ ใใฟใใช็ฅ่ญใใชใใใฉใฐใใผใทใงใใใๆขใใๆฅๆฌไปฃ่กจใฎใฆใใใผใ ใๅฃฒใๅใใ ใฃใใฎใงใ่ตคใฃใฝใใฆใใใผใ ใจใใใใใฎ็ญใใณใใฏใใฆใใจใใใใSNSใงใใชใผใใปใใคใฑใซใงใใใฃใฆใใฃใฑใๅ็ใ่ผใใพใใใ |
|
ใใใใจใใใใ่ฆใใชใผใใใๆฌไบบใใDM๏ผใใคใฌใฏใใกใใปใผใธ๏ผใๅฑใใพใใใใใขใใใใใใใจใใใใใพใใใใใขใใใใใใใชใใๅใฎใฆใใใผใ ใ้ใใพใใฎใง็ใฆใใ ใใใใจใWๆฏๅพใซใฆใใใผใ 2็ใจใใณใใใฝใใฏในใชใฉใใปใใพใซ้ใฃใฆใใฆใใใพใใใไป็ใฆใใใฎใใใใงใใ |
|
ใใใพใงใๆฐใ
ใฎ่ๅไบบใใขใใใใใฆใใใใพใใใใชใผใ้ธๆใฎใใฟใฎๅ้ฟใฏใใใใงใใใใ |
|
ใใๅใฏใฉใฐใใผ็ต้จใใชใใงใใใใฉใฐใใผใๅ
จ็ถ็ฅใใชใใฃใใใฉใใใฃใฑใๆฌไบบใใใฆใใใผใ ใ้ ใใฆใใฃใฆใใโๅฐ็ฑ ๏ผใใใใ๏ผโใฟใใใชใฎใใใฃใฆใใใใใคใฏใชใผใใใๆฌไบบใซ่ชใใใใฆใใใจใไธ็ฎ็ฝฎใใใฆใใใฎใใชใจๆใใพใใ |
|
ใใใใฃใฆใใใใจใฏใ่ฆใ็ฎใๆฌไบบใซๅฏใใฆใฏใณใใผใ ใฃใฆ่จใใ ใใชใใงใใใฉใญใใใใงใใใใใใชใผใใใใ ใใจ่จใฃใฆใใใใพใใ |
|
ใใใชใผใใใใจๅฎ้ใซไผใใใจใชใใฆใ็ฐกๅใซใฏใงใใชใใใใชใใงใใใใงใใใชใผใใใใฎใพใญใใใฆใใRGใซใฏไผใใใใใฟใใใช๏ผ็ฌ๏ผใไฝใ ใใใชใๆๅใช็ฅ็คพใฎๆฏ็คพใฎใใใชๅญๅจใงใใใญใใใใใใใใใใจใใๆๅณใงใฏไปใฎใขใใใใจใฏใใใ้ใใพใใญใ |
|
""" |
|
|
|
test_question = "ใใชใผใใปใใคใฑใซใฏไฝใ้ใฃใฆใใพใใใ๏ผ" |
|
|
|
pipe(do_closed_qa(test_article, test_question), max_new_tokens=128, temperature=0)[0]["generated_text"]
|
# "ใฆใใใผใ 2็ใจใใณใใใฝใใฏในใชใฉ" |
|
``` |
|
|
|
|
|
# Training details |
|
|
|
This model was trained for 1,000 steps (roughly 1.2 epochs), with an evaluation every 50 steps. We then chose the checkpoint with the lowest validation loss from these evaluations.
|
We used the [qlora](https://github.com/artidoro/qlora) package from artidoro. |
|
We trained with the following hyperparameters: |
|
|
|
```
Per device evaluation batch size: 16
Per device train batch size: 8
LoRA rank (lora_r): 64
LoRA alpha (lora_alpha): 16
LoRA modules: all
Double quantization: Enabled
Quantization type: nf4
BF16: Enabled
Bits: 4
Warmup ratio: 0.03
Learning rate scheduler type: Constant
Gradient checkpointing: Enabled
Gradient accumulation steps: 2
Learning rate: 0.0002
Adam beta2: 0.999
Maximum gradient norm: 0.3
LoRA dropout: 0.05
Weight decay: 0.0
```
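For anyone re-creating the run with the Hugging Face `Trainer` instead of the qlora script, the settings above map roughly onto the following `peft`/`transformers` configuration. This is a minimal sketch under that assumption: the `target_modules` list and `output_dir` are illustrative and were not taken from the original training run.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 base weights with double quantization and bf16 compute (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "LoRA modules: all" puts adapters on every linear layer; the module names
# below are an assumption and depend on the base model's architecture.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./openorca_stx_qlora",   # illustrative
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    max_steps=1000,
    evaluation_strategy="steps",
    eval_steps=50,
    save_steps=50,
    load_best_model_at_end=True,          # keep the checkpoint with the lowest
    metric_for_best_model="eval_loss",    # validation loss, as described above
    greater_is_better=False,
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    max_grad_norm=0.3,
    weight_decay=0.0,
    adam_beta2=0.999,
    bf16=True,
)
```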
|
|
|
 |
|
|
|
 |