# Himeyuri v0.1 12B

## Base Model

This model is built upon Mistral-Nemo-Instruct-2407.
## Usage Notes

Low temperature recommended: As noted in the Mistral-Nemo-Instruct-2407 repository, a low temperature is recommended. I don't know exactly why, but my rough guess, based on my experience, is that Mistral Nemo tends to switch scenes drastically, and a low temperature can mitigate this, somewhat akin to the "slow pace" style in AI Novelist.
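To illustrate why a low temperature has this effect, here is a minimal, self-contained sketch (plain Python, toy logits of my own choosing, no model required) of the standard temperature-scaled softmax used during sampling: lowering the temperature concentrates probability mass on the most likely continuations, so rarer ones, such as an abrupt scene change, are sampled less often.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by T before the softmax; T < 1 sharpens the distribution,
    # T > 1 flattens it. This is the generic sampling formula, not anything
    # specific to this model.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical next-token scores

cool = softmax_with_temperature(logits, 0.35)  # low temperature, as used below
hot = softmax_with_temperature(logits, 1.0)    # default temperature

# At T=0.35 the top token takes most of the probability mass,
# so low-probability continuations are rarely drawn.
print(cool[0], hot[0])
```

With these toy logits the top token's probability rises from roughly 0.57 at T=1.0 to over 0.9 at T=0.35, which is the "slower pace" effect described above.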
## Description
This is an experimental model with significant room for improvement.
Since Japanese Mistral 7B-based models appear to have plateaued, I've been exploring other architectures such as LLaMA 3. At the moment I find Mistral-Nemo-Instruct-2407 the most promising base for Japanese open LLMs, so I built this model on it.
## Example
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Elizezen/Himeyuri-v0.1-12B")
model = AutoModelForCausalLM.from_pretrained(
    "Elizezen/Himeyuri-v0.1-12B",
    torch_dtype="auto",
)
model.eval()

if torch.cuda.is_available():
    model = model.to("cuda")

input_ids = tokenizer.encode(
    "年末のボーナスを受取って加奈江が社から帰ろうとしたときであった。",
    add_special_tokens=True,
    return_tensors="pt",
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=512,
    temperature=0.35,
    top_p=1,
    do_sample=True,
)

# Decode only the newly generated tokens, dropping the prompt.
out = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
print(out)

"""
output example:
後ろから肩を叩かれて振り向くと、そこには見知らぬ男が立っていた。どこかで見たことがあるような顔立ちをしている。男はにこやかに笑って加奈江に話しかけてきた。
「こんばんは、加奈江さんですよね？　ちょっとお話ししたいことがあるんですけど、いいですか？」
その男の言葉に加奈江は首を傾げながらも、とりあえず話を聞くことにした。周りには社員がまだ残っているので、大きな問題はないだろうと思ったようだ。
「何でしょう？　私に何か用ですか？」
加奈江がそう聞くと、男はにこやかな笑みを浮かべたまま、彼女の耳元で小声で囁いた。
「実は、あなたの旦那さんのことで少しお話ししたいことがあるんです。場所を変えて話を聞いていただけませんか？」
その言葉に加奈江は驚きの表情を浮かべた。夫のことは社内ではあまり話したくない。しかし、男の様子からは、夫の身に何かあったのではないかという不安がよぎった。
「分かりました。それじゃあ、近くの喫茶店に行きましょう」
加奈江がそう言うと、男は頷いて彼女の後についてきた。
二人は近くの喫茶店に入り、奥の席に腰を下ろした。
「それで、夫の身に何かあったんですか？」
加奈江は男の顔を見ながら、不安げに聞いた。男はコーヒーを一口飲むと、彼女に向かって話を始めた。
「実は、あなたの旦那さんは、ある秘密の組織に所属しているんです。その組織は、世界中に影響力を持つような大規模な組織で、様々な国家の要人や有力者たちが加入していると言われています。旦那さんはその一員で、かなり高い地位にいるようです」
その言葉に加奈江は目を丸くした。夫がそんな組織に所属しているだなんて、初耳である。
そんな馬鹿な。うちの人はただのサラリーマンでしょう。な
"""
```
## Intended Use
Primarily designed for novel generation. Not optimized for:
- Role-playing (RP) scenarios
- Instruction-based responses