This model series is part of active research. The accompanying research paper will be published between versions 0.5 and 1.0 under the name "Multilingual Babies: Analyzing Multilingual GPT Models with under one Billion Parameters".
The model is still in its testing phase. We don't recommend it for high-end production environments; it is intended only for story generation. We suspect this model may be slightly overfitted; this will be fixed in the coming version. The model was trained using LLaMA-Factory by Asadullah Hamzah at XeTute Technologies.
Phantasor V0.3
We introduce Phantasor V0.3, the continuation of Phantasor V0.2. It has been trained on top of V0.2 using a new dataset (more details below) alongside the previous datasets.
Licensed under MIT, so feel free to use it in your personal projects, both commercially and privately. Since this is V0.3, we're open to feedback to improve our project(s).
The chat template used is Alpaca. For correct usage, insert your prompt as the system prompt. The model can also be used without any template to continue a sequence of text.
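As a minimal sketch of the usage described above: the template string below is the commonly used Alpaca preamble, and the helper name is ours, not part of the released code.

```python
# Sketch of wrapping a story premise in the standard Alpaca template.
# The preamble wording is the widely used Alpaca default; adjust if
# your results suggest the model expects a different variant.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Insert the user's story premise into the Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt(
    "A young detective investigates a series of bizarre murders in a small town."
)
print(prompt)
```

The completed prompt is then passed to the model as-is; generation stops naturally or at your token limit.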
You can find the FP32 version here.
Dataset Details
As noted above, to create this model we took V0.2 and finetuned it on the datasets used before, but this time:
- we also used XeTute/SStory-Gen-EN_ZH
- we trained for 12.0 epochs with a learning rate of 1e-6, a technique said to stabilise a model
Example =)
"System-Prompt": "A young detective investigates a series of bizarre murders in a small town.",
"Response":
"In the heart of the sleepy community of Willowbrook, where crime scenes abound, lies a mysterious mystery that has captivated the town's residents for years. It's a tale of a man who has been accused of murder and a woman who has been accused of murder. But as the investigation unfolds, the detectives uncover a new twist that threatens to shake the very fabric of the town.
The case is set in the heart of Willowbrook, nestled amidst rolling hills and winding rivers. The townspeople are united by a common belief that the killer, known only as "The Killer," is the true culprit.
But as the investigation progresses, the tension between the two leads to a turning point. As the investigation unfolds, the townspeople witness the murder of a prominent journalist, who had no connection with the murder. The killer, initially hailed as a hero, becomes a hero when he realizes that the murderer is not the killer but rather a victim of a cover-up.
As the investigation unfolds, the townspeople have a sense of justice in their hearts. A young detective named Detective D.J. D.C. Chambers, known for his tireless pursuit of justice, finds himself entangled in a web of deceit and deceit.
As the investigation unfolds, Detective D.C. Chambers begins to unravel the intricate web of deceit and deception that has plagued Willowbrook for decades. He confronts the killer, who is determined to bring justice to his victims. But Detective D.C. Chambers also faces a formidable adversary, a man who knows every trick and trick that can be employed to silence him."
Training
This model was trained on all samples and tokens included in:
- Chamoda/atlas-storyteller-1000
- jaydenccc/AI_Storyteller_Dataset
- zxbsmk/webnovel_cn
- MatanP/emotion_mapped_story_dataset
- XeTute/SStory-Gen-EN_ZH
for exactly 12.0 epochs on all model parameters. Following is the loss curve, updated with each training step over all twelve epochs.
Instead of AdamW, which is often used for large GPTs, we used SGD. This enabled the model to generalize better, which can be seen when using the model on prompts outside the dataset.
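To make the comparison concrete, plain SGD applies the bare update rule theta ← theta − lr · grad, with none of AdamW's per-parameter moment estimates or decoupled weight decay. A minimal sketch (the parameter and gradient values are illustrative, not from the training run):

```python
def sgd_step(params, grads, lr=1e-6):
    """One plain SGD update: theta <- theta - lr * grad.

    Unlike AdamW, there are no running first/second moment estimates,
    so every parameter moves a small, uniform step along its gradient.
    """
    return [p - lr * g for p, g in zip(params, grads)]

# With the learning rate used here (1e-6), updates are deliberately tiny:
params = [0.5, -0.25]
grads = [100.0, -40.0]
print(sgd_step(params, grads))  # each parameter shifts by only lr * grad
```

The very small step size is what makes the 12-epoch schedule above a gentle, stabilising refinement rather than an aggressive retraining.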
Finished Model
- ~137M Parameters, all of which are trainable
- 1024 (1k) input tokens of context length, all of which were used
- A sub-1 loss on all samples (see Files => train_results.json)
This is very good performance for V0.3.
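Given the 1024-token window listed above, prompts longer than the context must be trimmed before generation. One common choice, sketched below (the helper is ours, not part of the released code), is to keep only the most recent tokens:

```python
MAX_CONTEXT = 1024  # Phantasor's context length, per the list above

def fit_to_context(token_ids, max_len=MAX_CONTEXT):
    """Keep only the most recent tokens when a prompt exceeds the window."""
    return token_ids if len(token_ids) <= max_len else token_ids[-max_len:]

print(len(fit_to_context(list(range(3000)))))  # 1024
```

Trimming from the left preserves the text closest to where generation continues, which usually matters most for story continuation.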
Our platforms
Socials
BlueSky | YouTube | HuggingFace 🤗 | Ko-Fi / Financially Support Us
Have a great day!
Model tree for XeTute/Phantasor_V0.3-137M-GGUF
Base model: openai-community/gpt2