ManshoorAI
Overview
This project fine-tunes GPT-2 to generate Persian neo-poetry inspired by the works of Sohrab Sepehri and Forough Farokhzad.
The model is a work in progress. I look forward to hearing your thoughts.
LSTM Model
I also trained a simple LSTM model on the same data; it is available on my GitHub page. You can compare the results to see the power of Transformers!
Model Details
- Base Model: GPT-2 (pretrained by OpenAI)
- Intermediate Model: HooshvareLab/gpt2-fa
- Dataset: Curated poems from Sohrab Sepehri and Forough Farokhzad
- Fine-Tuning: PEFT/LoRA
- Language: Persian (Farsi)
- Output: Generates poetry with free verse and metaphorical depth
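The Fine-Tuning entry above names PEFT/LoRA without giving a configuration. As a rough sketch of what LoRA saves, the snippet below counts trainable parameters for one adapted weight matrix and shows how an adapter could be attached with the `peft` library. The rank, alpha, dropout, and the `c_attn` target module are assumptions chosen for illustration, not values taken from this model's training run. The 768 x 2304 shape is the attention projection of GPT-2 small, assumed here to match HooshvareLab/gpt2-fa.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    LoRA freezes the d_in x d_out weight and trains two low-rank factors,
    A (d_in x rank) and B (rank x d_out), so only rank * (d_in + d_out)
    parameters are updated.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


def build_lora_model(base_name="HooshvareLab/gpt2-fa", rank=8):
    """Wrap the base model with a LoRA adapter (hypothetical settings)."""
    # Imported lazily so the arithmetic above runs without these packages.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_name)
    config = LoraConfig(
        r=rank,                     # assumed rank
        lora_alpha=2 * rank,        # assumed scaling factor
        lora_dropout=0.05,          # assumed dropout
        target_modules=["c_attn"],  # GPT-2's fused query/key/value projection
        fan_in_fan_out=True,        # GPT-2 stores this layer as Conv1D
        task_type="CAUSAL_LM",
    )
    return get_peft_model(base, config)


if __name__ == "__main__":
    full, lora = lora_param_counts(768, 2304, rank=8)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

Under these assumed settings, each adapted projection trains a small fraction of the weights that full fine-tuning would update, which is why LoRA suits a small curated poetry corpus.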
Installation & Usage
You can load the model using the Hugging Face transformers library:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from hazm import Normalizer

model_name = "rahiminia/manshoorai"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Build the pipeline and normalizer once and reuse them across calls
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
normalizer = Normalizer()

def generate_poetry(prompt, max_length=30):
    # Normalize the Persian prompt before generation
    prompt = normalizer.normalize(prompt)
    output = generator(prompt, max_length=max_length)
    # The pipeline returns a list with one dict per generated sequence
    return output[0]["generated_text"]

print(generate_poetry("شب آرام و خاموش"))
You can also use optimum with ONNX Runtime to load the ONNX model checkpoint:
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForCausalLM
model = ORTModelForCausalLM.from_pretrained("rahiminia/manshoorai", use_cache=False, use_io_binding=False)
tokenizer = AutoTokenizer.from_pretrained("rahiminia/manshoorai")
onnx = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = 'در این شب سیاه'
pred = onnx(prompt)
print(pred[0]['generated_text'])
Training Details
- Tokenizer: Byte Pair Encoding (BPE) tokenizer from HooshvareLab/gpt2-fa
- Training: Fine-tuned using PyTorch and the transformers library
- Hyperparameters: Adjusted learning rate and weight decay
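The training details above mention an adjusted learning rate and weight decay but do not list values. The sketch below shows one way such settings could be passed to the transformers `TrainingArguments` class. Every number here, along with the `manshoorai-lora` output directory and the `merged_settings` helper, is a placeholder for illustration and not the configuration used for this model.

```python
# Placeholder values for illustration only; the settings actually used
# for ManshoorAI are not documented in this card.
HYPERPARAMS = {
    "learning_rate": 2e-4,
    "weight_decay": 0.01,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
}


def merged_settings(**overrides):
    """Return the placeholder settings with any overrides applied."""
    return {**HYPERPARAMS, **overrides}


def make_training_args(output_dir="manshoorai-lora", **overrides):
    """Build TrainingArguments from the (hypothetical) settings above."""
    # Imported lazily so the settings can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **merged_settings(**overrides))
```

Calling `make_training_args(learning_rate=1e-4)` would override a single value while keeping the rest, which makes it easier to compare runs when tuning.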
Sample Outputs
Prompt: "باران که میبارد"
Generated Text:
- ManshoorAI
باران که میبارد من، به باغ راه یافته بودم من این دشت را دیدم که پر از درخت است و در آن برگ هایم هیچ گونه سبز نیست
- Base Model (GPT2-fa)
باران که میبارد با خود بگوید که دیگر چه شده بود؟ اگر آن جوان از پشت نردهها به پایین میرفت؛
Prompt: "در این شب سیاه"
Generated Text:
در این شب سیاه
چشمهای سیاه اتاقها
همه دیدههای من هستند
از هر پلک چه میبینم.
و هر چهره روشن دیگر
من را در سکوت خانه فرو برده
Limitations & Biases
- This is a work in progress, with many improvements yet to be made.
- The model may occasionally generate repetitive or incoherent lines.
- It does not strictly follow classical Persian poetry rules but leans towards free verse.
- Biases in the training dataset might influence stylistic preferences.
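The repetition noted above can often be reduced at decoding time. The sketch below passes sampling and repetition controls to the text-generation pipeline and includes a small helper that estimates how repetitive an output is. The specific values and the `repeated_ngram_ratio` helper are assumptions for illustration; they have not been tuned for this model.

```python
# Decoding settings are illustrative guesses, not tuned values.
GENERATION_KWARGS = {
    "do_sample": True,           # sample instead of greedy decoding
    "top_p": 0.92,               # nucleus sampling threshold
    "temperature": 0.9,
    "repetition_penalty": 1.2,   # down-weight tokens already generated
    "no_repeat_ngram_size": 3,   # never repeat the same 3-gram
    "max_new_tokens": 40,
}


def repeated_ngram_ratio(text, n=3):
    """Fraction of whitespace-token n-grams that duplicate an earlier one."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


def generate(prompt, model_name="rahiminia/manshoorai"):
    """Generate one poem using the settings above."""
    # Imported lazily so the helper above runs without transformers.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    return generator(prompt, **GENERATION_KWARGS)[0]["generated_text"]
```

A ratio near zero suggests little verbatim repetition; comparing it across a few settings gives a quick, if crude, signal before reading the poems by hand.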
Contributions & Feedback
If you use this model or have suggestions for improvement, feel free to open an issue or contribute via Hugging Face Spaces.
License
This model is released under the MIT License. Please ensure ethical use and proper attribution when sharing generated works.