Truncation: audio output limitation
Hi, I am trying to use this model to generate audio of about five sentences in Bengali and Marathi, but the generated audio ends abruptly. Is there a solution to this problem?
From what I have read about Parler-TTS, the maximum output length is 30 seconds. Is there a way to work around that limit?
https://github.com/huggingface/parler-tts/blob/main/training/README.md#3-training
Hi, Indic Parler-TTS can consistently generate sequences of up to 10-12 seconds. We are working on improving the model. For your particular use case, I suggest splitting the text into separate sentences and running a batched generate instead, as that will give the best quality while staying consistent with the prompt you have described.
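A minimal sketch of the splitting step suggested above, assuming the input uses ordinary sentence punctuation followed by whitespace. The function name `split_sentences` is hypothetical and not part of Parler-TTS. It treats the danda (।) and double danda (॥) as terminators alongside `.`, `!` and `?`. It will also split after abbreviations such as "Dr.", so check the output on your own text before batching.

```python
import re

# Sentence terminators: Latin ". ! ?" plus the danda (U+0964) and
# double danda (U+0965) used in Bengali and sometimes in Marathi.
# Assumption: a terminator is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?\u0964\u0965])\s+")


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, keeping each terminator with its sentence."""
    parts = _SENTENCE_END.split(text.strip())
    return [part.strip() for part in parts if part.strip()]


text = "আমি ভাত খাই। তুমি কী করো? ঠিক আছে."
sentences = split_sentences(text)
```

Each item in `sentences` can then be used as one entry of the batch, with the same voice description repeated for every entry so the speaker stays consistent.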
Hey @avneetsingh,
I have used a combination of chunking and batch generation to handle long texts.
PS: chunking introduces quality issues because context is lost between chunks.
I am now trying a tokenizer-based strategy, or using a text LLM to pick the sentence boundaries, instead of the current word-count-based chunking.
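One way the tokenizer-based idea could look, as a rough sketch rather than a tested recipe: pack whole sentences into chunks under a size budget, so that no chunk cuts a sentence in half. The names `pack_sentences`, `max_units` and `count_units` are hypothetical. The default counter just counts whitespace-separated words; a tokenizer-based counter such as `lambda s: len(tokenizer(s).input_ids)` could be passed in instead, assuming a Hugging Face tokenizer is loaded. The budget that maps to 10-12 seconds of audio is not known here and would have to be found by experiment.

```python
from typing import Callable, Iterable


def pack_sentences(
    sentences: Iterable[str],
    max_units: int,
    count_units: Callable[[str], int] = lambda s: len(s.split()),
) -> list[str]:
    """Greedily group whole sentences into chunks of at most max_units.

    A single sentence larger than the budget is kept as its own chunk
    instead of being cut in the middle.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_units = 0
    for sentence in sentences:
        units = count_units(sentence)
        # Flush the running chunk if this sentence would exceed the budget.
        if current and current_units + units > max_units:
            chunks.append(" ".join(current))
            current, current_units = [], 0
        current.append(sentence)
        current_units += units
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = pack_sentences(["a b c.", "d e.", "f g h i.", "j."], max_units=5)
```

Because sizes are summed per sentence, the count for a joined chunk can differ slightly from what the tokenizer reports for the whole chunk (special tokens, for example), so leave some margin in the budget.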