---
library_name: transformers
language:
- en
license: mit
base_model: microsoft/speecht5_tts
tags:
- generated_from_trainer
datasets:
- custom
model-index:
- name: 'SpeechT5 TTS Technical Train2'
  results: []
---

| **PAGE**                              | **LINK**                                                                                                            |
|---------------------------------------|---------------------------------------------------------------------------------------------------------------------|
| **MARATHI TTS GITHUB REPO**           | [MARATHI TTS REPO](https://github.com/dawarepranav/speechT5_marathi_finetuned-)                                     |
| **HUGGING FACE ENGLISH TECHNICAL TTS**| [HUGGING FACE TECHNICAL DATA](https://huggingface.co/pranavdaware/speecht5_tts_technical_train2)                    |
| **HUGGING FACE MARATHI TTS**          | [HUGGING FACE MARATHI TTS](https://huggingface.co/pranavdaware/speecht5_tts_marathi_train2)                         |
| **REPORT**                            | [REPORT](https://github.com/dawarepranav/speecht5_tts_english_technical_data/blob/main/A%20Technical%20Report.docx) |
# 🎀 SpeechT5 TTS Technical Train2

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts), trained on a custom dataset for *Text-to-Speech (TTS)* tasks.

🎯 *Key Metric:*
- *Loss* on the evaluation set: 0.3763

πŸ“’ *Listen to the generated sample:*

The text is: "Hello, few technical terms I used while fine-tuning are API and REST and CUDA and TTS."

<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/66f64964584cae45b5494560/JYJmDNPHnBRLuvqGTJQSu.wav"></audio>

---

## πŸ“ Model Description

*SpeechT5 TTS Technical Train2* is built on the *SpeechT5* architecture and fine-tuned for speech synthesis (TTS). The fine-tuning focused on improving the naturalness and clarity of the audio generated from text.

πŸ›  *Base Model*: [Microsoft SpeechT5](https://huggingface.co/microsoft/speecht5_tts)  
πŸ“š *Dataset*: Custom (specific details to be provided)
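
Below is a minimal inference sketch (not part of the original card) showing how the checkpoint can be loaded from the Hub. SpeechT5 needs a speaker embedding at inference time; the `Matthijs/cmu-arctic-xvectors` dataset and index `7306` follow the standard SpeechT5 examples and are assumptions here, not details specified by this card.

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the fine-tuned model and the matching processor from the Hub.
processor = SpeechT5Processor.from_pretrained("pranavdaware/speecht5_tts_technical_train2")
model = SpeechT5ForTextToSpeech.from_pretrained("pranavdaware/speecht5_tts_technical_train2")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# Speaker x-vector: taken from the CMU ARCTIC set used in the official
# SpeechT5 examples (an assumption; any 512-dim x-vector works).
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

# Synthesize the sample sentence from the card and save it as 16 kHz audio.
inputs = processor(
    text="Hello, few technical terms I used while fine-tuning are API and REST and CUDA and TTS.",
    return_tensors="pt",
)
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)
```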

---

## πŸ”§ Intended Uses & Limitations

### βœ… *Primary Use Cases:*
- *Text-to-Speech (TTS)* for technical interview texts.
- *Virtual assistants*.

### ⚠ *Limitations:*
- Best suited for English TTS tasks.
- Requires further fine-tuning on a larger dataset.

---

## πŸ“… Training Data

The model was fine-tuned on a *custom dataset* curated to improve TTS output. It contains varied text types, including technical terms, that help the model generate more natural speech.

### βš™ *Hyperparameters:*

The model was trained with the following hyperparameters; a configuration sketch follows the list:

- *Learning Rate*: 1e-05
- *Train Batch Size*: 16
- *Eval Batch Size*: 8
- *Seed*: 42
- *Gradient Accumulation Steps*: 2
- *Total Train Batch Size*: 32
- *Optimizer*: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- *LR Scheduler Type*: Linear
- *Warmup Steps*: 50
- *Training Steps*: 500
- *Mixed Precision Training*: Native AMP
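
For reference, here is a sketch of how these settings map onto `Seq2SeqTrainingArguments`; the `output_dir` and the steps-based evaluation cadence are assumptions inferred from the results table below, not settings stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts_technical_train2",  # illustrative output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # 16 x 2 = effective train batch size of 32
    seed=42,
    warmup_steps=50,
    max_steps=500,
    lr_scheduler_type="linear",
    fp16=True,                      # native AMP mixed precision
    eval_strategy="steps",          # assumption: evaluation every 100 steps,
    eval_steps=100,                 # consistent with the results table
)
```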

### πŸ“Š Training Results
| πŸ‹β€β™‚ Training Loss | πŸ•‘ Epoch | πŸ›€ Step | πŸ“‰ Validation Loss |
|:-------------------:|:-------:|:-------:|:-----------------:|
|        1.1921       | 100.0   | 100     |      0.4136       |
|        0.8435       | 200.0   | 200     |      0.3791       |
|        0.8294       | 300.0   | 300     |      0.3766       |
|        0.7959       | 400.0   | 400     |      0.3744       |
|        0.7918       | 500.0   | 500     |      0.3763       |


### πŸ“¦ Framework Versions

- *Transformers*: 4.46.0.dev0
- *PyTorch*: 2.4.1+cu121
- *Datasets*: 3.0.2
- *Tokenizers*: 0.20.1