JBZhang2342/speecht5_tts

en_accent,mozilla,t5,common_voice_1_0

Generated from Trainer


JBZhang2342 committed on Nov 16, 2023

Commit ce06adb · 1 Parent(s): 33693e2

Model save

Files changed (2)

README.md +15 -18
model.safetensors +1 -1

README.md CHANGED

@@ -1,26 +1,23 @@
 ---
-language:
-- en
 license: mit
 base_model: microsoft/speecht5_tts
 tags:
-- en_accent,mozilla,t5,common_voice_1_0
 - generated_from_trainer
 datasets:
-- mozilla-foundation/common_voice_1_0
 model-index:
-- name: SpeechT5 TTS English Accented
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# SpeechT5 TTS English Accented
-This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the Common Voice dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5189
 ## Model description
@@ -55,16 +52,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step  | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 0.5395        | 16.0  | 1000  | 0.4726          |
-| 0.4727        | 32.0  | 2000  | 0.4819          |
-| 0.4513        | 48.0  | 3000  | 0.4871          |
-| 0.4526        | 64.0  | 4000  | 0.5006          |
-| 0.4474        | 80.0  | 5000  | 0.5022          |
-| 0.4147        | 96.0  | 6000  | 0.5039          |
-| 0.423         | 112.0 | 7000  | 0.5154          |
-| 0.4271        | 128.0 | 8000  | 0.5217          |
-| 0.4232        | 144.0 | 9000  | 0.5198          |
-| 0.4044        | 160.0 | 10000 | 0.5189          |
 ### Framework versions

 ---
 license: mit
 base_model: microsoft/speecht5_tts
 tags:
 - generated_from_trainer
 datasets:
+- common_voice_1_0
 model-index:
+- name: speecht5_tts
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# speecht5_tts
+This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the common_voice_1_0 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4530
 ## Model description
 | Training Loss | Epoch | Step  | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
+| 0.5188        | 3.9   | 1000  | 0.4639          |
+| 0.4705        | 7.8   | 2000  | 0.4534          |
+| 0.4559        | 11.7  | 3000  | 0.4463          |
+| 0.4763        | 15.59 | 4000  | 0.4477          |
+| 0.4549        | 19.49 | 5000  | 0.4474          |
+| 0.4684        | 23.39 | 6000  | 0.4580          |
+| 0.4471        | 27.29 | 7000  | 0.4468          |
+| 0.4666        | 31.19 | 8000  | 0.4472          |
+| 0.4338        | 35.09 | 9000  | 0.4515          |
+| 0.4452        | 38.99 | 10000 | 0.4530          |
 ### Framework versions
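
In both versions of the card, the reported evaluation loss (0.5189 before, 0.4530 after) equals the validation loss of the last table row at step 10000, which is not the lowest value in either table. A minimal sketch of that comparison, with the rows copied by hand from the diff above; the helper names `best_checkpoint` and `final_checkpoint` are hypothetical and not part of the Trainer output:

```python
# (step, validation_loss) pairs copied from the removed ("-") rows of the diff.
old_run = [
    (1000, 0.4726), (2000, 0.4819), (3000, 0.4871), (4000, 0.5006),
    (5000, 0.5022), (6000, 0.5039), (7000, 0.5154), (8000, 0.5217),
    (9000, 0.5198), (10000, 0.5189),
]

# (step, validation_loss) pairs copied from the added ("+") rows of the diff.
new_run = [
    (1000, 0.4639), (2000, 0.4534), (3000, 0.4463), (4000, 0.4477),
    (5000, 0.4474), (6000, 0.4580), (7000, 0.4468), (8000, 0.4472),
    (9000, 0.4515), (10000, 0.4530),
]


def best_checkpoint(rows):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def final_checkpoint(rows):
    """Return the (step, loss) pair logged at the highest step."""
    return max(rows, key=lambda row: row[0])


for name, rows in (("old", old_run), ("new", new_run)):
    best_step, best_loss = best_checkpoint(rows)
    last_step, last_loss = final_checkpoint(rows)
    print(f"{name}: best {best_loss:.4f} at step {best_step}, "
          f"final {last_loss:.4f} at step {last_step}")
```

Whether the saved `model.safetensors` corresponds to the final step or to an earlier checkpoint is not stated in the commit, so this only compares the logged numbers.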

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a8e9555320d1a780c3a88913046babb3624a1dacc5da0381f628454ab2639267
 size 577789320

 version https://git-lfs.github.com/spec/v1
+oid sha256:f6178b49908599563af03eb83d77b093a96815a4e5fa29d849ae2667107e0c2f
 size 577789320
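
The `model.safetensors` entry is a Git LFS pointer, so the diff shows only the `oid` changing while `size` stays the same. The sketch below parses the two pointers shown above and compares their fields; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a function from any Git LFS or Hugging Face library:

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:a8e9555320d1a780c3a88913046babb3624a1dacc5da0381f628454ab2639267
size 577789320
""")

new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:f6178b49908599563af03eb83d77b093a96815a4e5fa29d849ae2667107e0c2f
size 577789320
""")

# Same byte size, different content hash: the weights changed in place.
print("size unchanged:", old_pointer["size"] == new_pointer["size"])
print("digest changed:", old_pointer["digest"] != new_pointer["digest"])
```

An identical size with a different digest is consistent with the same architecture being saved with updated weights.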