speecht5_finetuned_voxpopuli_lt

This model is a fine-tuned version of microsoft/speecht5_tts on the Lithuanian (lt) subset of the VoxPopuli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5588

Model description

SpeechT5 is Microsoft's unified encoder-decoder speech model; this checkpoint fine-tunes its text-to-speech variant, microsoft/speecht5_tts, on Lithuanian speech from VoxPopuli. Given input text, it generates a log-mel spectrogram, which is typically converted to a waveform with the microsoft/speecht5_hifigan vocoder.

Intended uses & limitations

The model is intended for Lithuanian text-to-speech synthesis. Limitations are not documented; as with other SpeechT5 fine-tunes, output quality depends on the speaker embedding supplied at inference time. A usage sketch follows.
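
A minimal inference sketch, assuming the standard transformers SpeechT5 TTS pipeline; the speaker-embedding source (Matthijs/cmu-arctic-xvectors), the embedding index, and the sample sentence are illustrative assumptions, not part of this card:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

processor = SpeechT5Processor.from_pretrained("hungphan111/speecht5_finetuned_voxpopuli_lt")
model = SpeechT5ForTextToSpeech.from_pretrained("hungphan111/speecht5_finetuned_voxpopuli_lt")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 conditions generation on a 512-dim x-vector speaker embedding.
# This dataset and index are a common choice in SpeechT5 examples (an assumption here).
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

# "Labas rytas" is Lithuanian for "good morning" (illustrative input).
inputs = processor(text="Labas rytas", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)  # SpeechT5 outputs 16 kHz audio
```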

Training and evaluation data

Training and evaluation used the Lithuanian (lt) configuration of the facebook/voxpopuli dataset; the exact train/eval split is not documented in this card.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 1000
  • mixed_precision_training: Native AMP

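For reference, a minimal sketch of how these settings map onto transformers.Seq2SeqTrainingArguments; the output_dir name is an illustrative assumption, and the card does not specify the rest of the Seq2SeqTrainer setup:

```python
# Minimal sketch: the listed hyperparameters expressed as
# transformers.Seq2SeqTrainingArguments (output_dir is an assumed name).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_finetuned_voxpopuli_lt",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,  # total train batch size: 4 * 8 = 32
    optim="adamw_torch",            # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=1000,
    fp16=True,                      # native AMP mixed-precision training
)
```
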
Training results

| Training Loss | Epoch    | Step | Validation Loss |
|:-------------:|:--------:|:----:|:---------------:|
| 0.7332        | 24.8649  | 100  | 0.6417          |
| 0.658         | 49.8649  | 200  | 0.6113          |
| 0.603         | 74.8649  | 300  | 0.5863          |
| 0.5626        | 99.8649  | 400  | 0.5698          |
| 0.5389        | 124.8649 | 500  | 0.5631          |
| 0.5248        | 149.8649 | 600  | 0.5639          |
| 0.5105        | 174.8649 | 700  | 0.5564          |
| 0.5083        | 199.8649 | 800  | 0.5587          |
| 0.5038        | 224.8649 | 900  | 0.5544          |
| 0.5029        | 249.8649 | 1000 | 0.5588          |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0