
Rigel Pretrained Model

Base and fine-tuned models

Dataset

  • Size: 1,921 hours of speech and vocals in total.
  • Languages:
    • Arabic: ~70 hours
    • Chinese (Mandarin): ~70 hours
    • English: ~800 hours
    • French: ~42 hours
    • German: ~35 hours
    • Hindi: ~30 hours
    • Indonesian: ~53 hours
    • Japanese: ~140 hours
    • Korean: ~80 hours
    • Portuguese: ~40 hours
    • Russian: ~188 hours
    • Singing (all languages): ~190 hours
    • Spanish: ~200 hours
    • Tagalog: ~30 hours
    • Common language: amount unknown

Sampling Frequency

  • 32 kHz (done; see the resampling sketch after this list)
  • 40 kHz (retraining in progress)
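
Since the released checkpoints operate at 32 kHz, audio recorded at other rates should be resampled before use. The snippet below is a minimal sketch using torchaudio; torchaudio itself and the file names are assumptions, not stated dependencies of this model.

```python
# Minimal sketch: resample audio to the model's 32 kHz rate.
# torchaudio and the file names are assumptions, not part of this card.
import torchaudio
import torchaudio.functional as F

TARGET_SR = 32_000  # current checkpoints are trained on 32 kHz audio

waveform, sr = torchaudio.load("input.wav")  # tensor of shape (channels, samples)
if sr != TARGET_SR:
    waveform = F.resample(waveform, orig_freq=sr, new_freq=TARGET_SR)
torchaudio.save("input_32k.wav", waveform, TARGET_SR)
```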

Models

Base Model

  • Data: 1,921 hours of low-to-mid quality data in total.
  • Steps: 3,890,220
  • Batch size: 40
  • Precision: FP32
  • Sampling rate: 32 kHz

Fine-Tuned Model

  • Data: 102 hours of high-quality data.
  • Steps: 2,854,856
  • Batch size: 20
  • Precision: FP32
  • Sampling rate: 32 kHz (see the checkpoint-loading sketch after this list)
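
The base and fine-tuned checkpoints above differ only in training data, step count, and batch size. The sketch below inspects a checkpoint with plain PyTorch; the file name rigel_base.pth and the assumption that the weights ship as a standard PyTorch .pth checkpoint are illustrative, not confirmed by this card.

```python
# Minimal sketch: open a checkpoint and list what it contains, assuming a
# standard PyTorch .pth file. The file name and keys are hypothetical.
import torch

ckpt = torch.load("rigel_base.pth", map_location="cpu")

# Training checkpoints often bundle weights with bookkeeping such as a step
# counter or optimizer state; printing the top-level keys shows what is stored.
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))
else:
    print(type(ckpt))
```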

Hardware Used

  • CPU: AMD EPYC 9754
  • RAM: 256 GB
  • GPUs:
    • 1 x H100
    • 4 x L40s
    • 1 x RTX 4080
    • 1 x RTX 4070 Ti

Expected Release Date

  • July 22nd
