About

This model was created to support experiments on evaluating phonetic transcription with the Buckeye corpus as part of https://github.com/ginic/multipa. It is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus. For details about specific model parameters, see the model's config.json or the training scripts in the scripts/buckeye_experiments folder of the GitHub repository.

Experiment Details

The total amount of training data is still held at half the full training set (4000 examples). The gender split is varied between 30% and 70% female, with examples drawn from all individuals. Five models are trained for each gender split, using the same model parameters but different data seeds.
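A minimal sketch of how a fixed-size training subset with a target gender split could be drawn. This is an illustration only, not the sampling code from the multipa repository: the function name sample_gender_split, the dict keys "speaker" and "gender", and the "f"/"m" labels are all assumptions.

```python
import random


def sample_gender_split(examples, total=4000, percent_female=0.3, seed=0):
    """Draw `total` examples so that `percent_female` of them are female.

    `examples` is a list of dicts with hypothetical keys "speaker" and
    "gender" ("f" or "m"). Every speaker stays in the candidate pool, so
    examples may come from any individual; simple random sampling does not
    guarantee that each speaker ends up in the result.
    """
    rng = random.Random(seed)  # the data seed controls which examples are drawn
    n_female = round(total * percent_female)
    quotas = {"f": n_female, "m": total - n_female}

    chosen = []
    for gender, quota in quotas.items():
        pool = [ex for ex in examples if ex["gender"] == gender]
        if len(pool) < quota:
            raise ValueError(f"need {quota} '{gender}' examples, found {len(pool)}")
        chosen.extend(rng.sample(pool, quota))  # sample without replacement

    rng.shuffle(chosen)  # avoid a gender-sorted training order
    return chosen
```

Calling the function with the same seed reproduces the same subset, while a different seed yields a different draw with the same gender proportions.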

Goals:

  • Determine how the gender split in the training data affects performance

Params to vary:

  • percent female (--percent_female) [0.3, 0.7]
  • training seed (--train_seed)
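The parameter grid above amounts to 2 gender splits × 5 seeds = 10 training runs. The sketch below enumerates them. The flag names --percent_female and --train_seed come from this card; the seed values 0 to 4 and the helper name build_runs are assumptions made for illustration, and the actual training command lives in the GitHub repository's scripts.

```python
from itertools import product

PERCENT_FEMALE = [0.3, 0.7]  # values listed in this card
N_SEEDS = 5                  # five models per gender split


def build_runs(train_seeds=range(N_SEEDS)):
    """Return one configuration per (percent_female, train_seed) pair.

    The default seed values 0..4 are hypothetical placeholders.
    """
    runs = []
    for pct, seed in product(PERCENT_FEMALE, train_seeds):
        runs.append(
            {
                "percent_female": pct,
                "train_seed": seed,
                # Arguments to append to the training command
                "args": ["--percent_female", str(pct), "--train_seed", str(seed)],
            }
        )
    return runs


if __name__ == "__main__":
    for run in build_runs():
        print(" ".join(run["args"]))
```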