ginic/gender_split_70_female_4_wav2vec2-large-xlsr-53-buckeye-ipa

	---
	license: mit
	language:
	- en
	pipeline_tag: automatic-speech-recognition
	---
	# About
	This model was created to support experiments for evaluating phonetic transcription
	with the Buckeye corpus as part of https://github.com/ginic/multipa.
	This is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus.
	For details about specific model parameters, see the config.json in this model repository or
	the training scripts in the scripts/buckeye_experiments folder of the GitHub repository.

	# Experiment Details
	These experiments keep the total amount of training data fixed at half the full training data (4000 examples) and vary the gender split between 30% and 70% female, while still drawing examples from all individuals. Five models are trained for each gender split, using the same model parameters but different data seeds.

	Goals:
	- Determine how the gender split in the training data affects performance

	Params to vary:
	- percent female (--percent_female) [0.3, 0.7]
	- training seed (--train_seed)
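
The sampling scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the multipa training scripts: the function name `sample_gender_split`, the `speaker` and `gender` keys, and the synthetic pool are all hypothetical. It also samples from the pooled examples of all speakers without forcing every speaker to appear in the subset, which may differ from how the actual experiments draw examples.

```python
import random
from collections import Counter


def sample_gender_split(examples, percent_female, total=4000, seed=0):
    """Draw `total` examples with a fixed female/male proportion.

    `examples` is a list of dicts with hypothetical keys "speaker" and
    "gender" ("f" or "m"). The seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    n_female = round(total * percent_female)
    n_male = total - n_female

    females = [e for e in examples if e["gender"] == "f"]
    males = [e for e in examples if e["gender"] == "m"]
    if len(females) < n_female or len(males) < n_male:
        raise ValueError("not enough examples for the requested split")

    # Sample without replacement from each group, then mix the two groups.
    subset = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(subset)
    return subset


# Synthetic pool (an assumption for illustration only):
# 10 female and 10 male speakers with 400 utterances each.
pool = [
    {"speaker": f"{gender}{s:02d}", "gender": gender, "utt": u}
    for gender in ("f", "m")
    for s in range(10)
    for u in range(400)
]

subset = sample_gender_split(pool, percent_female=0.7, total=4000, seed=4)
print(Counter(e["gender"] for e in subset))
```

With `percent_female=0.7` and `total=4000`, the subset contains 2800 female and 1200 male examples. Repeating the call with five different seeds for each of the two splits would give the ten training subsets the experiment describes.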