AMI
This is an ASR recipe for the AMI corpus. AMI provides recordings from each speaker's headset and lapel microphones, as well as from 2 microphone arrays with 8 channels each. We pool the data in the following 4 ways and train a single model on the pooled data:
(i) Individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) Single distant microphone (SDM)
(iv) GSS-enhanced array microphones
Speed perturbation and MUSAN noise augmentation are additionally performed on the pooled data.
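To make the two augmentations concrete, here is a minimal, dependency-free sketch of what they do to a waveform. It is not the recipe's implementation; the function names `perturb_speed` and `mix_at_snr`, the linear-interpolation resampler, and the example factor and SNR values are all assumptions chosen for illustration.

```python
import math
import random


def perturb_speed(samples, factor):
    """Resample so the audio plays `factor` times faster (illustrative only).

    Uses plain linear interpolation; real toolkits use proper resampling filters.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(round(len(samples) / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor              # fractional position in the input
        lo = min(int(pos), last)      # left neighbour, clamped to the last sample
        hi = min(lo + 1, last)        # right neighbour, clamped to the last sample
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def power(samples):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled to approximately `snr_db` dB SNR.

    The scale is computed from the power of the whole noise clip, and the
    noise is looped if it is shorter than the speech.
    """
    p_speech, p_noise = power(speech), power(noise)
    if p_noise == 0:
        return list(speech)
    # Choose scale so that p_speech / (scale^2 * p_noise) == 10^(snr_db / 10).
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * noise[i % len(noise)] for i, s in enumerate(speech)]


if __name__ == "__main__":
    random.seed(0)
    speech = [math.sin(0.01 * i) for i in range(1000)]
    noise = [random.uniform(-1.0, 1.0) for _ in range(1000)]
    # Assumed example values: speed factors 0.9 / 1.1 and a 10 dB SNR.
    pooled = [speech, perturb_speed(speech, 0.9), perturb_speed(speech, 1.1)]
    noisy = [mix_at_snr(x, noise, snr_db=10.0) for x in pooled]
    print([len(x) for x in noisy])
```

Speed perturbation changes the duration (and pitch) of each utterance, while noise mixing keeps the duration and only lowers the signal-to-noise ratio, so the two are complementary.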
Performance Record
pruned_transducer_stateless7
The following results were obtained by decoding with modified_beam_search:
Evaluation set | dev WER | test WER |
---|---|---|
IHM | 18.92 | 17.40 |
SDM | 31.25 | 32.21 |
MDM (GSS-enhanced) | 21.67 | 22.43 |
See the recipe for details.
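For readers unfamiliar with the metric in the table, word error rate is the word-level edit distance (substitutions, insertions, deletions) between reference and hypothesis, divided by the number of reference words. The sketch below shows the standard corpus-level computation as a percentage; the function names are hypothetical, and the recipe's own scoring may apply text normalization that is not shown here.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (row-by-row dynamic programming)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer_percent(pairs):
    """Corpus-level WER in percent over (reference, hypothesis) string pairs."""
    errors = 0
    ref_len = 0
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        errors += word_edit_distance(ref_words, hyp_words)
        ref_len += len(ref_words)
    if ref_len == 0:
        raise ValueError("reference contains no words")
    return 100.0 * errors / ref_len


if __name__ == "__main__":
    pairs = [
        ("the cat sat on the mat", "the cat sit on mat"),
        ("hello world", "hello world"),
    ]
    print(f"WER: {wer_percent(pairs):.2f}")
```

Errors are summed over the whole evaluation set before dividing, so long utterances weigh more than short ones; averaging per-utterance WER would give a different number.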