AliMeeting
This is an ASR recipe for the AliMeeting corpus. AliMeeting provides recordings from each speaker's headset microphone and from an 8-channel microphone array. We pool data from the following four sources and train a single model on the pooled data:
(i) individual headset microphone (IHM); (ii) IHM with simulated reverb; (iii) single distant microphone (SDM); (iv) GSS-enhanced array microphones.
Speed perturbation and MUSAN noise augmentation are additionally performed on the pooled data.
Performance Record
pruned_transducer_stateless7
The following results were decoded using `modified_beam_search`:
| Evaluation set | eval CER (%) | test CER (%) |
|---|---|---|
| IHM | 9.58 | 11.53 |
| SDM | 23.37 | 25.85 |
| MDM (GSS-enhanced) | 11.82 | 14.22 |
See the recipe for details.
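
For readers unfamiliar with the metric, the CER figures above are character error rates: the character-level edit distance between hypothesis and reference, summed over all utterances and divided by the total number of reference characters. The following is a minimal sketch of that calculation, not the recipe's own scoring code. The function names `edit_distance` and `cer` are hypothetical, and stripping whitespace before scoring is an assumption; the recipe may apply different text normalization.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # character of ref missing from hyp
            insertion = curr[j - 1] + 1  # extra character in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Corpus-level character error rate, in percent."""
    # Assumption: whitespace is not scored, so remove it from both sides.
    refs = ["".join(r.split()) for r in refs]
    hyps = ["".join(h.split()) for h in hyps]
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


# Illustrative (made-up) transcripts: one deletion and one insertion
# against 10 reference characters in total.
refs = ["今天天气很好", "我们开会"]
hyps = ["今天天气好", "我们开会吧"]
print(f"CER: {cer(refs, hyps):.2f}%")
```

Errors are pooled over the whole set before dividing, so long utterances weigh more than short ones; averaging per-utterance rates would generally give a different number.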