---
title: README
emoji: 🚀
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---

pyannote.audio is an open-source toolkit for speaker diarization.

Pretrained pipelines reach state-of-the-art performance on most academic benchmarks and are used in production by dozens of companies.

| Benchmark | v1.1 | v2.1 | v3.1 | Premium |
| --- | --- | --- | --- | --- |
| AISHELL-4 | - | 14.1 | 12.2 | 11.9 |
| AliMeeting (channel 1) | - | 27.4 | 24.4 | 22.5 |
| AMI (IHM) | 29.7 | 18.9 | 18.8 | 16.6 |
| AMI (SDM) | - | 27.1 | 22.4 | 20.9 |
| AVA-AVD | - | - | 50.0 | 39.8 |
| DIHARD 3 (full) | 29.2 | 26.9 | 21.7 | 17.2 |
| MSDWild | - | - | 25.3 | 19.8 |
| REPERE (phase2) | - | 8.2 | 7.8 | 7.6 |
| VoxConverse (v0.3) | 21.5 | 11.2 | 11.3 | 9.4 |

Diarization error rate (in %)
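
To make the gap between two columns easier to read, the relative error reduction can be computed directly from the reported numbers. The snippet below is a minimal sketch, not part of pyannote.audio: the `DER` dictionary and the `relative_reduction` helper are hypothetical names introduced here, and only a few rows of the table are copied in for illustration.

```python
# Diarization error rates (in %) copied from the benchmark table.
# Only a subset of rows is included; add more rows the same way.
DER = {
    "AMI (IHM)": {"v3.1": 18.8, "Premium": 16.6},
    "DIHARD 3 (full)": {"v3.1": 21.7, "Premium": 17.2},
    "VoxConverse (v0.3)": {"v3.1": 11.3, "Premium": 9.4},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction (in %) when going from `baseline` to `improved`."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    for benchmark, scores in DER.items():
        gain = relative_reduction(scores["v3.1"], scores["Premium"])
        print(f"{benchmark:<20} {gain:5.1f} % relative reduction")
```

A relative figure is often more informative than the absolute difference, since the same 2-point gain matters more on a benchmark where the baseline is already low.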

Using one Nvidia Tesla V100 SXM2 GPU and one Intel Cascade Lake 6248 CPU,

  • v2.1 takes around 1m30s to process 1h of audio
  • v3.1 takes around 1m20s to process 1h of audio
  • Premium takes less than 1m00s to process 1h of audio
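
These timings can also be expressed as a real-time factor (processing time divided by audio duration). The following is a small sketch under the assumption that the figures above are taken at face value; `real_time_factor` and `TIMINGS` are illustrative names, and the Premium entry is an upper bound because the source only says "less than 1m00s".

```python
# Processing time (in seconds) for 1 hour of audio, from the list above.
# The Premium value is an upper bound ("less than 1m00s").
TIMINGS = {"v2.1": 90, "v3.1": 80, "Premium": 60}

AUDIO_SECONDS = 3600  # 1 hour of audio


def real_time_factor(processing_seconds: float, audio_seconds: float = AUDIO_SECONDS) -> float:
    """Processing time divided by audio duration (lower is faster)."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    for name, seconds in TIMINGS.items():
        rtf = real_time_factor(seconds)
        print(f"{name}: real-time factor {rtf:.4f}, about {1 / rtf:.0f}x faster than real time")
```

These numbers only hold for the hardware named above; other GPUs and CPUs will give different factors.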