---
title: README
emoji: 🚀
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---

pyannote.audio is an open-source toolkit for speaker diarization.

Pretrained pipelines reach state-of-the-art performance on most academic benchmarks and are used in production by dozens of companies.

| Benchmark | v2.1 | v3.1 | Premium |
| --- | --- | --- | --- |
| AISHELL-4 | 14.1 | 12.2 | 11.9 |
| AliMeeting (channel 1) | 27.4 | 24.4 | 22.5 |
| AMI (IHM) | 18.9 | 18.8 | 16.6 |
| AMI (SDM) | 27.1 | 22.4 | 20.9 |
| AVA-AVD | 66.3 | 50.0 | 39.8 |
| CALLHOME (part 2) | 31.6 | 28.4 | 22.2 |
| DIHARD 3 (full) | 26.9 | 21.7 | 17.2 |
| Earnings21 | 17.0 | 9.4 | 9.0 |
| Ego4D (dev.) | 61.5 | 51.2 | 43.8 |
| MSDWild | 32.8 | 25.3 | 19.8 |
| RAMC | 22.5 | 22.2 | 18.4 |
| REPERE (phase2) | 8.2 | 7.8 | 7.6 |
| VoxConverse (v0.3) | 11.2 | 11.3 | 9.4 |

Diarization error rate (in %, lower is better)
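
To make the gaps between columns easier to compare across benchmarks of very different difficulty, the absolute rates can be turned into relative error reductions. The snippet below is only an illustrative sketch: the `relative_reduction` helper and the `der` dictionary are hypothetical names introduced here, and the three rows are copied from the v3.1 and Premium columns of the table.

```python
# Illustrative only: relative reduction in diarization error rate (DER)
# between two pipelines. The helper and dictionary names are hypothetical.

# (v3.1 DER %, Premium DER %) copied from the benchmark table
der = {
    "AVA-AVD": (50.0, 39.8),
    "DIHARD 3 (full)": (21.7, 17.2),
    "Earnings21": (9.4, 9.0),
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative DER reduction, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


for benchmark, (v31, premium) in der.items():
    gain = relative_reduction(v31, premium)
    print(f"{benchmark}: {gain:.1f}% relative reduction from v3.1 to Premium")
```

A relative figure says how much of the remaining error was removed, which is often easier to compare across datasets than the raw difference in percentage points.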

Processing speed, measured on one NVIDIA Tesla V100 SXM2 GPU and one Intel Cascade Lake 6248 CPU:

- v2.1 takes around 1m30s to process 1h of audio
- v3.1 takes around 1m20s to process 1h of audio
- Premium takes less than 45s to process 1h of audio
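
These timings can also be read as a speed-up over real time. The short sketch below does that arithmetic. The `speedup` helper and the `processing_seconds` dictionary are hypothetical names used for illustration, and the inputs are the approximate figures quoted above, so the results are rough estimates rather than measurements.

```python
# Illustrative arithmetic: convert "time to process 1h of audio" into a
# speed-up factor over real time. Names here are hypothetical.

SECONDS_PER_HOUR = 3600

# Approximate processing time (seconds) per hour of audio, from the list above.
# Premium is quoted as "less than 45s", so 45 is an upper bound on its time.
processing_seconds = {"v2.1": 90, "v3.1": 80, "Premium": 45}


def speedup(seconds_per_hour_of_audio: float) -> float:
    """Return how many times faster than real time the processing runs."""
    return SECONDS_PER_HOUR / seconds_per_hour_of_audio


for name, seconds in processing_seconds.items():
    print(f"{name}: roughly {speedup(seconds):.0f}x faster than real time")
```

Because the Premium figure is an upper bound on processing time, its computed factor should be read as a lower bound on the speed-up.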