Hervé Bredin committed df706b9 (1 parent: f47dcce)

fix: fix README

Files changed (1): README.md (+28 −10)

README.md CHANGED
@@ -17,7 +17,7 @@ license: mit
inference: false
---

- # Pretrained speaker segmentation model

This model relies on `pyannote.audio` 2.0 (which is still in development):

@@ -29,7 +29,7 @@ $ pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip

```python
>>> from pyannote.audio import Inference
- >>> inference = Inference("pyannote/Segmentation")
>>> segmentation = inference("audio.wav")
```

@@ -40,25 +40,30 @@ $ pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip

```python
>>> from pyannote.audio.pipelines import VoiceActivityDetection
>>> HYPER_PARAMETERS = {"onset": 0.5, "offset": 0.5, "min_duration_on": 0.0, "min_duration_off": 0.0}
- >>> pipeline = VoiceActivityDetection(segmentation="pyannote/Segmentation").instantiate(HYPER_PARAMETERS)
>>> vad = pipeline("audio.wav")
```

Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO
DIHARD3         | TODO    | TODO     | TODO              | TODO
VoxConverse     | TODO    | TODO     | TODO              | TODO

-
### Overlapped speech detection

```python
>>> from pyannote.audio.pipelines import OverlappedSpeechDetection
- >>> pipeline = OverlappedSpeechDetection(segmentation="pyannote/Segmentation").instantiate(HYPER_PARAMETERS)
>>> osd = pipeline("audio.wav")
```

Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO

@@ -70,9 +75,12 @@ VoxConverse | TODO | TODO | TODO | TODO

```python
>>> from pyannote.audio.pipelines import Segmentation
- >>> pipeline = Segmentation(segmentation="pyannote/Segmentation").instantiate(HYPER_PARAMETERS)
>>> seg = pipeline("audio.wav")
```

Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------

@@ -84,11 +92,22 @@ VoxConverse | TODO | TODO | TODO | TODO

```python
>>> from pyannote.audio.pipelines import Resegmentation
- >>> pipeline = Resegmentation(segmentation="pyannote/Segmentation", diarization="baseline")
- >>> assert isinstance(baseline, pyannote.core.Annotation)
- >>> resegmented_baseline = pipeline({"audio": "audio.wav", "baseline": baseline})
```

Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO

@@ -97,7 +116,6 @@ VoxConverse | TODO | TODO | TODO | TODO

## Citations

-
```bibtex
@inproceedings{Bredin2020,
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
 
inference: false
---

+ # pyannote.audio // speaker segmentation

This model relies on `pyannote.audio` 2.0 (which is still in development):
 
 
```python
>>> from pyannote.audio import Inference
+ >>> inference = Inference("pyannote/segmentation")
>>> segmentation = inference("audio.wav")
```
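The raw `Inference` output is a matrix of frame-wise activations, one column per speaker track, which downstream pipelines threshold. A toy illustration of reading such a matrix (the shapes, values, and 0.5 threshold here are invented for the example, not taken from the model):

```python
# Toy frame-wise activation matrix: rows = frames, columns = speaker tracks.
# (Values are illustrative; the real model emits one score per frame and track.)
scores = [
    [0.9, 0.1],
    [0.7, 0.6],
    [0.2, 0.8],
]

def active_tracks(scores, threshold=0.5):
    """List, for each frame, the speaker tracks whose score exceeds threshold."""
    return [[t for t, s in enumerate(frame) if s > threshold] for frame in scores]

print(active_tracks(scores))  # [[0], [0, 1], [1]]
```

Frame 1 has two tracks above threshold, which is exactly the kind of frame the overlapped speech pipeline below looks for.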
 
 
```python
>>> from pyannote.audio.pipelines import VoiceActivityDetection
>>> HYPER_PARAMETERS = {"onset": 0.5, "offset": 0.5, "min_duration_on": 0.0, "min_duration_off": 0.0}
+ >>> pipeline = VoiceActivityDetection(segmentation="pyannote/segmentation")
+ >>> pipeline.instantiate(HYPER_PARAMETERS)
>>> vad = pipeline("audio.wav")
```

+ In order to reproduce the results of the paper, one should use the following hyper-parameter values:
+
Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO
DIHARD3         | TODO    | TODO     | TODO              | TODO
VoxConverse     | TODO    | TODO     | TODO              | TODO
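The four hyper-parameters describe how frame-wise speech scores become speech regions: `onset`/`offset` act as a hysteresis threshold pair, while `min_duration_off` fills short gaps and `min_duration_on` drops short regions. A minimal pure-Python sketch of that post-processing (assumed semantics for illustration, not pyannote's actual implementation):

```python
def binarize(scores, frame_duration, onset=0.5, offset=0.5,
             min_duration_on=0.0, min_duration_off=0.0):
    """Turn frame-wise speech scores into (start, end) regions in seconds.

    A region opens when a score rises above `onset` and closes when it falls
    below `offset` (hysteresis). Gaps shorter than `min_duration_off` are
    then filled, and regions shorter than `min_duration_on` are dropped.
    """
    regions, start, active = [], None, False
    for i, score in enumerate(scores):
        if not active and score > onset:
            active, start = True, i * frame_duration
        elif active and score < offset:
            active = False
            regions.append((start, i * frame_duration))
    if active:  # region still open at end of file
        regions.append((start, len(scores) * frame_duration))

    # fill gaps shorter than min_duration_off
    filled = []
    for region in regions:
        if filled and region[0] - filled[-1][1] < min_duration_off:
            filled[-1] = (filled[-1][0], region[1])
        else:
            filled.append(region)

    # drop regions shorter than min_duration_on
    return [(s, e) for (s, e) in filled if e - s >= min_duration_on]

# e.g. six 0.5 s frames yield two speech regions:
print(binarize([0.1, 0.9, 0.8, 0.6, 0.2, 0.9], frame_duration=0.5))
# [(0.5, 2.0), (2.5, 3.0)]
```

Raising `min_duration_off` merges the two regions; raising `min_duration_on` above a region's duration removes it.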
 
 
### Overlapped speech detection

```python
>>> from pyannote.audio.pipelines import OverlappedSpeechDetection
+ >>> pipeline = OverlappedSpeechDetection(segmentation="pyannote/segmentation")
+ >>> pipeline.instantiate(HYPER_PARAMETERS)
>>> osd = pipeline("audio.wav")
```

+ In order to reproduce the results of the paper, one should use the following hyper-parameter values:
+
Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO
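Overlapped speech is where at least two speakers are active at the same time. As an illustrative sketch (not the pipeline's internals): given binary per-speaker activations, a frame belongs to overlap when two or more speakers are on, and the same onset/offset/min-duration post-processing then applies to the resulting overlap scores.

```python
def overlap_frames(activations):
    """Mark frames where >= 2 speakers are simultaneously active.

    `activations` is a list of frames, each a list of 0/1 per-speaker flags
    (hypothetical layout, for illustration only).
    """
    return [1 if sum(frame) >= 2 else 0 for frame in activations]

activations = [
    [1, 0, 0],  # one speaker
    [1, 1, 0],  # two speakers -> overlap
    [1, 1, 1],  # three speakers -> overlap
    [0, 1, 0],
    [0, 0, 0],  # silence
]
print(overlap_frames(activations))  # [0, 1, 1, 0, 0]
```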
 
```python
>>> from pyannote.audio.pipelines import Segmentation
+ >>> pipeline = Segmentation(segmentation="pyannote/segmentation")
+ >>> pipeline.instantiate(HYPER_PARAMETERS)
>>> seg = pipeline("audio.wav")
```

+ In order to reproduce the results of the paper, one should use the following hyper-parameter values:
+
Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
 
```python
>>> from pyannote.audio.pipelines import Resegmentation
+ >>> pipeline = Resegmentation(segmentation="pyannote/segmentation",
+ ...                           diarization="baseline")
+ >>> pipeline.instantiate(HYPER_PARAMETERS)
+ ```
+
+ VBx RTTM files are also provided in this repository for convenience:
+
+ ```python
+ >>> from pyannote.database.utils import load_rttm
+ >>> vbx = load_rttm("/path/to/vbx.rttm")
+ >>> resegmented_vbx = pipeline({"audio": "DH_EVAL_000.wav",
+ ...                             "baseline": vbx["DH_EVAL_000"]})
```
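As the snippet above shows, `load_rttm` returns a dictionary keyed by file URI (hence `vbx["DH_EVAL_000"]`). RTTM itself is a plain-text format with one `SPEAKER` record per line: `SPEAKER <uri> <channel> <start> <duration> <NA> <NA> <speaker> <NA> <NA>`. A minimal stand-in parser, for illustration only (the real `load_rttm` returns `pyannote.core.Annotation` objects rather than tuples):

```python
from collections import defaultdict

def load_rttm_minimal(lines):
    """Parse SPEAKER records into {uri: [(start, end, speaker), ...]}.

    `lines` is any iterable of RTTM lines (an open file works too).
    Non-SPEAKER records and blank lines are skipped.
    """
    annotations = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "SPEAKER":
            continue
        uri, speaker = fields[1], fields[7]
        start, duration = float(fields[3]), float(fields[4])
        annotations[uri].append((start, start + duration, speaker))
    return dict(annotations)

rttm_lines = [
    "SPEAKER DH_EVAL_000 1 0.50 1.25 <NA> <NA> spk01 <NA> <NA>",
    "SPEAKER DH_EVAL_000 1 2.00 0.75 <NA> <NA> spk02 <NA> <NA>",
]
print(load_rttm_minimal(rttm_lines))
# {'DH_EVAL_000': [(0.5, 1.75, 'spk01'), (2.0, 2.75, 'spk02')]}
```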
 
+ In order to reproduce the (VBx) results of the paper, one should use the following hyper-parameter values:
+
Dataset         | `onset` | `offset` | `min_duration_on` | `min_duration_off`
----------------|---------|----------|-------------------|-------------------
AMI Mix-Headset | TODO    | TODO     | TODO              | TODO
 
## Citations

```bibtex
@inproceedings{Bredin2020,
Title = {{pyannote.audio: neural building blocks for speaker diarization}},