Commit c58a656 by ginic · 1 Parent(s): 1b6aef9

Downgrade gradio version to fix table display for examples

Files changed (2)
  1. README.md +16 -16
  2. requirements.txt +1 -1
README.md CHANGED
@@ -9,7 +9,7 @@ description: >-
9
  Alphabet (IPA) in a linguistically motivated way. This is useful when
10
  evaluating speech recognition or orthographic to IPA conversion tasks.
11
  sdk: gradio
12
- sdk_version: 3.50.2
13
  app_file: app.py
14
  pinned: false
15
  ---
@@ -17,8 +17,8 @@ pinned: false
17
  # Metric Card for Phone Errors
18
 
19
  ## Metric Description
20
- Error rates in terms of distance between articulatory phonological features can help understand differences between strings in the International Phonetic Alphabet (IPA) in a linguistically motivated way.
21
- This is useful when evaluating speech recognition or orthographic to IPA conversion tasks. These are Levenshtein distances for comparing strings where the smallest unit of measurement is based on phones or articulatory phonological features, rather than Unicode characters.
22
 
23
  ## How to Use
24
 
@@ -31,7 +31,7 @@ phone_errors.compute(predictions=["bob", "ði"], references=["pop", "ðə"])
31
  ### Inputs
32
  - **predictions** (`list` of `str`): Transcriptions to score.
33
  - **references** (`list` of `str`) : Reference strings serving as ground truth.
34
- - **feature_model** (`str`): Set which panphon.distance.Distance feature parsing model is used, choose from `"strict"`, `"permissive"`, `"segment"`. Defaults to `"segment"`.
35
  - **is_normalize_pfer** (`bool`): Set to `True` to normalize PFER by the largest number of phones in the prediction-reference pair. Defaults to `False`. When this is used, PFER will no longer obey the triangle inequality.
36
 
37
 
@@ -39,9 +39,9 @@ phone_errors.compute(predictions=["bob", "ði"], references=["pop", "ðə"])
39
  The computation returns a dictionary with the following keys and values:
40
  - **phone_error_rates** (`list` of `float`): Phone error rate (PER) gives edit distance in terms of phones for each prediction-reference pair, rather than Unicode characters, since phones can consist of multiple characters. It is normalized by the number of phones of the reference string. The result will have the same length as the input prediction and reference lists.
41
  - **mean_phone_error_rate** (`float`): Overall mean of PER.
42
- - **phone_feature_error_rates** (`list` of `float`): Phone feature error rate (PFER) is Levenshtein distance between strings where distance between individual phones is computed using Hamming distance between phonetic features for each prediction-reference pair. By default it is a metric that obeys the triangle inequality, but can also be normalized by number of phones.
43
  - **mean_phone_feature_error_rate** (`float`): Overall mean of PFER.
44
- - **feature_error_rates** (`list` of `float`): Feature error rate (FER) is the edit distance in terms of articulatory features normalized by the number of phones in the reference, computed for each prediction-reference pair.
45
  - **mean_feature_error_rate** (`float`): Overall mean of FER.
46
 
47
 
@@ -50,27 +50,27 @@ The computation returns a dictionary with the following key and values:
50
 
51
  ### Examples
52
 
53
- Simplest use case to compute phone error rates between two IPA strings:
54
  ```python
55
  >>> phone_errors.compute(predictions=["bob", "ði", "spin"], references=["pop", "ðə", "spʰin"])
56
- {'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215,
57
- 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], 'mean_phone_feature_error_rate': 0.08333333333333333,
58
  'feature_error_rates': [0.027777777777777776, 0.0625, 0.30208333333333337], 'mean_feature_error_rate': 0.13078703703703706}
59
  ```
60
 
61
- Normalize phone feature error rate by the length of the reference string:
62
  ```python
63
  >>> phone_errors.compute(predictions=["bob", "ði"], references=["pop", "ðə"], is_normalize_pfer=True)
64
- {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333,
65
- 'phone_feature_error_rates': [0.027777777777777776, 0.0625], 'mean_phone_feature_error_rate': 0.04513888888888889,
66
  'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
67
  ```
68
 
69
  Error rates may be greater than 1.0 if the reference string is shorter than the prediction string:
70
  ```python
71
  >>> phone_errors.compute(predictions=["bob"], references=["po"])
72
- {'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0,
73
- 'phone_feature_error_rates': [1.0416666666666667], 'mean_phone_feature_error_rate': 1.0416666666666667,
74
  'feature_error_rates': [0.020833333333333332], 'mean_feature_error_rate': 0.020833333333333332}
75
  ```
76
 
@@ -85,7 +85,7 @@ ValueError: one or more references are empty strings
85
 
86
  ## Limitations and Bias
87
  - Phone error rate and feature error rate can be greater than 1.0 if the reference string is shorter than the prediction string.
88
- - Since these are error rates, not edit distances, the reference strings cannot be empty.
89
 
90
  ## Citation
91
  ```bibtex
@@ -105,6 +105,6 @@ ValueError: one or more references are empty strings
105
  ```
106
 
107
  ## Further References
108
- - PER and PFER are used as evaluation metrics in [Universal Automatic Phonetic Transcription into the International Phonetic Alphabet (Taguchi et al.)](https://www.isca-archive.org/interspeech_2023/taguchi23_interspeech.html)
109
  - Pierce Darragh's blog post [Introduction to Phonology, Part 3: Phonetic Features](https://pdarragh.github.io/blog/2018/04/26/intro-to-phonology-pt-3/) gives an overview of phonetic features for speech sounds.
110
  - [panphon Github repository](https://github.com/dmort27/panphon)
 
9
  Alphabet (IPA) in a linguistically motivated way. This is useful when
10
  evaluating speech recognition or orthographic to IPA conversion tasks.
11
  sdk: gradio
12
+ sdk_version: 3.19.1
13
  app_file: app.py
14
  pinned: false
15
  ---
 
17
  # Metric Card for Phone Errors
18
 
19
  ## Metric Description
20
+ Error rates in terms of distance between articulatory phonological features can help understand differences between strings in the International Phonetic Alphabet (IPA) in a linguistically motivated way.
21
+ This is useful when evaluating speech recognition or orthographic to IPA conversion tasks. These are Levenshtein distances for comparing strings where the smallest unit of measurement is based on phones or articulatory phonological features, rather than Unicode characters.
22
 
23
  ## How to Use
24
 
 
31
  ### Inputs
32
  - **predictions** (`list` of `str`): Transcriptions to score.
33
  - **references** (`list` of `str`) : Reference strings serving as ground truth.
34
+ - **feature_model** (`str`): Set which panphon.distance.Distance feature parsing model is used, choose from `"strict"`, `"permissive"`, `"segment"`. Defaults to `"segment"`.
35
  - **is_normalize_pfer** (`bool`): Set to `True` to normalize PFER by the largest number of phones in the prediction-reference pair. Defaults to `False`. When this is used, PFER will no longer obey the triangle inequality.
36
 
37
 
 
39
  The computation returns a dictionary with the following keys and values:
40
  - **phone_error_rates** (`list` of `float`): Phone error rate (PER) gives edit distance in terms of phones for each prediction-reference pair, rather than Unicode characters, since phones can consist of multiple characters. It is normalized by the number of phones of the reference string. The result will have the same length as the input prediction and reference lists.
41
  - **mean_phone_error_rate** (`float`): Overall mean of PER.
42
+ - **phone_feature_error_rates** (`list` of `float`): Phone feature error rate (PFER) is Levenshtein distance between strings where distance between individual phones is computed using Hamming distance between phonetic features for each prediction-reference pair. By default it is a metric that obeys the triangle inequality, but can also be normalized by number of phones.
43
  - **mean_phone_feature_error_rate** (`float`): Overall mean of PFER.
44
+ - **feature_error_rates** (`list` of `float`): Feature error rate (FER) is the edit distance in terms of articulatory features normalized by the number of phones in the reference, computed for each prediction-reference pair.
45
  - **mean_feature_error_rate** (`float`): Overall mean of FER.
46
 
47
 
 
50
 
51
  ### Examples
52
 
53
+ Simplest use case to compute phone error rates between two IPA strings:
54
  ```python
55
  >>> phone_errors.compute(predictions=["bob", "ði", "spin"], references=["pop", "ðə", "spʰin"])
56
+ {'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215,
57
+ 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], 'mean_phone_feature_error_rate': 0.08333333333333333,
58
  'feature_error_rates': [0.027777777777777776, 0.0625, 0.30208333333333337], 'mean_feature_error_rate': 0.13078703703703706}
59
  ```
60
 
61
+ Normalize phone feature error rate by the length of the reference string:
62
  ```python
63
  >>> phone_errors.compute(predictions=["bob", "ði"], references=["pop", "ðə"], is_normalize_pfer=True)
64
+ {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333,
65
+ 'phone_feature_error_rates': [0.027777777777777776, 0.0625], 'mean_phone_feature_error_rate': 0.04513888888888889,
66
  'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
67
  ```
68
 
69
  Error rates may be greater than 1.0 if the reference string is shorter than the prediction string:
70
  ```python
71
  >>> phone_errors.compute(predictions=["bob"], references=["po"])
72
+ {'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0,
73
+ 'phone_feature_error_rates': [1.0416666666666667], 'mean_phone_feature_error_rate': 1.0416666666666667,
74
  'feature_error_rates': [0.020833333333333332], 'mean_feature_error_rate': 0.020833333333333332}
75
  ```
76
 
 
85
 
86
  ## Limitations and Bias
87
  - Phone error rate and feature error rate can be greater than 1.0 if the reference string is shorter than the prediction string.
88
+ - Since these are error rates, not edit distances, the reference strings cannot be empty.
89
 
90
  ## Citation
91
  ```bibtex
 
105
  ```
106
 
107
  ## Further References
108
+ - PER and PFER are used as evaluation metrics in [Universal Automatic Phonetic Transcription into the International Phonetic Alphabet (Taguchi et al.)](https://www.isca-archive.org/interspeech_2023/taguchi23_interspeech.html)
109
  - Pierce Darragh's blog post [Introduction to Phonology, Part 3: Phonetic Features](https://pdarragh.github.io/blog/2018/04/26/intro-to-phonology-pt-3/) gives an overview of phonetic features for speech sounds.
110
  - [panphon Github repository](https://github.com/dmort27/panphon)
requirements.txt CHANGED
@@ -1,2 +1,2 @@
1
- evaluate==0.4.3
2
  git+https://github.com/dmort27/panphon.git
 
1
+ evaluate
2
  git+https://github.com/dmort27/panphon.git
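
For readers of this diff, the PER definition in the README (edit distance over phones, normalized by the number of phones in the reference) can be illustrated with a minimal sketch. This is not the metric's actual implementation: the function names `phone_edit_distance` and `phone_error_rate` are hypothetical, and the inputs are assumed to be already segmented into phones (the README indicates the real metric relies on panphon for feature parsing). The empty-reference check mirrors the `ValueError` message visible in the hunk headers.

```python
from typing import List, Sequence


def phone_edit_distance(pred: Sequence[str], ref: Sequence[str]) -> int:
    """Levenshtein distance where each sequence item is one phone (hypothetical helper)."""
    # prev[j] is the distance between the pred prefix handled so far and ref[:j].
    prev: List[int] = list(range(len(ref) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            curr.append(min(prev[j] + 1,          # drop a phone from the prediction
                            curr[j - 1] + 1,      # add a phone from the reference
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def phone_error_rate(pred: Sequence[str], ref: Sequence[str]) -> float:
    """Edit distance in phones divided by the reference length (hypothetical helper)."""
    if not ref:
        # An error rate is undefined for an empty reference.
        raise ValueError("one or more references are empty strings")
    return phone_edit_distance(pred, ref) / len(ref)


# Assumed segmentations of the README's first example; "pʰ" counts as a single phone.
pairs = [
    (["b", "o", "b"], ["p", "o", "p"]),
    (["ð", "i"], ["ð", "ə"]),
    (["s", "p", "i", "n"], ["s", "pʰ", "i", "n"]),
]
rates = [phone_error_rate(p, r) for p, r in pairs]
mean_rate = sum(rates) / len(rates)
print(rates, mean_rate)
```

Under these assumed segmentations, the per-pair rates come out as 2/3, 0.5 and 0.25, in line with the `phone_error_rates` shown in the first README example. Treating `pʰ` as one phone is what keeps the third pair at one substitution out of four phones rather than an insertion at the character level.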