omogr committed on
Commit
98a43a7
·
verified ·
1 Parent(s): 410ef56

Update README.md

Files changed (1)
  1. README.md +141 -3
README.md CHANGED
@@ -1,3 +1,141 @@
1
- ---
2
- license: cc-by-nc-sa-4.0
3
- ---
1
+
2
+
3
+
4
+ # Omogre
5
+
6
+ ## Russian Accentuator and IPA Transcriptor
7
+
8
+ A [`Python 3`](https://www.python.org/) library for automatic stress placement and [IPA transcription](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) of Russian text.
9
+
10
+ ## Dependencies
11
+
12
+ Installing the library also installs [`PyTorch`](https://pytorch.org/) and [`NumPy`](https://numpy.org/). [`tqdm`](https://tqdm.github.io/) and [`requests`](https://pypi.org/project/requests/) are installed as well; they are used for model downloading.
13
+
14
+ ## Installation
15
+
16
+ ### Using Git
17
+
18
+ ```bash
19
+ pip install git+https://github.com/omogr/omogre.git
20
+ ```
21
+
22
+ ### Using pip
23
+
24
+ Download the code from [GitHub](https://github.com/omogr/omogre). In the directory containing [`setup.py`](https://github.com/omogr/omogre/blob/main/setup.py), run:
25
+
26
+ ```bash
27
+ pip install -e .
28
+ ```
29
+
30
+ ### Manually
31
+
32
+ Download the code from [GitHub](https://github.com/omogr/omogre). Install [`PyTorch`](https://pytorch.org/), [`NumPy`](https://numpy.org/), [`tqdm`](https://tqdm.github.io/), and [`requests`](https://pypi.org/project/requests/). Then run [`test.py`](https://github.com/omogr/omogre/blob/main/test.py).
33
+
34
+ ## Model downloading
35
+
36
+ By default, the model data is downloaded the first time the library is run. The script [`download_data.py`](https://github.com/omogr/omogre/blob/main/download_data.py) can also be used to download this data.
37
+
38
+ You can specify the directory where the model data is stored with the `data_path` parameter. If the data already exists in that directory, it is not downloaded again.
39
+
40
+ ## Example
41
+
42
+ See the script [`test.py`](https://github.com/omogr/omogre/blob/main/test.py):
43
+
44
+ ```python
45
+ from omogre import Accentuator, Transcriptor
46
+
47
+ # Data will be downloaded to the 'omogre_data' directory
48
+ transcriptor = Transcriptor(data_path='omogre_data')
49
+ accentuator = Accentuator(data_path='omogre_data')
50
+
51
+ sentence_list = ['стены замка']
52
+
53
+ print('transcriptor', transcriptor(sentence_list))
54
+ print('accentuator', accentuator(sentence_list))
55
+
56
+ # Equivalent calls using explicit method names
57
+ print('transcriptor.transcribe', transcriptor.transcribe(sentence_list))
58
+ print('accentuator.accentuate', accentuator.accentuate(sentence_list))
59
+
60
+ print('transcriptor.accentuate', transcriptor.accentuate(sentence_list))
61
+ ```
62
+
63
+ ## Class Parameters
64
+
65
+ ### Transcriptor
66
+
67
+ All initialization parameters for the class are optional.
68
+
69
+ ```python
70
+ class Transcriptor(data_path: str = None,
71
+ download: bool = True,
72
+ device_name: str = None,
73
+ punct: str = '.,!?')
74
+ ```
75
+
76
+ - `data_path`: Directory where the model should be located.
77
+ - `device_name`: Parameter defining GPU usage. Corresponds to the initialization parameter of [torch.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device). Valid values include `"cpu"`, `"cuda"`, `"cuda:0"`, etc. Defaults to `"cuda"` if GPU is available, otherwise `"cpu"`.
78
+ - `punct`: String of non-letter characters that are carried over from the source text to the transcription. Default is `'.,!?'`.
79
+ - `download`: Whether to download the model from the internet if not found in `data_path`. Default is `True`.
80
+
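The default for `device_name` described above can be sketched as follows. This is only an illustration of the documented rule, not the library's code: `resolve_device_name` and its `cuda_available` argument are hypothetical names, and in practice the flag would come from `torch.cuda.is_available()`.

```python
from typing import Optional


def resolve_device_name(device_name: Optional[str] = None,
                        cuda_available: bool = False) -> str:
    """Hypothetical helper: pick the device string passed to torch.device.

    An explicit device_name always wins. Otherwise "cuda" is chosen when
    a GPU is available and "cpu" when it is not.
    """
    if device_name is not None:
        return device_name
    return "cuda" if cuda_available else "cpu"


# No explicit device and no GPU: fall back to the CPU.
print(resolve_device_name())
# An explicit value is returned unchanged.
print(resolve_device_name("cuda:0", cuda_available=True))
```

Passing `device_name="cpu"` explicitly is a simple way to keep the models off the GPU even when one is present.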
81
+ Class methods:
82
+
83
+ ```python
84
+ accentuate(sentence_list: list) -> list
85
+ transcribe(sentence_list: list) -> list
86
+ ```
87
+
88
+ `accentuate` places stresses, and `transcribe` performs transcription. Both methods take a list of strings and return a list of strings.
89
+
90
+ ### Accentuator
91
+
92
+ The `Accentuator` class places stresses in the same way as `Transcriptor`, but it does not load the transcription data, which reduces initialization time and memory usage.
93
+
94
+ All initialization parameters are optional, with the same meanings as for `Transcriptor`.
95
+
96
+ ```python
97
+ class Accentuator(data_path: str = None,
98
+ download: bool = True,
99
+ device_name: str = None)
100
+ ```
101
+
102
+ - `data_path`: Directory where the model should be located.
103
+ - `device_name`: Parameter for GPU usage. See above for details.
104
+ - `download`: Whether to download the model if not found. Default is `True`.
105
+
106
+ Class method:
107
+
108
+ ```python
109
+ accentuate(sentence_list: list) -> list
110
+ ```
111
+
112
+ ## Usage Example
113
+
114
+ The script [`ruslan_markup.py`](https://github.com/omogr/omogre/blob/main/ruslan_markup.py) places stresses and generates transcriptions for markup files of the acoustic corpora [`ruslan`](http://dataset.sova.ai/SOVA-TTS/ruslan/ruslan_dataset.tar) and [`natasha`](http://dataset.sova.ai/SOVA-TTS/natasha/natasha_dataset.tar).
115
+
116
+ These markup files already contain stresses that were [placed manually](https://habr.com/ru/companies/ashmanov_net/articles/528296/).
117
+
118
+ The script [`ruslan_markup.py`](https://github.com/omogr/omogre/blob/main/ruslan_markup.py) generates its own stress placement for these files, so the accuracy of the automatic stress placement can be evaluated against the manual markup.
119
+
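One way such an evaluation could be done is a word-level comparison of the manual and the generated markup. The sketch below is not taken from `ruslan_markup.py`; `stress_accuracy` is a hypothetical helper, and it assumes that both markups split into the same words and use the same stress marker (the `+` in the example is an assumed convention).

```python
from typing import List


def stress_accuracy(reference: List[str], predicted: List[str]) -> float:
    """Hypothetical metric: share of words whose stress markup matches.

    Lines are compared pairwise and split on whitespace. If a pair of
    lines has a different number of words, all reference words of that
    line are counted as errors.
    """
    total = 0
    correct = 0
    for ref_line, pred_line in zip(reference, predicted):
        ref_words = ref_line.split()
        pred_words = pred_line.split()
        total += len(ref_words)
        if len(ref_words) != len(pred_words):
            continue  # word counts differ: nothing is counted as correct
        correct += sum(r == p for r, p in zip(ref_words, pred_words))
    return correct / total if total else 0.0


manual = ['ст+ены з+амка']     # manually stressed line (assumed format)
generated = ['ст+ены замк+а']  # automatically stressed line (assumed format)
print(stress_accuracy(manual, generated))
```

Exact string comparison is deliberately strict: any difference in a word, not only the stress position, counts as an error.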
120
+ ## Context Awareness and Other Features
121
+
122
+ ### Stresses
123
+
124
+ Stresses are placed with context taken into account. Very long strings (more than 510 tokens for the current model) are processed without context, so stresses in them are placed only where they can be determined without it.
125
+
126
+ Stresses are also placed in one-syllable words. This may look unusual, but it simplifies the subsequent transcription step.
127
+
128
+ ### Transcription
129
+
130
+ During transcription, extraneous characters are filtered out. The `punct` parameter specifies which non-letter characters are kept; by default, these are the four punctuation marks `.,!?`. Transcription is determined word by word, without context. The following symbols are used for transcription:
131
+
132
+ ```
133
+ ʲ`ɪətrsɐnjvmapkɨʊleɫdizofʂɕbɡxːuʐæɵʉɛ
134
+ ```
135
+
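The filtering described in this section can be illustrated with a small sketch. It is not the library's implementation: `filter_text` is a hypothetical helper that assumes letters, whitespace, and the characters listed in `punct` are kept and everything else is dropped. The library itself may treat digits, hyphens, or stress marks differently.

```python
def filter_text(text: str, punct: str = '.,!?') -> str:
    """Hypothetical filter: keep letters, whitespace, and `punct` characters."""
    kept = [ch for ch in text if ch.isalpha() or ch.isspace() or ch in punct]
    # Collapse the whitespace runs left behind by removed characters.
    return ' '.join(''.join(kept).split())


# Brackets and digits are dropped; the comma and '!' are kept by default.
print(filter_text('Привет, мир! (тест) 123'))
# With an empty `punct`, all punctuation is dropped.
print(filter_text('Привет, мир! (тест) 123', punct=''))
```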
136
+ ## Feedback
137
+ Email for questions, comments, and suggestions: `[email protected]`.
138
+
139
+ ---
140
+ license: cc-by-nc-sa-4.0
141
+ ---