Update README.md
Note that it runs only with 16-bit WAV files, so make sure to convert your input first, e.g. with ffmpeg:
```
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```

### Benchmark

We measure the inference speed with four different Japanese speech audio samples on a MacBook Pro with the following spec:

- Apple M2 Pro
- 32GB memory
- 14-inch, 2023
- macOS Sonoma Version 14.4.1 (23E224)

| audio duration (min) | inference time (sec) |
|----------------------|----------------------|
| 50.3                 | 581                  |
| 5.6                  | 41                   |
| 4.9                  | 30                   |
| 5.6                  | 35                   |
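The numbers above correspond to a real-time factor (seconds of compute per second of audio) of roughly 0.1–0.2, i.e. transcription runs several times faster than real time. A small sketch of that arithmetic:

```python
# Benchmark rows from the table above: (audio duration in minutes, inference time in seconds).
benchmarks = [(50.3, 581), (5.6, 41), (4.9, 30), (5.6, 35)]

for duration_min, inference_sec in benchmarks:
    # Real-time factor: seconds of compute per second of audio (lower is faster).
    rtf = inference_sec / (duration_min * 60)
    print(f"{duration_min:5.1f} min audio -> {inference_sec:4d} s, RTF ~ {rtf:.2f}")
```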

### Quantized Model

To use the quantized model, download the quantized GGML weights:

Run inference on the sample audio:

```
make -j && ./main -m models/ggml-kotoba-whisper-v1.0-q5_0.bin -f sample_ja_speech.wav --output-file transcription.quantized --output-json
```
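The `--output-json` flag writes the transcript as structured JSON next to the text output. A sketch of pulling the segment texts back out; the layout assumed here (a top-level `"transcription"` array whose entries carry `"text"` and millisecond `"offsets"`) is an assumption based on whisper.cpp's JSON output, and the sample data is made up:

```python
import json

# Made-up sample in the assumed whisper.cpp --output-json layout.
raw = """
{
  "transcription": [
    {"offsets": {"from": 0, "to": 4000}, "text": " 一つ目のセグメント。"},
    {"offsets": {"from": 4000, "to": 9000}, "text": " 二つ目のセグメント。"}
  ]
}
"""

def join_segments(document: dict) -> str:
    # Concatenate the per-segment text into one transcript string.
    return "".join(seg["text"].strip() for seg in document.get("transcription", []))

transcript = join_segments(json.loads(raw))
print(transcript)
```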

Note that the benchmark results are almost identical to those of the raw, non-quantized model weights.

## Model Details