asahi417 committed on
Commit c5bbccf · verified · 1 Parent(s): 649f58a

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -40,6 +40,22 @@ Note that it runs only with 16-bit WAV files, so make sure to convert your input
  ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
  ```
 
+ ### Benchmark
+ We measured the inference speed for four different Japanese speech recordings on a MacBook Pro with the following spec:
+ - Apple M2 Pro
+ - 32GB
+ - 14-inch, 2023
+ - macOS Sonoma Version 14.4.1 (23E224)
+
+
+ | audio duration (min) | inference time (sec) |
+ |----------------------|----------------------|
+ | 50.3                 | 581                  |
+ | 5.6                  | 41                   |
+ | 4.9                  | 30                   |
+ | 5.6                  | 35                   |
+
+
  ### Quantized Model
  To use the quantized model, download the quantized GGML weights:
 
@@ -52,6 +68,8 @@ Run inference on the sample audio:
  make -j && ./main -m models/ggml-kotoba-whisper-v1.0-q5_0.bin -f sample_ja_speech.wav --output-file transcription.quantized --output-json
  ```
 
+ Note that the benchmark results are almost identical to those of the raw non-quantized model weights.
+
 
  ## Model Details
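
The benchmark timings added above could be reproduced with the whisper.cpp `main` binary and the shell's built-in `time`. The sketch below is illustrative only: the audio file names are placeholders, and the non-quantized weight path `models/ggml-kotoba-whisper-v1.0.bin` is an assumption inferred from the quantized file name used in the README.

```bash
# Illustrative sketch: time transcription of several 16 kHz, 16-bit mono WAV files.
# File names are placeholders; replace them with your own recordings.
for f in audio_50min.wav audio_5min_a.wav audio_5min_b.wav audio_5min_c.wav; do
  echo "== $f =="
  # `time` reports the wall-clock duration of the full transcription run.
  time ./main -m models/ggml-kotoba-whisper-v1.0.bin -l ja -f "$f" > /dev/null
done
```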
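
If the q5_0 weights were produced locally rather than downloaded, whisper.cpp's `quantize` tool could be used. A minimal sketch, assuming the tool has been built alongside `main` and that the full-precision GGML weights sit at the hypothetical path `models/ggml-kotoba-whisper-v1.0.bin`:

```bash
# Illustrative sketch: derive q5_0 weights from the full-precision GGML file.
# Arguments: <input weights> <output weights> <quantization type>
./quantize models/ggml-kotoba-whisper-v1.0.bin models/ggml-kotoba-whisper-v1.0-q5_0.bin q5_0
```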