Update README.md
README.md

Also, currently whisper.cpp and faster-whisper support the sequential long-form decoding, and only the Hugging Face pipeline supports the [chunked long-form decoding](https://huggingface.co/distil-whisper/distil-large-v3#chunked-long-form), which we empirically found better than the sequential long-form decoding.

### Quantized Model
To use the quantized model, download the quantized GGML weights:

```bash
wget https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0-ggml/resolve/main/ggml-kotoba-whisper-v1.0-q5_0.bin -P ./models
```

Run inference on the sample audio:

```bash
make -j && ./main -m models/ggml-kotoba-whisper-v1.0-q5_0.bin -f sample_ja_speech.wav --output-file transcription.quantized --output-json
```
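
The `--output-json` flag writes the result to `transcription.quantized.json`. As a quick way to read it back, the sketch below pulls the segment text out of that file; it is not part of the upstream instructions, and it assumes `jq` is installed and that the JSON layout matches current whisper.cpp builds, where segments sit under a top-level `transcription` array:

```bash
# Hypothetical helper: print each transcribed segment from the JSON
# emitted by ./main above (assumes whisper.cpp's "transcription" schema).
jq -r '.transcription[].text' transcription.quantized.json
```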

Note that the benchmark results are almost identical to those of the raw, non-quantized model weights.
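
To spot-check that claim yourself, you can transcribe the same sample with both models and diff the text. This is a minimal sketch, assuming the full-precision `ggml-kotoba-whisper-v1.0.bin` (produced by the conversion steps below) is also available under `./models` and that `jq` is installed:

```bash
# Hypothetical comparison, not part of the original README:
# run the same sample through the full-precision weights...
./main -m models/ggml-kotoba-whisper-v1.0.bin -f sample_ja_speech.wav --output-file transcription.full --output-json
# ...then diff the segment text of the two runs.
diff <(jq -r '.transcription[].text' transcription.full.json) \
     <(jq -r '.transcription[].text' transcription.quantized.json)
```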

### Conversion details
The original model was converted with the following command:

```bash
# download the original model
git clone https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0

# convert to ggml
python3 ./convert-h5-to-ggml.py ./kotoba-whisper-v1.0/ ../../whisper .
mv ggml-model.bin ggml-kotoba-whisper-v1.0

# quantize ggml model
cd ../
./quantize models/ggml-kotoba-whisper-v1.0.bin models/ggml-kotoba-whisper-v1.0-q5_0.bin q5_0
```
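
As a rough sanity check on the quantization step, comparing file sizes is enough: the `q5_0` file should come out much smaller than the unquantized ggml weights (around a third the size if the conversion produced fp16 tensors). The listing below is an illustrative extra, not part of the original instructions:

```bash
# Optional check: the q5_0 file should be markedly smaller than the
# full-precision ggml weights it was quantized from.
ls -lh models/ggml-kotoba-whisper-v1.0.bin models/ggml-kotoba-whisper-v1.0-q5_0.bin
```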

## Model Details

For more information about kotoba-whisper-v1.0, refer to the original [model card](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0).