For instance, the [transformers.AutomaticSpeechRecognitionPipeline.__call__] method has a return_timestamps parameter which sounds promising for subtitling videos: | |
transcriber = pipeline(model="openai/whisper-large-v2", return_timestamps=True) | |
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac") | |
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed. |