import requests
import time
import json
import gradio as gr
secret_key = "6d74a4fac397423a8bd3011180b9b979"

# retrieve transcription results for the task
def get_results(config):
  # endpoint to check status of the transcription task
  endpoint = "https://api.speechtext.ai/results?"
  # use a loop to check if the task is finished
  while True:
    results = requests.get(endpoint, params=config).json()
    if "status" not in results:
      break
    # print("Task status: {}".format(results["status"]))
    if results["status"] == 'failed':
      print("The task has failed: {}".format(results))
      break
    if results["status"] == 'finished':
      break
    # sleep for 15 seconds if the task has the status - 'processing'
    time.sleep(15)
  return results

# transcribe an audio file with SpeechText.AI and return the transcript text
def spt(audio_file):
  # load the audio into memory
  with open(audio_file, mode="rb") as file:
    post_body = file.read()

  # endpoint to start a transcription task
  endpoint = "https://api.speechtext.ai/recognize?"
  header = {'Content-Type': "application/octet-stream"}

  # transcription task options
  config = {
    "key" : secret_key,
    "language" : "en-US",
    "punctuation" : True,
    "format" : "m4a"
  }

  # send an audio transcription request
  r = requests.post(endpoint, headers = header, params = config, data = post_body).json()

  # get the id of the speech recognition task
  task = r["id"]
  # print("Task ID: {}".format(task))

  # get transcription results, summary, and highlights
  config = {
    "key" : secret_key,
    "task" : task,
    "summary" : True,
    "summary_size" : 15,
    "highlights" : True,
    "max_keywords" : 10
  }

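  transcription = get_results(config)
  # a failed task may come back without a transcript, so avoid a KeyError
  if "results" not in transcription or "transcript" not in transcription["results"]:
    return "Transcription failed: {}".format(transcription)
  # strip the <kw> keyword-highlight tags from the transcript
  transcript = transcription['results']['transcript'].replace('<kw>', '').replace('</kw>', '')
  return transcript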

# record audio from the microphone and show the transcript as text
k = gr.Interface(fn=spt, inputs=gr.Audio(source="microphone", type="filepath"), outputs="text")
k.launch()