DocWolle committed · Commit e1de7f2 · verified · 1 Parent(s): 18ed5c7

Update README.md


Explain how to create models for a certain language

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -36,3 +36,18 @@ pipeline_tag: automatic-speech-recognition
         )
         return {"sequences": outputs["sequences"]}
 
+
+
+
+To force transcription in a specific language, set the first forced decoder ID to that language's token ID, as shown below:
+
+    def transcribe(self, input_features):
+        outputs = self.model.generate(
+            input_features,
+            max_new_tokens=450,  # change as needed
+            return_dict_in_generate=True,
+            forced_decoder_ids=[[1, 50261], [2, 50359], [3, 50363]],  # force German (50261), the transcribe task (50359), and no timestamps (50363)
+        )
+        return {"sequences": outputs["sequences"]}
+
+(Language token IDs are listed here: https://github.com/woheller69/whisperIME/blob/master/app/src/main/java/com/whispertflite/utils/InputLang.java)
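
The forced decoder IDs in the added snippet follow a fixed pattern: position 1 holds the language token, position 2 the task token, and position 3 the no-timestamps token. A minimal sketch of building that list for other languages is shown below. The helper `build_forced_decoder_ids` and the `LANGUAGE_TOKEN_IDS` table are hypothetical names introduced for illustration and are not part of the commit. The token IDs are assumed to match the multilingual Whisper vocabulary used in the diff, so check them against the linked list or your tokenizer before relying on them.

```python
# Assumed token IDs for the multilingual Whisper vocabulary used in the diff.
# Only "de" (50261) appears in the commit; the other entries are illustrative
# and should be verified against the linked language list or your tokenizer.
LANGUAGE_TOKEN_IDS = {
    "en": 50259,
    "de": 50261,
    "es": 50262,
    "fr": 50265,
}
TRANSCRIBE_TOKEN_ID = 50359      # task token: transcribe
NO_TIMESTAMPS_TOKEN_ID = 50363   # suppress timestamp tokens


def build_forced_decoder_ids(language):
    """Return [position, token_id] pairs in the shape passed to forced_decoder_ids."""
    if language not in LANGUAGE_TOKEN_IDS:
        raise ValueError(f"no token ID listed for language {language!r}")
    return [
        [1, LANGUAGE_TOKEN_IDS[language]],  # position 1: language token
        [2, TRANSCRIBE_TOKEN_ID],           # position 2: task token
        [3, NO_TIMESTAMPS_TOKEN_ID],        # position 3: no timestamps
    ]


print(build_forced_decoder_ids("de"))  # [[1, 50261], [2, 50359], [3, 50363]]
```

The result can be passed as `forced_decoder_ids=build_forced_decoder_ids("de")` in the `generate` call from the diff. Keeping the IDs in one table avoids scattering magic numbers through the export script.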