TencentGameMate/chinese-hubert-large

Feature Extraction


TencentGameMate committed on Jun 6, 2022

Commit 229b61b · 1 Parent(s): 63cbc61

Update README.md

Files changed (1)

README.md +45 -0


@@ -1,3 +1,48 @@
 ---
 license: mit
 ---

+This model does not have a tokenizer as it was pretrained on audio alone.
+To use this model for speech recognition, a tokenizer must be created and the model must be fine-tuned on labeled text data.
+Required Python package:
+transformers==4.16.2
+```python
+import torch
+import soundfile as sf
+from transformers import (
+    Wav2Vec2FeatureExtractor,
+    HubertModel,
+)
+
+model_path = ""  # local path or Hub ID of the model
+wav_path = ""  # path to the input audio file
+# Use the GPU when available; the half-precision calls below are intended for GPU.
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_path)
+model = HubertModel.from_pretrained(model_path)
+# for pretrain: Wav2Vec2ForPreTraining
+# model = Wav2Vec2ForPreTraining.from_pretrained(model_path)
+model = model.to(device)
+model = model.half()
+model.eval()
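+wav, sr = sf.read(wav_path)
+# Passing sampling_rate lets the feature extractor reject audio at an unexpected rate.
+input_values = feature_extractor(wav, sampling_rate=sr, return_tensors="pt").input_values
+input_values = input_values.half()
+input_values = input_values.to(device)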
+with torch.no_grad():
+    outputs = model(input_values)
+    last_hidden_state = outputs.last_hidden_state
+```
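
The `last_hidden_state` produced by the README snippet has shape `(batch, frames, hidden_size)`. As a rough sketch of where `frames` comes from, the helper below applies the standard unpadded 1-D convolution length formula layer by layer. The kernel and stride tuples are an assumption: they are the `HubertConfig` defaults in `transformers`, so this checkpoint's `config.json` should be checked before relying on them. The 16 kHz sample rate is likewise assumed, and `hubert_num_frames` is a hypothetical name introduced only for illustration.

```python
# Assumed conv feature-encoder layout (HubertConfig defaults in transformers);
# verify against the checkpoint's config.json before relying on it.
CONV_KERNEL = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDE = (5, 2, 2, 2, 2, 2, 2)


def hubert_num_frames(num_samples, kernels=CONV_KERNEL, strides=CONV_STRIDE):
    """Return the number of encoder frames for a waveform of num_samples samples."""
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        # Unpadded 1-D convolution: floor((L - kernel) / stride) + 1
        length = (length - kernel) // stride + 1
    return length


# One second of audio at an assumed 16 kHz sample rate.
one_second = hubert_num_frames(16000)
print(one_second)
```

Under these assumptions the strides multiply to 320 samples, so each frame covers a hop of about 20 ms and one second of audio yields 49 frames.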