Model Description

Whisper-tiny fine-tuned on the SwissDial-ZH dataset for automatic speech recognition of Swiss German dialects.

Model Details

Training

  • Duration: 4 hours
  • Hardware: NVIDIA RTX 3080
  • Batch Size: 32
  • Train/Test Split: 90%/10% (specific sentence selection)

Performance

  • Word error rate (WER): ~37% on the held-out 10% test split
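For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition; the function name `word_error_rate` is hypothetical, and this is not necessarily the script or text normalization used to obtain the figure above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With placeholder words, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words. A corpus-level score is usually computed by summing errors and reference words over all utterances rather than averaging per-utterance rates; the card does not say which was used here.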

Usage

from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_name = "nizarmichaud/whisper-tiny-swiss-german"
model = WhisperForConditionalGeneration.from_pretrained(model_name)
processor = WhisperProcessor.from_pretrained(model_name)

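# Replace the placeholder with a 1-D float waveform sampled at 16 kHz
audio_input = ...  # Your audio input here
inputs = processor(audio_input, return_tensors="pt", sampling_rate=16000)

# Generate token IDs, then decode them to text
generated_ids = model.generate(inputs["input_features"])
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)

# batch_decode returns one string per input
print(transcription[0])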
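The usage snippet leaves `audio_input` open. As one possible way to fill it, the sketch below reads a WAV file with only the Python standard library and returns a mono waveform as floats. It assumes the file is already 16-bit PCM at 16 kHz and does no resampling; the helper name `load_wav_16k_mono` is hypothetical and not part of this model or of `transformers`.

```python
import array
import sys
import wave


def load_wav_16k_mono(path: str) -> list[float]:
    """Read a 16-bit PCM WAV file sampled at 16 kHz; return mono floats in [-1, 1)."""
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            raise ValueError(
                f"expected 16000 Hz, got {wav.getframerate()} Hz; resample first"
            )
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    # Signed 16-bit integers, with channels interleaved frame by frame.
    samples = array.array("h", raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Average the channels of each frame into one mono sample, then scale.
    return [
        sum(samples[i:i + channels]) / channels / 32768.0
        for i in range(0, len(samples), channels)
    ]
```

The result can then be passed as `audio_input`, for example `audio_input = load_wav_16k_mono("sample.wav")`, where `sample.wav` is a placeholder path. Recordings at another sample rate would need resampling with an audio library first.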

License: MIT

Model Weights

  • Format: Safetensors
  • Model size: 37.8M params
  • Tensor type: F32