Fine-Tuned Parakeet RNNT 0.6B (Urdu)

This repository contains a fine-tuned version of the Parakeet RNNT 0.6B model for Urdu Automatic Speech Recognition (ASR). The base model, developed jointly by NVIDIA NeMo and Suno.ai, was fine-tuned on the Urdu subset of Mozilla's Common Voice 12.0 to adapt it to Urdu speech-to-text.


Model Overview

Parakeet RNNT is an XL version of the FastConformer Transducer architecture with about 600 million parameters. The fine-tuned model supports Urdu transcription, enabling applications such as subtitling, speech analytics, and voice-assisted interfaces.

Base model details can be found on 🤗 Hugging Face.


Training Details

Dataset

Fine-tuning used the Urdu subset of Mozilla's Common Voice 12.0, a crowd-sourced corpus of read speech recorded by volunteer speakers.

Hardware

  • Google Colab Pro
  • NVIDIA A100 GPU

Results

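The model achieved a Word Error Rate (WER) of 25.513% on the test split of the Common Voice Urdu dataset. Although this figure leaves room for improvement, many transcriptions are exact or close to the reference. The examples below show one exact match and one output whose errors are limited to word spacing and spelling: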

  • Reference: کچھ بھی ہو سکتا ہے۔
    Predicted: کچھ بھی ہو سکتا ہے۔

  • Reference: اورکوئی جمہوریت کو کوس رہا ہے۔
    Predicted: اور کوئ جمہوریت کو کو س رہا ہے۔

This WER is slightly higher than the 23% reported for OpenAI's Whisper model without fine-tuning (reference), but it suggests that Parakeet RNNT could close the gap with further fine-tuning.
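For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between prediction and reference, divided by the number of reference words. The following is a minimal sketch of that calculation, not the evaluation script behind the figure reported above. The function name `word_error_rate` is hypothetical, and the sketch assumes whitespace tokenization with no text normalization, so punctuation attached to a word counts as part of that word.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# The first example pair above is an exact match, so its WER is zero.
exact = word_error_rate("کچھ بھی ہو سکتا ہے۔", "کچھ بھی ہو سکتا ہے۔")
print(exact)  # prints 0.0
```

Under this definition a spacing error is penalized heavily: splitting one reference word into two predicted words typically costs one substitution plus one insertion, which is one reason outputs that look nearly correct can still contribute noticeably to the overall WER.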


How to Use this Model

Loading the Model

You can load the fine-tuned model using NVIDIA NeMo:

import nemo.collections.asr as nemo_asr

# Download the fine-tuned checkpoint from the Hugging Face Hub and restore it.
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="hash2004/parakeet-fine-tuned-urdu")
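Once the model is loaded, audio files can be transcribed with the model's `transcribe` method. The sketch below is one way to wrap that call and is not taken from the original repository. The helper names `to_text`, `transcribe_files`, and `main` and the file path `sample_urdu.wav` are hypothetical. It also assumes that the return type of `transcribe` depends on the installed NeMo version (plain strings, objects with a `.text` attribute, or a tuple whose first element holds the best hypotheses) and that input audio is 16 kHz mono WAV, as is usual for Parakeet models.

```python
def to_text(item):
    """Return plain text for a single transcription result.

    Assumption: depending on the NeMo version, each result is either a
    string or a hypothesis object exposing a `.text` attribute.
    """
    return item if isinstance(item, str) else item.text


def transcribe_files(asr_model, audio_paths):
    """Transcribe a list of audio files and return a list of strings."""
    results = asr_model.transcribe(audio_paths)
    # Assumption: some older NeMo releases return a tuple for RNNT models,
    # with the best hypotheses in the first position.
    if isinstance(results, tuple):
        results = results[0]
    return [to_text(r) for r in results]


def main():
    # Imported here so the helpers above can be used without NeMo installed.
    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(
        model_name="hash2004/parakeet-fine-tuned-urdu"
    )
    # "sample_urdu.wav" is a placeholder; replace it with a real recording.
    for text in transcribe_files(asr_model, ["sample_urdu.wav"]):
        print(text)
```

Calling `main()` downloads the checkpoint on first use and prints one transcription per input file.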

How to Fine-Tune this Model

All resources for fine-tuning the Parakeet RNNT (0.6B) model are available in this GitHub repository.
