## Model description
This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the Indonesian-English portion of the CoVoST2 dataset.
## Intended uses & limitations
This model translates Indonesian transcriptions into English.
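For a quick test without the CTranslate2 setup described below, the model can also be loaded with the `transformers` translation pipeline. This is a minimal sketch (assuming `transformers` and `torch` are installed; the sample sentence is just an illustration):

```python
from transformers import pipeline

# src_lang / tgt_lang use the FLORES-200 codes that NLLB expects.
translator = pipeline(
    "translation",
    model="cobrayyxx/nllb-indo-en-covost2",
    src_lang="ind_Latn",
    tgt_lang="eng_Latn",
)

print(translator("Selamat pagi, apa kabar?")[0]["translation_text"])
```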
## How to Use
This is how to use the model with CTranslate2, for example to translate Faster-Whisper transcriptions.
Convert the model into the CTranslate2 format with float16 quantization:

```bash
ct2-transformers-converter --model cobrayyxx/nllb-indo-en-covost2 \
    --quantization float16 \
    --output_dir ct2/ct2-nllb-indo-en-float16
```
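The same conversion can also be done from Python; a minimal sketch using CTranslate2's converter API:

```python
from ctranslate2.converters import TransformersConverter

# Convert the Hugging Face checkpoint to the CTranslate2 format
# with float16 weights (equivalent to the CLI command above).
converter = TransformersConverter("cobrayyxx/nllb-indo-en-covost2")
converter.convert("ct2/ct2-nllb-indo-en-float16", quantization="float16")
```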
Load the converted model using the `ctranslate2` library:

```python
import os

import ctranslate2

ct2_model_name = "ct2-nllb-indo-en-float16"
ct_model_path = os.path.join("ct2", ct2_model_name)

# Use "cuda" when a GPU is available, otherwise "cpu".
device = "cuda"
translator = ctranslate2.Translator(ct_model_path, device=device)
```
Download the SentencePiece model:

```bash
wget https://s3.amazonaws.com/opennmt-models/nllb-200/flores200_sacrebleu_tokenizer_spm.model
```
Load the SentencePiece model:

```python
import os

import sentencepiece as spm

# Directory where the tokenizer was downloaded in the previous step.
directory = "."
sp_model_path = os.path.join(directory, "flores200_sacrebleu_tokenizer_spm.model")
sp = spm.SentencePieceProcessor()
sp.load(sp_model_path)
```
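A quick round-trip (with a hypothetical sample sentence) confirms the tokenizer loaded correctly:

```python
# Tokenize into subword pieces, then reconstruct the original text.
pieces = sp.encode_as_pieces("Selamat pagi, apa kabar?")
print(pieces)                    # e.g. ['▁Selamat', '▁pagi', ',', ...]
print(sp.decode_pieces(pieces))  # -> "Selamat pagi, apa kabar?"
```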
Now, the loaded model can be used:

```python
src_lang = "ind_Latn"
tgt_lang = "eng_Latn"
beam_size = 5

# lst_of_sentences is a list of Indonesian source sentences.
source_sentences = lst_of_sentences
source_sents = [sent.strip() for sent in source_sentences]
target_prefix = [[tgt_lang]] * len(source_sents)

# Chunk the source sentences into subwords.
source_sents_subworded = sp.encode_as_pieces(source_sents)
source_sents_subworded = [
    [src_lang] + sent + ["</s>"] for sent in source_sents_subworded
]

# Translate the source sentences.
translations = translator.translate_batch(
    source_sents_subworded,
    batch_type="tokens",
    max_batch_size=2024,
    beam_size=beam_size,
    target_prefix=target_prefix,
)
translations = [translation.hypotheses[0] for translation in translations]

# Merge the subwords in the target sentences and strip the language tag.
translations_desubword = sp.decode(translations)
translations_desubword = [sent[len(tgt_lang):].strip() for sent in translations_desubword]
```
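For example, continuing from the snippet above with a single (hypothetical) input sentence:

```python
lst_of_sentences = ["Saya sedang belajar pemrosesan bahasa alami."]
# ... run the translation snippet above ...
print(translations_desubword)  # e.g. ["I am studying natural language processing."]
```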
Note: if you face a kernel error every time you run the code above, you have to install `nvidia-cublas` and `nvidia-cudnn`:

```bash
apt update
apt install libcudnn9-cuda-12
```

and install the libraries using pip (read the documentation for more details):

```bash
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12==9.*
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
```
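After installing, a quick check (a sketch, assuming a CUDA build of CTranslate2) confirms the GPU libraries are visible:

```python
import ctranslate2

# Prints the number of CUDA devices CTranslate2 can see;
# a value >= 1 means the GPU setup is working.
print(ctranslate2.get_cuda_device_count())
```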
Big shout-out to Yasmin Moslem for solving this issue.
## Training procedure

### Training Results
| Epoch | Training Loss | Validation Loss | BLEU |
|---|---|---|---|
| 1 | 0.119100 | 0.048539 | 60.267190 |
| 2 | 0.020900 | 0.044844 | 59.821654 |
| 3 | 0.014600 | 0.048637 | 60.185481 |
| 4 | 0.007200 | 0.052005 | 60.150045 |
| 5 | 0.005100 | 0.054909 | 59.938441 |
| 6 | 0.004500 | 0.056668 | 60.032409 |
| 7 | 0.003800 | 0.058903 | 60.176242 |
| 8 | 0.002900 | 0.059880 | 60.168394 |
| 9 | 0.002400 | 0.060914 | 60.280851 |
## Model Evaluation

The performance of the baseline and fine-tuned models was evaluated using the BLEU and chrF++ metrics on the validation dataset. The fine-tuned model shows some improvement over the baseline; a sketch of computing both metrics follows the list below.

### Evaluation details

- BLEU: measures the overlap between predicted and reference text based on word n-grams.
- chrF++: uses character n-grams (plus word unigrams and bigrams) for evaluation, making it particularly suitable for morphologically rich languages.
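A minimal sketch of computing both metrics with the `sacrebleu` library (the hypothesis and reference sentences are placeholders):

```python
import sacrebleu

# System outputs, and one list of reference sentences per reference set.
hypotheses = ["I am studying natural language processing."]
references = [["I am learning natural language processing."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # word_order=2 gives chrF++
print(f"BLEU: {bleu.score:.2f}, chrF++: {chrf.score:.2f}")
```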
## Credits
Huge thanks to Yasmin Moslem for mentoring me.