---
library_name: transformers
license: mit
datasets:
- bilalfaye/english-wolof-french-translation
- bilalfaye/english-wolof-french-translation-bis
language:
- fr
- wo
metrics:
- bleu
- chrf
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---

## Model Description

This model is a fine-tuned version of **nllb-200-distilled-600M**, adapted for French-Wolof and Wolof-French translation. It was trained on the **bilalfaye/english-wolof-french-translation** and **bilalfaye/english-wolof-french-translation-bis** datasets, which underwent significant preprocessing to improve translation quality.

The model supports bidirectional translation:
- **Wolof to French**
- **French to Wolof**

### How to Use

**1. Manual Inference**

Install the required libraries:

```bash
pip install transformers torch sentencepiece
```

Python code for translation:

```python
from transformers import NllbTokenizer, AutoModelForSeq2SeqLM
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model_load_name = 'bilalfaye/nllb-200-distilled-600M-wolof-french'

# Load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_load_name).to(device)
tokenizer = NllbTokenizer.from_pretrained(model_load_name)


def translate(
    text,
    src_lang='wol_Latn',
    tgt_lang='fra_Latn',
    a=32,
    b=3,
    max_input_length=1024,
    num_beams=4,
    **kwargs
):
    """Turn a text or a list of texts into a list of translations."""
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang

    inputs = tokenizer(
        text,
        return_tensors='pt',
        padding=True,
        truncation=True,
        max_length=max_input_length
    )
    model.eval()

    result = model.generate(
        **inputs.to(model.device),
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        num_beams=num_beams,
        **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)


# Example usage (NLLB language codes: Wolof = wol_Latn, French = fra_Latn)
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0])
```

**2. Inference with the Pipeline**

Install the required libraries:

```bash
pip install transformers torch sentencepiece
```

Python code using the pipeline:

```python
import torch
from transformers import pipeline

model_name = 'bilalfaye/nllb-200-distilled-600M-wolof-french'
device = "cuda" if torch.cuda.is_available() else "cpu"

translator = pipeline("translation", model=model_name, device=device)

print(translator("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0]['translation_text'])
print(translator("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0]['translation_text'])
```

---

## Package Versions

This model was developed and tested with the following package versions:

- **transformers**: 4.41.2
- **torch**: 2.4.0+cu121
- **datasets**: 3.2.0
- **sentencepiece**: 0.2.0
- **sacrebleu**: 2.5.1

---

## Author

**Bilal Faye**

Feel free to reach out with questions or suggestions for improvement!
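
---

## Evaluation Sketch

BLEU and chrF are the metrics listed in this card's metadata. Below is a minimal sketch of how translations produced by the `translate` function above could be scored with `sacrebleu` (version 2.5.1, listed above). The hypothesis and reference strings are illustrative placeholders, not outputs or results from this model.

```python
import sacrebleu

# Placeholder model outputs and reference translations (illustrative only).
hypotheses = ["Bonjour, où allez-vous ?", "Merci beaucoup."]
# sacrebleu expects a list of reference streams: references[0][i] pairs with hypotheses[i].
references = [["Bonjour, où allez-vous ?", "Merci beaucoup."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

print(f"BLEU: {bleu.score:.2f}")
print(f"chrF: {chrf.score:.2f}")
```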