---
library_name: transformers
license: mit
datasets:
- bilalfaye/english-wolof-french-translation
- bilalfaye/english-wolof-french-translation-bis
language:
- fr
- wo
metrics:
- bleu
- chrf
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---

## Model Description

This model is a fine-tuned version of **nllb-200-distilled-600M**, adapted for French-Wolof and Wolof-French translation. It was trained on the **bilalfaye/english-wolof-french-translation** and **bilalfaye/english-wolof-french-translation-bis** datasets, which underwent significant preprocessing to improve translation quality.

The model supports bidirectional translation:
- **Wolof to French**
- **French to Wolof**

### How to Use

**1. Manual Inference**

Install the required libraries:

```bash
pip install transformers torch sentencepiece
```

Python code for translation:

```python
from transformers import NllbTokenizer, AutoModelForSeq2SeqLM
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model_load_name = 'bilalfaye/nllb-200-distilled-600M-wolof-french'

# Load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_load_name).to(device)
tokenizer = NllbTokenizer.from_pretrained(model_load_name)


def translate(
    text,
    src_lang='wol_Latn',
    tgt_lang='fra_Latn',
    a=32,
    b=3,
    max_input_length=1024,
    num_beams=4,
    **kwargs
):
    """Turn a text or a list of texts into a list of translations."""
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang

    inputs = tokenizer(
        text,
        return_tensors='pt',
        padding=True,
        truncation=True,
        max_length=max_input_length
    )
    model.eval()

    result = model.generate(
        **inputs.to(model.device),
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        num_beams=num_beams,
        **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)


# Example usage (NLLB language codes: Wolof = wol_Latn, French = fra_Latn)
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0])
```

**2. Inference with the Pipeline**

Install the required libraries:

```bash
pip install transformers torch sentencepiece
```

Python code using the pipeline:

```python
import torch
from transformers import pipeline

model_name = 'bilalfaye/nllb-200-distilled-600M-wolof-french'
device = "cuda" if torch.cuda.is_available() else "cpu"

translator = pipeline("translation", model=model_name, device=device)

print(translator("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0]['translation_text'])
print(translator("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0]['translation_text'])
```

---

## Package Versions

This model was developed and tested with the following package versions:

- **transformers**: 4.41.2
- **torch**: 2.4.0+cu121
- **datasets**: 3.2.0
- **sentencepiece**: 0.2.0
- **sacrebleu**: 2.5.1

---

## Author

**Bilal Faye**

Feel free to reach out with questions or suggestions for improvement!
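
---

## Evaluation Sketch

BLEU and chrF are the metrics listed in this card's metadata. Below is a minimal sketch of how translations produced by the `translate` function above could be scored with `sacrebleu` (version 2.5.1, listed above). The hypothesis and reference strings are illustrative placeholders, not outputs or results from this model.

```python
import sacrebleu

# Placeholder model outputs and reference translations (illustrative only).
hypotheses = ["Bonjour, où allez-vous ?", "Merci beaucoup."]
# sacrebleu expects a list of reference streams: references[0][i] pairs with hypotheses[i].
references = [["Bonjour, où allez-vous ?", "Merci beaucoup."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

print(f"BLEU: {bleu.score:.2f}")
print(f"chrF: {chrf.score:.2f}")
```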