---
license: mit
tags:
- unsloth
- trl
- sft
datasets:
- olafdil/French_MultiSpeaker_Diarization
language:
- fr
base_model:
- unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
pipeline_tag: text-generation
---
# Fine-Tuned Model: Meta-Llama-3.1-8B-Instruct-bnb-4bit

This is a fine-tuned version of the [Meta-Llama-3.1-8B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit) model, adapted for French multi-speaker diarization tasks. Below you'll find details about the fine-tuning process, the dataset, and how to use the model.

---
## Model Details

- **Base Model**: Meta-Llama-3.1-8B-Instruct-bnb-4bit
- **Quantization**: 4-bit quantization for reduced memory usage
- **Purpose**: Fine-tuned for multi-speaker diarization in French
- **Techniques** (see the sketch below):
  - LoRA (Low-Rank Adaptation) for efficient fine-tuning
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
  - Rank: `16`
  - LoRA alpha: `16`
  - Gradient checkpointing: enabled
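The exact training script is not published with this card, but with Unsloth the configuration above corresponds roughly to the following sketch (the values mirror the lists in this card; anything beyond them is an assumption):

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model; max_seq_length mirrors the value listed
# under "Training Configuration" below.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length=120000,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projections listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)
```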

---

## Dataset

The model was fine-tuned on the `French_MultiSpeaker_Diarization` dataset, hosted on the Hugging Face Hub:

- **Dataset Name**: [French_MultiSpeaker_Diarization](https://huggingface.co/datasets/olafdil/French_MultiSpeaker_Diarization)
- **Split Used**: train
- **Dataset Content** (a loading snippet follows the list):
  - Multi-speaker conversational data in French.
  - Speaker-labeled turns that serve as supervision for the diarization task.
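For reference, the training split can be loaded directly from the Hub with the `datasets` library:

```python
from datasets import load_dataset

# Pull the training split used for fine-tuning.
dataset = load_dataset("olafdil/French_MultiSpeaker_Diarization", split="train")
print(dataset)  # inspect columns and example count
```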

---

## Training Configuration

### Hyperparameters

- **Max Sequence Length**: `120,000`
- **LoRA Dropout**: `0`
- **Bias**: `none`
- **Gradient Checkpointing**: enabled for memory efficiency
- **Prompt Formatting**: chat templates (e.g., the `llama-3.1` template) applied to format prompts, as sketched below
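As a sketch of that formatting step, Unsloth's `llama-3.1` chat template can be attached to the tokenizer like this (continuing from the loading sketch above; the message contents are placeholders, not real dataset rows):

```python
from unsloth.chat_templates import get_chat_template

# Attach the llama-3.1 chat template (tokenizer loaded as in the sketch above).
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

# Format one example; the contents below are placeholders.
messages = [
    {"role": "user", "content": "Diarize this French transcription: ..."},
    {"role": "assistant", "content": "Speaker 1: ...\nSpeaker 2: ..."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False)
```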
### Training Workflow

1. **Model Loading**:
   - Loaded the base model with `FastLanguageModel.from_pretrained()`.
   - Applied 4-bit quantization for memory efficiency.

2. **Dataset Preparation**:
   - Tokenized the dataset using a chat template from `unsloth.chat_templates`.
   - Formatted prompts with `apply_chat_template()` to suit the diarization task.

3. **Fine-Tuning**:
   - Applied LoRA to the projection layers listed above.
   - Enabled gradient checkpointing to reduce memory overhead during training (a training-loop sketch follows).
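Putting the steps together, a run of this kind typically uses TRL's `SFTTrainer`, as in the standard Unsloth notebooks. The sketch below reuses `model`, `tokenizer`, and `dataset` from the earlier snippets; the batch size, learning rate, and step count are illustrative assumptions, not the values used for this model, and newer TRL versions move some of these arguments into `SFTConfig`:

```python
from trl import SFTTrainer
from transformers import TrainingArguments

# Illustrative SFT setup; model, tokenizer, and dataset come from the sketches above.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes formatted prompts live in a "text" column
    max_seq_length=120000,
    args=TrainingArguments(
        per_device_train_batch_size=1,  # illustrative
        gradient_accumulation_steps=4,  # illustrative
        learning_rate=2e-4,             # illustrative
        max_steps=100,                  # illustrative
        output_dir="outputs",
    ),
)
trainer.train()
```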

---

## Usage

### Load the Model

You can load this model directly from Hugging Face:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "olafdil/FrDiarization-Llama-3.1-8B-4bit"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
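Since the model was trained with Unsloth, it can alternatively be loaded through `FastLanguageModel`, which enables Unsloth's optimized inference path (a sketch; it assumes the repo contains weights Unsloth can load directly):

```python
from unsloth import FastLanguageModel

# Alternative: load via Unsloth for its faster inference kernels.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="olafdil/FrDiarization-Llama-3.1-8B-4bit",
    max_seq_length=120000,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch the model into inference mode
```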
### Inference Example

```python
# Instruction prepended to the raw transcription.
template = """
I have an audio transcription where multiple speakers are involved in a conversation.
Your task is to distinguish the different speakers and diarize the text accordingly.
Each speaker's dialogue should be clearly labeled, such as 'Speaker 1:', 'Speaker 2:', etc.
Ensure that the labels remain consistent throughout the transcription and that the text is formatted neatly.
Here's the transcription:
"""

transcription = "Your input transcription here"
prompt = template + transcription

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)  # cap output length; adjust as needed
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Dependencies

The following libraries were used:

- `transformers`
- `datasets`
- `unsloth`
- `torch`

To install the dependencies, you can use:

```bash
pip install transformers datasets torch unsloth
```
## Limitations

- The model has been fine-tuned specifically for French multi-speaker diarization and may not generalize well to other tasks or languages.
- 4-bit quantization reduces memory usage but may slightly affect precision.

---
## Citation

If you use this model, please consider citing the base model and the dataset:

- **Base Model**: [Meta-Llama-3.1-8B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit)
- **Dataset**: [French_MultiSpeaker_Diarization](https://huggingface.co/datasets/olafdil/French_MultiSpeaker_Diarization)