# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling
## Model Description
This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for the masked language modeling task. It has been trained on the IMDb dataset.
## Model Training Details
### Training Dataset
- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face
- **Dataset Split:**
- Train: 25,000 samples
- Test: 25,000 samples
- Unsupervised: 50,000 samples
- **Training and Unsupervised Data Concatenation:** Training was performed on the concatenation of the `train` and `unsupervised` splits (75,000 reviews in total); see the sketch below.
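A minimal sketch of this concatenation with the `datasets` library (variable names are illustrative, not taken from the original training script):

```python
from datasets import load_dataset, concatenate_datasets

imdb = load_dataset("imdb")

# Labels are irrelevant for masked language modeling, so the 25,000-review
# "train" split and the 50,000-review "unsupervised" split are combined
# into a single 75,000-review corpus.
mlm_corpus = concatenate_datasets([imdb["train"], imdb["unsupervised"]])
```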
### Training Arguments
The following parameters were used during fine-tuning (an equivalent `TrainingArguments` sketch follows the list):
- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
- **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
- **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
- **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1,000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`
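Expressed as a `transformers.TrainingArguments` object, these settings correspond roughly to the sketch below; the `output_dir` value is a placeholder, and older `transformers` releases use `evaluation_strategy` rather than `eval_strategy`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # placeholder output path
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```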
### Early Stopping
- The model was configured with **early stopping** to prevent overfitting; a `Trainer` setup sketch follows this list.
- Training stopped after **5.87 epochs** (21,000 steps), as `eval_loss` showed no significant further improvement.
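A hedged sketch of how such a setup is typically wired together with `Trainer` and `EarlyStoppingCallback`; the patience value, masking probability, and tokenized dataset variables are assumptions, not values reported for this model:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Randomly masks 15% of tokens on the fly (the library default; the exact
# probability used for this model is not stated in the card).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,             # TrainingArguments from the sketch above
    train_dataset=tokenized_train,  # hypothetical tokenized train+unsupervised corpus
    eval_dataset=tokenized_eval,    # hypothetical held-out evaluation set
    data_collator=data_collator,
    # Stops training once eval_loss stops improving; a patience of 3
    # evaluations is an assumption.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```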
## Evaluation Results
- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34`
- **Best Checkpoint:** Model saved at the end of early stopping (step `21,000`).
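Perplexity for a masked language model is conventionally reported as the exponential of the evaluation loss, so the final value can be reproduced from `eval_loss` roughly as follows (assuming the `trainer` object from the sketch above):

```python
import math

eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")  # reported value for this model: 8.34
```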
## Model Usage
The model can be used for masked language modeling tasks using the `fill-mask` pipeline from Hugging Face. Example:
```python
from transformers import pipeline
mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")
text = "This is a great [MASK]."
predictions = mask_filler(text)
for pred in predictions:
    print(f">>> {pred['sequence']}")
```
**Output Example:**
```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```