# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling

## Model Description

This model is a version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) fine-tuned for masked language modeling on the IMDb movie review dataset.


## Model Training Details

### Training Dataset

- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face
- **Dataset Split:**
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- **Training Data:** The `train` and `unsupervised` splits were concatenated into a single training set of 75,000 samples (see the loading sketch below).
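
A minimal sketch of the split concatenation using the Hugging Face `datasets` library (split names follow the IMDB dataset card; variable names are illustrative):

```python
from datasets import load_dataset, concatenate_datasets

# Load the IMDB dataset from the Hugging Face Hub
imdb = load_dataset("imdb")

# Merge the 25,000 labeled training reviews with the 50,000 unlabeled
# reviews; the labels are not needed for masked language modeling.
train_data = concatenate_datasets([imdb["train"], imdb["unsupervised"]])
print(len(train_data))  # 75000
```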

### Training Arguments

The following parameters were used during fine-tuning (a corresponding `TrainingArguments` sketch follows the list):

- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
  - **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
  - **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
  - **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1,000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`
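
These hyperparameters map onto a `TrainingArguments` object roughly as sketched below; the output directory is a placeholder, and the keyword names assume a Transformers release that still accepts `evaluation_strategy`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # placeholder path
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```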

### Early Stopping

- The model was configured with **early stopping** to prevent overfitting (a wiring sketch follows this list).
- Training stopped after **5.87 epochs** (21,000 steps), as `eval_loss` showed no further significant improvement.
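
A minimal sketch of wiring early stopping into the `Trainer`. The patience value, the 15% masking probability, the truncation length, and the use of the `test` split for evaluation are illustrative assumptions not stated in this card; `imdb`, `train_data`, and `training_args` come from the sketches above:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

# Base checkpoint named in the model description
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Tokenize the combined train + unsupervised data and the evaluation split
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized_train = train_data.map(tokenize, batched=True, remove_columns=["text", "label"])
tokenized_eval = imdb["test"].map(tokenize, batched=True, remove_columns=["text", "label"])

# Standard MLM collator; the 15% masking probability is an assumption
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,            # TrainingArguments sketched above
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    data_collator=data_collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is illustrative
)
trainer.train()
```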

## Evaluation Results

- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34` (perplexity is the exponential of `eval_loss`; see the note below)
- **Best Checkpoint:** The checkpoint saved when early stopping triggered (step `21,000`).
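
For reference, masked-language-model perplexity is typically computed as `exp(eval_loss)`; the loss value below is back-calculated from the reported perplexity and is approximate:

```python
import math

eval_loss = 2.121  # approximate, back-calculated from the reported perplexity
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # ≈ 8.34
```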

## Model Usage

The model can be used for masked language modeling with the `fill-mask` pipeline from Hugging Face Transformers. Example:

```python
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

for pred in predictions:
    print(f">>> {pred['sequence']}")
```

**Output Example:**

```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```