Prikshit7766 committed · verified
Commit e2f45ec · 1 Parent(s): aaf8ded

Upload README.md with huggingface_hub

Files changed (1): README.md (+77 -0)
README.md ADDED
 
# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling

## Model Description

This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for the masked language modeling task. It has been trained on the IMDb dataset.

## Model Training Details

### Training Dataset

- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face
- **Dataset Splits:**
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- **Training and Unsupervised Data Concatenation:** Training was performed on a combined dataset of the train and unsupervised splits (see the sketch below).

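The combined training corpus described above can be assembled with the `datasets` library. A minimal sketch, assuming the standard IMDb split names; the tokenization and chunking steps that usually precede masked language modeling training are omitted here:

```python
from datasets import load_dataset, concatenate_datasets

# Load the IMDb dataset: 25,000 train / 25,000 test / 50,000 unsupervised samples.
imdb = load_dataset("imdb")

# Combine the train and unsupervised splits into a single training corpus;
# the test split is kept for evaluation.
train_corpus = concatenate_datasets([imdb["train"], imdb["unsupervised"]])

print(len(train_corpus))  # 75,000 examples
```
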
### Training Arguments

The following parameters were used during fine-tuning (see the `TrainingArguments` sketch after this list):

- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
- **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
- **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
  - **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1,000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`

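Expressed as a `transformers.TrainingArguments` object, the settings above look roughly like the sketch below. The output directory name is an assumption, and `evaluation_strategy` is spelled `eval_strategy` in recent `transformers` releases:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # assumed name, not taken from the training run
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```
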
### Early Stopping

- The model was configured with **early stopping** to prevent overfitting (see the sketch below).
- Training stopped after **5.87 epochs** (21,000 steps), as there was no significant improvement in `eval_loss`.

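Early stopping is typically wired into the `Trainer` through `transformers.EarlyStoppingCallback`. A minimal sketch, assuming the model, tokenized datasets, and `training_args` from the sketches above; the patience value is an assumption rather than a documented setting of this run:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Randomly masks tokens for the MLM objective; 15% is the library default,
# assumed here rather than taken from the original training script.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,            # from the TrainingArguments sketch above
    train_dataset=train_dataset,   # tokenized train + unsupervised corpus (assumed)
    eval_dataset=eval_dataset,     # tokenized test split (assumed)
    data_collator=data_collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is an assumption
)
trainer.train()
```
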
## Evaluation Results

- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34` (see the conversion sketch below)
- **Best Checkpoint:** Model saved at the end of early stopping (step `21,000`).

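Perplexity is the exponential of the evaluation cross-entropy loss, so a perplexity of 8.34 corresponds to an `eval_loss` of about 2.12. A minimal sketch of the conversion:

```python
import math

eval_loss = 2.121          # ≈ ln(8.34); illustrative value implied by the reported perplexity
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # ≈ 8.34
```
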
## Model Usage

The model can be used for masked language modeling tasks with the `fill-mask` pipeline from Hugging Face. Example:

```python
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

for pred in predictions:
    print(f">>> {pred['sequence']}")
```

**Output Example:**

```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```

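If you prefer not to use the pipeline, the same predictions can be obtained by loading the model directly with the standard Auto classes. A minimal sketch:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "Prikshit7766/distilbert-finetuned-imdb-mlm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("This is a great [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the top-5 token predictions for it.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_token_ids = logits[0, mask_positions[0]].topk(5).indices
for token_id in top_token_ids:
    print(tokenizer.decode(token_id))
```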