---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
datasets:
- imdb
model-index:
- name: distilbert-base-uncased-finetuned-imdb-mlm-accelerate
results: []
metrics:
- perplexity
pipeline_tag: fill-mask
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# distilbert-base-uncased-finetuned-imdb-mlm-accelerate
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the imdb dataset.
It achieves the following results on the evaluation set:
- Perplexity: 11.0482
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
The model was trained with a custom 🤗 Accelerate training loop rather than the `Trainer` API.
### Training hyperparameters
The following hyperparameters were used during training:
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- learning_rate: 5e-05
- optimizer: AdamW
- lr_scheduler_type: linear
- num_epochs: 3.0
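With a linear scheduler and no warmup (warmup is not listed above, so none is assumed), the learning rate decays from 5e-05 to 0 over the run, which matches the `train/learning_rate` of 0.0 in the run summary below. A minimal sketch of that schedule, assuming the 471 total optimizer steps reported by the run:

```python
def linear_lr(step: int, base_lr: float = 5e-05, total_steps: int = 471) -> float:
    """Linearly decay the learning rate from base_lr to 0 (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps
```

At step 0 this returns the base rate, and at step 471 it returns exactly 0.0.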
### Training results
| Training Loss | Epoch | Validation Loss | Perplexity |
|:-------------:|:-----:|:---------------:|:----------:|
| 2.6575        | 1.0   | 2.4625          | 11.7338    |
| 2.5095        | 2.0   | 2.4212          | 11.2593    |
| 2.4733        | 3.0   | 2.4023          | 11.0482    |
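The perplexity column is simply the exponential of the validation cross-entropy loss, which is how masked-LM perplexity is conventionally computed:

```python
import math

def perplexity(eval_loss: float) -> float:
    """Masked-LM perplexity = exp(mean cross-entropy loss)."""
    return math.exp(eval_loss)

# Final-epoch validation loss from the table above:
final_ppl = perplexity(2.4022672176361084)  # ≈ 11.0482
```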
Run summary (from Weights & Biases):
- Perplexity: 11.0482
- eval/loss: 2.41189
- eval/runtime: 1.923 s
- eval/samples_per_second: 520.03
- eval/steps_per_second: 8.32
- train/epoch: 3.0
- train/global_step: 471
- train/learning_rate: 0.0
- train/loss: 2.5354
- train/total_flos: 994208670720000.0
- train/train_loss: 2.60498
- train/train_runtime: 159.5259 s
- train/train_samples_per_second: 188.057
- train/train_steps_per_second: 2.952

View run [classic-pond-2](https://wandb.ai/tchoud8/distilbert-base-uncased-finetuned-imdb-accelerate/runs/a7hw7i1u).
### Framework versions
- Transformers 4.32.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3