Rami committed
Commit 3f70dbd · 1 Parent(s): e68929e

update model card README.md

Files changed (1):
  1. README.md +7 -83

README.md CHANGED
@@ -13,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [neuralmagic/oBERT-12-upstream-pruned-unstructured-97](https://huggingface.co/neuralmagic/oBERT-12-upstream-pruned-unstructured-97) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 0.1301
- - Micro f1: 0.5159
- - Macro f1: 0.0352
+ - Loss: 0.2718
+ - Micro f1: 0.3779
+ - Macro f1: 0.0172
 
 ## Model description
 
@@ -40,28 +40,16 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
- - num_epochs: 15
+ - num_epochs: 30
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Micro f1 | Macro f1 |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
- | No log | 1.0 | 25 | 0.3994 | 0.3783 | 0.0172 |
- | No log | 2.0 | 50 | 0.2846 | 0.3791 | 0.0172 |
- | No log | 3.0 | 75 | 0.2159 | 0.3791 | 0.0172 |
- | No log | 4.0 | 100 | 0.1802 | 0.3791 | 0.0172 |
- | No log | 5.0 | 125 | 0.1618 | 0.3791 | 0.0172 |
- | No log | 6.0 | 150 | 0.1515 | 0.3791 | 0.0172 |
- | No log | 7.0 | 175 | 0.1452 | 0.3791 | 0.0172 |
- | No log | 8.0 | 200 | 0.1411 | 0.3931 | 0.0202 |
- | No log | 9.0 | 225 | 0.1379 | 0.4413 | 0.0277 |
- | No log | 10.0 | 250 | 0.1350 | 0.4694 | 0.0309 |
- | No log | 11.0 | 275 | 0.1327 | 0.4993 | 0.0336 |
- | No log | 12.0 | 300 | 0.1309 | 0.5084 | 0.0344 |
- | No log | 13.0 | 325 | 0.1297 | 0.5147 | 0.0349 |
- | No log | 14.0 | 350 | 0.1291 | 0.5060 | 0.0343 |
- | No log | 15.0 | 375 | 0.1287 | 0.5107 | 0.0346 |
+ | No log | 1.0 | 49 | 0.2715 | 0.3791 | 0.0172 |
+ | No log | 2.0 | 98 | 0.1682 | 0.3791 | 0.0172 |
+ | No log | 3.0 | 147 | 0.1425 | 0.3791 | 0.0172 |
 
 
 ### Framework versions
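The card reports both a micro and a macro f1 (for example 0.3779 vs 0.0172 in the updated evaluation results above). A gap like this is typical of multi-label classification with many rare labels: micro f1 pools every individual label decision, while macro f1 averages per-label scores, so labels that are never predicted pull it toward zero. A minimal sketch of the difference, assuming scikit-learn is available; the toy arrays are illustrative and not taken from the training data:

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multi-label setup: 4 examples, 3 labels (columns), binary indicator arrays.
y_true = np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 0, 0],
                   [0, 0, 0]])

# Micro f1 pools every (example, label) decision before computing precision/recall,
# so frequent labels dominate the score.
micro = f1_score(y_true, y_pred, average="micro", zero_division=0)

# Macro f1 computes f1 per label and then averages, so the rare labels that are
# never predicted here drag the score down.
macro = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"micro f1 = {micro:.4f}, macro f1 = {macro:.4f}")
```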
@@ -70,67 +58,3 @@ The following hyperparameters were used during training:
 - Pytorch 1.13.0+cu116
 - Datasets 2.8.0
 - Tokenizers 0.13.2
- # Day 1
-
- 1. Tried the Neural Magic model "neuralmagic/oBERT-12-upstream-pruned-unstructured-97". The macro and micro f1 scores were much smaller at the
- beginning of training and barely improved over the first steps; however, by the same epoch it outperformed by a 0.159 difference in f1 score.
- 2. The more significant code change was adding error handling that moves the model to the CPU and frees GPU memory when training hits a CUDA out-of-memory error (a note on retrying after such an error follows this section):
- ```python
- import gc
- import torch
-
- # If training needs more memory than the GPU has, trainer.train() raises a CUDA
- # out-of-memory RuntimeError. Recovery steps:
- # 1. Check the amount of GPU memory used
- # 2. Move the model to the CPU
- # 3. Call the garbage collector
- # 4. Free the GPU memory held in the cache
- # 5. Check the amount of GPU memory again to confirm it was freed
- def check_gpu_memory():
-     used_gb = torch.cuda.memory_allocated() / 1e9
-     print(used_gb)
-     return used_gb
-
- try:
-     trainer.train()  # `trainer` and `model` are defined earlier in the notebook
- except RuntimeError as e:
-     if "CUDA out of memory" in str(e):
-         print("CUDA out of memory")
-         print("Let's free some GPU memory and re-allocate")
-         check_gpu_memory()
-         # Move the model to the CPU
-         model.to("cpu")
-         gc.collect()
-         # Free the GPU memory held in the cache
-         torch.cuda.empty_cache()
-         check_gpu_memory()
-     else:
-         raise e
- ```
- 3. Added a check for whether the CUDA GPU supports bfloat16 when configuring mixed-precision training (a construction-time alternative is sketched after this section):
- ```python
- import sys
- import torch
- from transformers import Trainer, TrainingArguments
-
- def is_on_colab():
-     # True when the notebook is running on Google Colab
-     if "google.colab" in sys.modules:
-         return True
-     return False
-
- training_args_fine_tune = TrainingArguments(
-     output_dir="./multi-label-class-classification-on-github-issues",
-     num_train_epochs=15,
-     learning_rate=3e-5,
-     per_device_train_batch_size=64,
-     evaluation_strategy="epoch",
-     save_strategy="epoch",
-     load_best_model_at_end=True,
-     metric_for_best_model="micro f1",
-     save_total_limit=1,
-     log_level="error",
-     push_to_hub=True if is_on_colab() else False,
- )
- if torch.cuda.is_available():
-     # Check whether the CUDA GPU supports bfloat16
-     if torch.cuda.is_bf16_supported():
-         print("Cuda GPU can support bfloat16")
-         training_args_fine_tune.bf16 = True
-     else:
-         print("Cuda GPU cannot support bfloat16 so instead we will use float16")
-         training_args_fine_tune.fp16 = True
- ```
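A caveat on the out-of-memory handling in item 2: emptying the CUDA cache only releases memory; the interrupted trainer.train() call does not resume by itself, and retrying with the same settings will usually hit the same error. The typical follow-up is to lower per_device_train_batch_size or raise gradient_accumulation_steps in TrainingArguments before starting training again.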
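The snippet in item 3 toggles fp16/bf16 on training_args_fine_tune after the object has been constructed, which works but bypasses the checks TrainingArguments runs at construction time. A minimal alternative sketch, assuming the same transformers and torch versions listed in the card; the helper name choose_precision is illustrative, not from the original code:

```python
import torch
from transformers import TrainingArguments

def choose_precision():
    """Return (fp16, bf16) flags based on what the current GPU supports."""
    if not torch.cuda.is_available():
        return False, False   # CPU-only: no mixed precision
    if torch.cuda.is_bf16_supported():
        return False, True    # Ampere and newer GPUs: prefer bfloat16
    return True, False        # Older GPUs: fall back to float16

fp16, bf16 = choose_precision()

training_args_fine_tune = TrainingArguments(
    output_dir="./multi-label-class-classification-on-github-issues",
    num_train_epochs=15,
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="micro f1",
    save_total_limit=1,
    log_level="error",
    fp16=fp16,
    bf16=bf16,
)
```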
 
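Since push_to_hub is enabled when running on Colab, the fine-tuned checkpoint is pushed to the Hub. A hedged sketch of loading such a checkpoint back for multi-label prediction follows; the repo id, the 0.5 threshold, and the multi-label assumption (suggested by the micro/macro f1 metrics and the output_dir name) are illustrative rather than confirmed by the card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed repo id: replace with the actual model repository on the Hub.
repo_id = "Rami/multi-label-class-classification-on-github-issues"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

text = "Add support for bfloat16 mixed precision training"
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label classification: an independent sigmoid per label, thresholded at 0.5.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```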