Rami committed on
Commit fb416a5 · 1 Parent(s): 6794b02

Update README.md

Files changed (1)
  1. README.md +64 -0
README.md CHANGED
@@ -70,3 +70,67 @@ The following hyperparameters were used during training:
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2
# Day 1

1. Tried the Neural Magic model "neuralmagic/oBERT-12-upstream-pruned-unstructured-97". Its macro and micro F1 scores were much lower at the beginning of training, and the initial steps did not improve them much. However, at the same epoch it did outperform, by a 0.159 difference in F1 score. A loading sketch follows below.
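A minimal sketch of how that checkpoint could be loaded for this multi-label task (the `num_labels` value and `problem_type` below are illustrative assumptions, not values taken from this repo):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "neuralmagic/oBERT-12-upstream-pruned-unstructured-97"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# problem_type switches the classification head to a BCE loss suited to
# multi-label classification; num_labels is a placeholder and should match
# the actual label set of the GitHub-issues dataset.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=20,
    problem_type="multi_label_classification",
)
```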
2. The more significant change was to the code: I added error handling so that, if training runs out of GPU memory, the model is moved to the CPU and the cached GPU memory is freed.
```python
import gc
import torch

# Wrap training in try/except: if the model uses more memory than the GPU has,
# a "CUDA out of memory" RuntimeError is raised. Recovery steps:
# 1. Check the amount of GPU memory used
# 2. Move the model to the CPU
# 3. Call the garbage collector
# 4. Free the GPU memory held in the cache
# 5. Check the amount of GPU memory used again to confirm it was freed
def check_gpu_memory():
    used_gb = torch.cuda.memory_allocated() / 1e9
    print(used_gb)
    return used_gb

try:
    trainer.train()
except RuntimeError as e:
    if "CUDA out of memory" in str(e):
        print("CUDA out of memory")
        print("Let's free some GPU memory and re-allocate")
        check_gpu_memory()
        # Move the model to the CPU
        model.to("cpu")
        gc.collect()
        # Free the cached GPU memory
        torch.cuda.empty_cache()
        check_gpu_memory()
    else:
        raise e
```
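After freeing memory, one possible follow-up (my own assumption, not something the notebook shows) is to move the model back to the GPU and retry training with a smaller batch size:

```python
from transformers import Trainer

# Hypothetical retry after the OOM handler above; the batch size of 32 is illustrative.
model.to("cuda")
trainer.args.per_device_train_batch_size = 32
retry_trainer = Trainer(
    model=model,
    args=trainer.args,
    train_dataset=trainer.train_dataset,
    eval_dataset=trainer.eval_dataset,
)
retry_trainer.train()
```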
3. Added checks for what the runtime can support: whether the notebook is running on Colab (to decide whether to push to the Hub) and whether the GPU supports bfloat16 (to pick the mixed-precision setting).
```python
import sys
import torch
from transformers import Trainer, TrainingArguments

def is_on_colab():
    # Detect whether this notebook is running on Google Colab
    return 'google.colab' in sys.modules

training_args_fine_tune = TrainingArguments(
    output_dir="./multi-label-class-classification-on-github-issues",
    num_train_epochs=15,
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model='micro f1',
    save_total_limit=1,
    log_level='error',
    push_to_hub=is_on_colab(),  # only push to the Hub when running on Colab
)

if torch.cuda.is_available():
    # Check whether the CUDA GPU supports bfloat16
    if torch.cuda.is_bf16_supported():
        print("Cuda GPU can support bfloat16")
        training_args_fine_tune.bf16 = True
    else:
        print("Cuda GPU cannot support bfloat16, so instead we will use float16")
        training_args_fine_tune.fp16 = True
```
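Setting `metric_for_best_model='micro f1'` assumes the `Trainer` is given a `compute_metrics` function that reports a metric under that name. A minimal sketch of such a function for multi-label outputs (the 0.5 sigmoid threshold is an assumption):

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Sigmoid + 0.5 threshold turns multi-label logits into binary predictions
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs > 0.5).astype(int)
    return {
        "micro f1": f1_score(labels, preds, average="micro"),
        "macro f1": f1_score(labels, preds, average="macro"),
    }
```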