---
tags:
  - generated_from_trainer
model-index:
  - name: multi-label-class-classification-on-github-issues
    results: []
---

# multi-label-class-classification-on-github-issues

This model is a fine-tuned version of [neuralmagic/oBERT-12-upstream-pruned-unstructured-97](https://huggingface.co/neuralmagic/oBERT-12-upstream-pruned-unstructured-97) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.1301
- Micro f1: 0.5159
- Macro f1: 0.0352
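
A minimal inference sketch for the fine-tuned checkpoint. The model directory is assumed to be the `output_dir` used during training (see the training arguments below), and the example issue title is made up for illustration:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed local path: the output_dir used during training.
model_dir = "./multi-label-class-classification-on-github-issues"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)

# Example issue title (made up for illustration).
inputs = tokenizer("Add dark mode support to the settings page", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label classification: sigmoid each logit and keep labels above 0.5.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p >= 0.5]
print(predicted)
```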

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
- mixed_precision_training: Native AMP
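
These settings are applied through the `Trainer`; for reference, a rough sketch of the equivalent hand-built optimizer and schedule (the listed "Adam" corresponds to the Trainer's default AdamW, and `model` is assumed to be defined):

```python
import torch
from transformers import get_linear_schedule_with_warmup

# AdamW with the betas/epsilon listed above.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=3e-5, betas=(0.9, 0.999), eps=1e-8
)
# Linear decay of the learning rate to zero over all training steps;
# 375 = 15 epochs x 25 steps per epoch, per the results table below.
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=375
)
```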

### Training results

| Training Loss | Epoch | Step | Validation Loss | Micro f1 | Macro f1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
| No log        | 1.0   | 25   | 0.3994          | 0.3783   | 0.0172   |
| No log        | 2.0   | 50   | 0.2846          | 0.3791   | 0.0172   |
| No log        | 3.0   | 75   | 0.2159          | 0.3791   | 0.0172   |
| No log        | 4.0   | 100  | 0.1802          | 0.3791   | 0.0172   |
| No log        | 5.0   | 125  | 0.1618          | 0.3791   | 0.0172   |
| No log        | 6.0   | 150  | 0.1515          | 0.3791   | 0.0172   |
| No log        | 7.0   | 175  | 0.1452          | 0.3791   | 0.0172   |
| No log        | 8.0   | 200  | 0.1411          | 0.3931   | 0.0202   |
| No log        | 9.0   | 225  | 0.1379          | 0.4413   | 0.0277   |
| No log        | 10.0  | 250  | 0.1350          | 0.4694   | 0.0309   |
| No log        | 11.0  | 275  | 0.1327          | 0.4993   | 0.0336   |
| No log        | 12.0  | 300  | 0.1309          | 0.5084   | 0.0344   |
| No log        | 13.0  | 325  | 0.1297          | 0.5147   | 0.0349   |
| No log        | 14.0  | 350  | 0.1291          | 0.5060   | 0.0343   |
| No log        | 15.0  | 375  | 0.1287          | 0.5107   | 0.0346   |

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2

## Day 1

1. Tried the Neural Magic model `neuralmagic/oBERT-12-upstream-pruned-unstructured-97`. Its macro and micro f1 scores were much lower at the start of training and barely increased over the initial steps, but at the same epoch it ended up outperforming by a 0.159 difference in f1 score. A loading sketch follows below.
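A sketch of how the pruned checkpoint might be loaded for multi-label classification; `num_labels` is a placeholder, since the actual label count isn't stated in this card:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "neuralmagic/oBERT-12-upstream-pruned-unstructured-97"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# problem_type="multi_label_classification" makes the model use
# BCEWithLogitsLoss, as required for multi-label targets.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=20,  # placeholder: set to the number of GitHub issue labels
    problem_type="multi_label_classification",
)
```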
2. Made a more significant modification to the code: added error handling so that, when training exhausts GPU memory, the model is moved to the CPU and the GPU memory is freed.
```python
import gc

import torch

# If training uses more memory than the GPU has, trainer.train() raises a
# RuntimeError. The except block below then:
# 1. Checks the amount of GPU memory used
# 2. Moves the model to the CPU
# 3. Calls the garbage collector
# 4. Frees the cached GPU memory
# 5. Checks the amount of GPU memory used to confirm it was freed

def check_gpu_memory():
    """Print and return the currently allocated GPU memory, in GB."""
    allocated_gb = torch.cuda.memory_allocated() / 1e9
    print(allocated_gb)
    return allocated_gb

try:
    trainer.train()
except RuntimeError as e:
    if "CUDA out of memory" in str(e):
        print("CUDA out of memory")
        print("Let's free some GPU memory and re-allocate")
        check_gpu_memory()
        model.to("cpu")           # move the model off the GPU
        gc.collect()              # collect unreachable Python objects
        torch.cuda.empty_cache()  # release cached GPU memory
        check_gpu_memory()
    else:
        raise
```
3. Added runtime checks: whether the notebook is running on Colab (to decide whether to push to the Hub) and which mixed-precision mode the GPU supports.
```python
import sys

import torch
from transformers import Trainer, TrainingArguments

def is_on_colab():
    """Return True when running inside Google Colab."""
    return "google.colab" in sys.modules

training_args_fine_tune = TrainingArguments(
    output_dir="./multi-label-class-classification-on-github-issues",
    num_train_epochs=15,
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="micro f1",
    save_total_limit=1,
    log_level="error",
    push_to_hub=is_on_colab(),  # the function must be called, not just referenced
)

if torch.cuda.is_available():
    # Check whether the CUDA GPU supports bfloat16; otherwise fall back to float16.
    if torch.cuda.is_bf16_supported():
        print("CUDA GPU supports bfloat16")
        training_args_fine_tune.bf16 = True
    else:
        print("CUDA GPU does not support bfloat16, so float16 is used instead")
        training_args_fine_tune.fp16 = True
```
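
Since `metric_for_best_model` is `"micro f1"`, the `compute_metrics` function passed to the `Trainer` must return a dictionary containing a matching `"micro f1"` key. The notebook's actual function isn't shown here; a minimal sketch, assuming a 0.5 sigmoid threshold and placeholder dataset names:

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Multi-label setup: sigmoid each logit, then threshold at 0.5.
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs >= 0.5).astype(int)
    return {
        "micro f1": f1_score(labels, preds, average="micro"),
        "macro f1": f1_score(labels, preds, average="macro"),
    }

trainer = Trainer(
    model=model,                  # placeholder: the model loaded earlier
    args=training_args_fine_tune,
    train_dataset=train_dataset,  # placeholder: the tokenized training split
    eval_dataset=eval_dataset,    # placeholder: the tokenized validation split
    compute_metrics=compute_metrics,
)
```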