07/03/2023 01:07:03 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1distributed training: False, 16-bits training: True
07/03/2023 01:07:03 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=391,
evaluation_strategy=steps,
fp16=True,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=True,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=True,
local_rank=-1,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=/nvme1/yujiepan/token-dropping-logs/logs-final/train-imdb-baseline/0703-lr5e-05_warm0.1_epoch6/runs/Jul03_01-06-59_csr-dgx1-01,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=accuracy,
mp_parameters=,
no_cuda=False,
num_train_epochs=6.0,
optim=adamw_torch,
optim_args=None,
output_dir=/nvme1/yujiepan/token-dropping-logs/logs-final/train-imdb-baseline/0703-lr5e-05_warm0.1_epoch6,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=64,
per_device_train_batch_size=32,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=/nvme1/yujiepan/token-dropping-logs/logs-final/train-imdb-baseline/0703-lr5e-05_warm0.1_epoch6,
save_on_each_node=False,
save_steps=391,
save_strategy=steps,
save_total_limit=1,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.1,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
07/03/2023 01:07:03 - INFO - datasets.builder - No config specified, defaulting to the single config: imdb/plain_text
07/03/2023 01:07:03 - INFO - datasets.info - Loading Dataset Infos from /home/yujiepan/.cache/huggingface/modules/datasets_modules/datasets/imdb/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0
07/03/2023 01:07:03 - INFO - datasets.builder - Overwrite dataset info from restored data version if exists.
07/03/2023 01:07:03 - INFO - datasets.info - Loading Dataset info from /dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0
07/03/2023 01:07:03 - WARNING - datasets.builder - Found cached dataset imdb (/dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0)
07/03/2023 01:07:03 - INFO - datasets.info - Loading Dataset info from /dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0
07/03/2023 01:07:05 - INFO - datasets.arrow_dataset - Caching processed dataset at /dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-449826aec475610d.arrow
07/03/2023 01:07:11 - INFO - datasets.arrow_dataset - Caching processed dataset at /dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-ee7d3591a7ef8a71.arrow
07/03/2023 01:07:19 - INFO - datasets.arrow_dataset - Caching processed dataset at /dev/shm/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-74a80ed80f1f2db9.arrow
07/03/2023 01:07:34 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/yujiepan/imdb-long/cache-13f54ac8f7bff8cf.arrow
07/03/2023 01:07:34 - INFO - __main__ - Sample 20952 of the training set: {'text': 'Arguably this is a very good "sequel", better than the first live action film 101 Dalmatians. It has good dogs, good actors, good jokes and all right slapstick!

Cruella DeVil, who has had some rather major therapy, is now a lover of dogs and very kind to them. Many, including Chloe Simon, owner of one of the dogs that Cruella once tried to kill, do not believe this. Others, like Kevin Shepherd (owner of 2nd Chance Dog Shelter) believe that she has changed.

Meanwhile, Dipstick, with his mate, have given birth to three cute dalmatian puppies! Little Dipper, Domino and Oddball...

Starring Eric Idle as Waddlesworth (the hilarious macaw), Glenn Close as Cruella herself and Gerard Depardieu as Le Pelt (another baddie, the name should give a clue), this is a good family film with excitement and lots more!! One downfall of this film is that is has a lot of painful slapstick, but not quite as excessive as the last film. This is also funnier than the last film.

Enjoy "102 Dalmatians"! :-)', 'label': 1, 'input_ids': [101, 15835, 2023, 2003, 1037, 2200, 2204, 1000, 8297, 1000, 1010, 2488, 2084, 1996, 2034, 2444, 2895, 2143, 7886, 17488, 18900, 7066, 1012, 2009, 2038, 2204, 6077, 1010, 2204, 5889, 1010, 2204, 13198, 1998, 2035, 2157, 14308, 21354, 999, 1026, 7987, 1013, 1028, 1026, 7987, 1013, 1028, 10311, 2721, 6548, 1010, 2040, 2038, 2018, 2070, 2738, 2350, 7242, 1010, 2003, 2085, 1037, 7089, 1997, 6077, 1998, 2200, 2785, 2000, 2068, 1012, 2116, 1010, 2164, 9318, 4079, 1010, 3954, 1997, 2028, 1997, 1996, 6077, 2008, 10311, 2721, 2320, 2699, 2000, 3102, 1010, 2079, 2025, 2903, 2023, 1012, 2500, 1010, 2066, 4901, 11133, 1006, 3954, 1997, 3416, 3382, 3899, 7713, 1007, 2903, 2008, 2016, 2038, 2904, 1012, 1026, 7987, 1013, 1028, 1026, 7987, 1013, 1028, 5564, 1010, 16510, 21354, 1010, 2007, 2010, 6775, 1010, 2031, 2445, 4182, 2000, 2093, 10140, 17488, 18900, 2937, 26781, 13046, 999, 2210, 16510, 4842, 1010, 23968, 1998, 5976, 7384, 1012, 1012, 1012, 1026, 7987, 1013, 1028, 1026, 7987, 1013, 1028, 4626, 4388, 18373, 2004, 11333, 20338, 12255, 1006, 1996, 26316, 6097, 10376, 1007, 1010, 9465, 2485, 2004, 10311, 2721, 2841, 1998, 11063, 2139, 19362, 10265, 2226, 2004, 3393, 21877, 7096, 1006, 2178, 2919, 10265, 1010, 1996, 2171, 2323, 2507, 1037, 9789, 1007, 1010, 2023, 2003, 1037, 2204, 2155, 2143, 2007, 8277, 1998, 7167, 2062, 999, 999, 2028, 22252, 1997, 2023, 2143, 2003, 2008, 2003, 2038, 1037, 2843, 1997, 9145, 14308, 21354, 1010, 2021, 2025, 3243, 2004, 11664, 2004, 1996, 2197, 2143, 1012, 2023, 2003, 2036, 4569, 14862, 2084, 1996, 2197, 2143, 1012, 1026, 7987, 1013, 1028, 1026, 7987, 1013, 1028, 5959, 1000, 9402, 17488, 18900, 7066, 1000, 999, 1024, 1011, 1007, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}. 07/03/2023 01:07:34 - INFO - __main__ - Sample 3648 of the training set: {'text': "It's a good thing I didn't watch this while i was pregnant.I definitely would have cried my eyes out and/or vomit. It was Kind of gruesome mainly disturbing. 
I personally thought the baby was adorable in its own twisted little way.However as a mom I cringed when Beth stabbed herself in the stomach and when Virgina aborted the child during her 3rd trimester with rusty utensils no less.Also,as an animal lover i almost cried when she scratched the cat to a bloody pulp.However,As creepy and sinister as the baby was I was rooting for it to live.And as twisted as the movie was I am extremely intrigued to see the sequel...... ......... ....... ......... ......... ....... ...... .....", 'label': 0, 'input_ids': [101, 2009, 1005, 1055, 1037, 2204, 2518, 1045, 2134, 1005, 1056, 3422, 2023, 2096, 1045, 2001, 6875, 1012, 1045, 5791, 2052, 2031, 6639, 2026, 2159, 2041, 1998, 1013, 2030, 23251, 1012, 2009, 2001, 2785, 1997, 24665, 15808, 8462, 3701, 14888, 1012, 1045, 7714, 2245, 1996, 3336, 2001, 23677, 1999, 2049, 2219, 6389, 2210, 2126, 1012, 2174, 2004, 1037, 3566, 1045, 23952, 2043, 7014, 13263, 2841, 1999, 1996, 4308, 1998, 2043, 6261, 2050, 11113, 15613, 1996, 2775, 2076, 2014, 3822, 12241, 20367, 2007, 13174, 21183, 6132, 12146, 2053, 2625, 1012, 2036, 1010, 2004, 2019, 4111, 7089, 1045, 2471, 6639, 2043, 2016, 15047, 1996, 4937, 2000, 1037, 6703, 16016, 1012, 2174, 1010, 2004, 17109, 1998, 16491, 2004, 1996, 3336, 2001, 1045, 2001, 7117, 2075, 2005, 2009, 2000, 2444, 1012, 1998, 2004, 6389, 2004, 1996, 3185, 2001, 1045, 2572, 5186, 18896, 2000, 2156, 1996, 8297, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 1012, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}. 07/03/2023 01:07:34 - INFO - __main__ - Sample 819 of the training set: {'text': "This has to be the worst movie I have seen. Madsen fans don't be drawn into this like I was. He is only in it for a maximum of five minutes. This movie is so bad that the only reason why you would watch it is if all the rest of the movies on earth as well as t.v. 
had been destroyed.", 'label': 0, 'input_ids': [101, 2023, 2038, 2000, 2022, 1996, 5409, 3185, 1045, 2031, 2464, 1012, 5506, 5054, 4599, 2123, 1005, 1056, 2022, 4567, 2046, 2023, 2066, 1045, 2001, 1012, 2002, 2003, 2069, 1999, 2009, 2005, 1037, 4555, 1997, 2274, 2781, 1012, 2023, 3185, 2003, 2061, 2919, 2008, 1996, 2069, 3114, 2339, 2017, 2052, 3422, 2009, 2003, 2065, 2035, 1996, 2717, 1997, 1996, 5691, 2006, 3011, 2004, 2092, 2004, 1056, 1012, 1058, 1012, 2018, 2042, 3908, 1012, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}. 07/03/2023 01:07:34 - WARNING - evaluate.loading - Using the latest cached version of the module from /home/yujiepan/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--accuracy/f887c0aab52c2d38e1f8a215681126379eca617f96c447638f751434e8e65b14 (last modified on Wed Sep 28 14:58:53 2022) since it couldn't be found locally at evaluate-metric--accuracy, or remotely on the Hugging Face Hub. 
{'eval_loss': 0.4330781102180481, 'eval_accuracy': 0.91216, 'eval_runtime': 65.4808, 'eval_samples_per_second': 381.791, 'eval_steps_per_second': 5.971, 'epoch': 0.5}
{'loss': 0.0385, 'learning_rate': 4.9680246328754146e-05, 'epoch': 0.64}
{'eval_loss': 0.3079942464828491, 'eval_accuracy': 0.91504, 'eval_runtime': 64.589, 'eval_samples_per_second': 387.063, 'eval_steps_per_second': 6.054, 'epoch': 1.0}
{'loss': 0.0654, 'learning_rate': 4.37588820464235e-05, 'epoch': 1.28}
{'eval_loss': 0.3285704255104065, 'eval_accuracy': 0.91636, 'eval_runtime': 64.8695, 'eval_samples_per_second': 385.389, 'eval_steps_per_second': 6.027, 'epoch': 1.5}
{'loss': 0.0636, 'learning_rate': 3.783751776409285e-05, 'epoch': 1.92}
{'eval_loss': 0.42962637543678284, 'eval_accuracy': 0.9184, 'eval_runtime': 130.3196, 'eval_samples_per_second': 191.836, 'eval_steps_per_second': 3.0, 'epoch': 2.0}
{'eval_loss': 0.5365941524505615, 'eval_accuracy': 0.91704, 'eval_runtime': 133.611, 'eval_samples_per_second': 187.11, 'eval_steps_per_second': 2.926, 'epoch': 2.5}
{'loss': 0.0344, 'learning_rate': 3.19161534817622e-05, 'epoch': 2.56}
{'eval_loss': 0.45805788040161133, 'eval_accuracy': 0.91724, 'eval_runtime': 130.2627, 'eval_samples_per_second': 191.92, 'eval_steps_per_second': 3.002, 'epoch': 3.0}
{'loss': 0.0285, 'learning_rate': 2.5994789199431553e-05, 'epoch': 3.2}
{'eval_loss': 0.5552997589111328, 'eval_accuracy': 0.91644, 'eval_runtime': 130.5012, 'eval_samples_per_second': 191.569, 'eval_steps_per_second': 2.996, 'epoch': 3.5}
{'loss': 0.0163, 'learning_rate': 2.008526764566556e-05, 'epoch': 3.84}
{'eval_loss': 0.5054952502250671, 'eval_accuracy': 0.91936, 'eval_runtime': 132.4835, 'eval_samples_per_second': 188.703, 'eval_steps_per_second': 2.951, 'epoch': 4.0}
{'loss': 0.0104, 'learning_rate': 1.4163903363334913e-05, 'epoch': 4.48}
{'eval_loss': 0.5432029366493225, 'eval_accuracy': 0.92644, 'eval_runtime': 130.5202, 'eval_samples_per_second': 191.541, 'eval_steps_per_second': 2.996, 'epoch': 4.5}
{'eval_loss': 0.5357362627983093, 'eval_accuracy': 0.92684, 'eval_runtime': 130.5356, 'eval_samples_per_second': 191.519, 'eval_steps_per_second': 2.995, 'epoch': 5.0}
{'loss': 0.0034, 'learning_rate': 8.242539081004264e-06, 'epoch': 5.12}
{'eval_loss': 0.5456807613372803, 'eval_accuracy': 0.92488, 'eval_runtime': 132.2657, 'eval_samples_per_second': 189.014, 'eval_steps_per_second': 2.956, 'epoch': 5.5}
{'loss': 0.0046, 'learning_rate': 2.3211747986736146e-06, 'epoch': 5.75}
{'eval_loss': 0.5325844883918762, 'eval_accuracy': 0.92816, 'eval_runtime': 130.2637, 'eval_samples_per_second': 191.918, 'eval_steps_per_second': 3.002, 'epoch': 6.0}
{'train_runtime': 3645.2659, 'train_samples_per_second': 41.149, 'train_steps_per_second': 1.287, 'train_loss': 0.02827812901810002, 'epoch': 6.0}
***** train metrics *****
epoch = 6.0
train_loss = 0.0283
train_runtime = 1:00:45.26
train_samples = 25000
train_samples_per_second = 41.149
train_steps_per_second = 1.287
07/03/2023 02:08:22 - INFO - __main__ - *** Evaluate ***
***** eval metrics *****
epoch = 6.0
eval_accuracy = 0.9282
eval_loss = 0.5326
eval_runtime = 0:02:10.43
eval_samples = 25000
eval_samples_per_second = 191.674
eval_steps_per_second = 2.998
***** eval metrics *****
epoch = 6.0
eval_accuracy = 0.8909
eval_loss = 0.7935
eval_runtime = 0:00:32.16
eval_samples = 5958
eval_samples_per_second = 185.204
eval_steps_per_second = 2.922