2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Current SDK version is 0.18.3
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Configure stats pid to 1335408
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Loading settings from /home/align-anything/.config/wandb/settings
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Loading settings from /data/align-anything/hantao/align-anything/scripts/wandb/settings
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Loading settings from environment variables: {'api_key': '***REDACTED***'}
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': None, '_disable_service': None}
2024-11-12 08:35:57,085 WARNING MainThread:1335408 [wandb_setup.py:_flush():79] Could not find program at -m align_anything.trainers.text_image_to_text_image.sft
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m align_anything.trainers.text_image_to_text_image.sft'}
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_setup.py:_flush():79] Applying login settings: {}
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_init.py:_log_setup():532] Logging user logs to ../outputs/sft_tf_cham_1015_5ep_t2i_22.5k/wandb/run-20241112_083557-komthdmh/logs/debug.log
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_init.py:_log_setup():533] Logging internal logs to ../outputs/sft_tf_cham_1015_5ep_t2i_22.5k/wandb/run-20241112_083557-komthdmh/logs/debug-internal.log
2024-11-12 08:35:57,085 INFO MainThread:1335408 [wandb_init.py:init():617] calling init triggers
2024-11-12 08:35:57,086 INFO MainThread:1335408 [wandb_init.py:init():624] wandb.init called with sweep_config: {}
config: {'train_cfgs': {'ds_cfgs': 'ds_z3_config.json', 'epochs': 5.0, 'seed': 42, 'per_device_train_batch_size': 4.0, 'per_device_eval_batch_size': 4.0, 'gradient_accumulation_steps': 2.0, 'gradient_checkpointing': True, 'learning_rate': 5e-05, 'lr_scheduler_type': 'cosine', 'lr_warmup_ratio': 0.03, 'weight_decay': 0.0, 'adam_betas': [0.9, 0.95], 'adam_epsilon': 1e-08, 'bf16': True, 'fp16': False, 'eval_strategy': 'steps', 'eval_interval': 1000, 'freeze_language_model': False, 'max_grad_norm': 1.0}, 'data_cfgs': {'train_datasets': '/data/align-anything/hantao/align-anything/projects/text_image_to_text_image/outputs', 'train_template': 'AA_textfeedback', 'train_size': None, 'train_split': 'train', 'train_subset': None, 'train_data_files': 'llf_t2i_22.5k_tokenized.pt', 'train_optional_args': [], 'eval_datasets': None, 'eval_template': None, 'eval_size': None, 'eval_split': None, 'eval_subset': None, 'eval_data_files': None, 'eval_optional_args': []}, 'logger_cfgs': {'log_type': 'wandb', 'log_project': 'align-anything', 'log_run_name': 'sft', 'output_dir': '../outputs/sft_tf_cham_1015_5ep_t2i_22.5k', 'cache_dir': None, 'save_interval': 400.0}, 'model_cfgs': {'model_name_or_path': '/data/align-anything/hantao/models/0917_cham_ti2t_sft', 'trust_remote_code': True, 'model_max_length': 4096}, 'special_tokens': None}
2024-11-12 08:35:57,086 INFO MainThread:1335408 [wandb_init.py:init():667] starting backend
2024-11-12 08:35:57,086 INFO MainThread:1335408 [wandb_init.py:init():671] sending inform_init request
2024-11-12 08:35:57,092 INFO MainThread:1335408 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-11-12 08:35:57,092 INFO MainThread:1335408 [wandb_init.py:init():684] backend started and connected
2024-11-12 08:35:57,097 INFO MainThread:1335408 [wandb_init.py:init():779] updated telemetry
2024-11-12 08:35:57,113 INFO MainThread:1335408 [wandb_init.py:init():812] communicating run to backend with 90.0 second timeout
2024-11-12 08:35:57,827 INFO MainThread:1335408 [wandb_init.py:init():863] starting run threads in backend
2024-11-12 08:35:57,973 INFO MainThread:1335408 [wandb_run.py:_console_start():2465] atexit reg
2024-11-12 08:35:57,974 INFO MainThread:1335408 [wandb_run.py:_redirect():2313] redirect: wrap_raw
2024-11-12 08:35:57,974 INFO MainThread:1335408 [wandb_run.py:_redirect():2378] Wrapping output streams.
2024-11-12 08:35:57,974 INFO MainThread:1335408 [wandb_run.py:_redirect():2403] Redirects installed.
2024-11-12 08:35:57,976 INFO MainThread:1335408 [wandb_init.py:init():907] run started, returning control to user process
2024-11-12 11:16:52,981 INFO MainThread:1335408 [wandb_run.py:_finish():2164] finishing run htlou/align-anything/komthdmh
2024-11-12 11:16:52,983 INFO MainThread:1335408 [wandb_run.py:_atexit_cleanup():2428] got exitcode: 0
2024-11-12 11:16:52,984 INFO MainThread:1335408 [wandb_run.py:_restore():2410] restore
2024-11-12 11:16:52,984 INFO MainThread:1335408 [wandb_run.py:_restore():2416] restore done
2024-11-12 11:16:55,871 INFO MainThread:1335408 [wandb_run.py:_footer_history_summary_info():4049] rendering history
2024-11-12 11:16:55,873 INFO MainThread:1335408 [wandb_run.py:_footer_history_summary_info():4081] rendering summary
2024-11-12 11:16:55,884 INFO MainThread:1335408 [wandb_run.py:_footer_sync_info():4008] logging synced files