# OpenCLIP
This is a fork of OpenCLIP used to fine-tune CLIP models with PinPoint counterfactuals. Refer to the original [open_clip](https://github.com/mlfoundations/open_clip) repository for more details.
## Installation

```bash
pip install open_clip_torch
```
## Pretrained models

For LAION-pretrained models, download the checkpoints and place them in `./pretrained_models/` (this can be done with the open_clip interface).
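For example, a minimal sketch of fetching LAION-2B weights through open_clip's Python API and saving them under `./pretrained_models/` (the `laion2b_s34b_b88k` tag and the output filename are assumptions; pick the tag you need from `open_clip.list_pretrained()`):

```python
import os
import torch
import open_clip

# Download a LAION-2B ViT-B-16 checkpoint through open_clip
# (the pretrained tag below is one of several LAION-2B tags; adjust as needed).
model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained="laion2b_s34b_b88k"
)

# Save the state dict where the training command below expects it
# (the filename is an assumption matching --pretrained_model in this README).
os.makedirs("pretrained_models", exist_ok=True)
torch.save(model.state_dict(), "pretrained_models/vit_b16_laion2b.pth")
```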
Sample single-process running code:
To fine-tune CLIP models on CC3M:
```bash
python -m open_clip_train.main \
    --save-frequency 1 \
    --zeroshot-frequency 1 \
    --report-to tensorboard \
    --train-data="/path/to/image_list.csv" \
    --csv-img-key="Image_ID" \
    --csv-caption-key="Caption" \
    --val-data="/path/to/validation_data.csv" \
    --imagenet-val="/path/to/imagenet/root/val/" \
    --warmup 10000 \
    --batch-size=128 \
    --accum_freq=10 \
    --lr=5e-06 \
    --wd=0.1 \
    --epochs=410 \
    --workers=8 \
    --pretrained_model="pretrained_models/vit_b16_laion2b.pth" \
    --model ViT-B-16
```
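With `--batch-size=128` and `--accum_freq=10`, gradients are accumulated over 10 steps, giving an effective batch size of 1280 per GPU.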
Note: `--imagenet-val` is the path to the validation set of ImageNet for zero-shot evaluation, not the training set! You can remove this argument if you do not want to perform zero-shot evaluation on ImageNet throughout training. Note that the `val` folder should contain one subfolder per class; if it does not, please use this script.
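If the validation images instead sit flat in one directory, here is a minimal sketch of the restructuring, assuming a mapping file `val_to_wnid.txt` with one `filename wnid` pair per line (both the mapping filename and its format are assumptions; the script mentioned above performs the same job):

```python
import shutil
from pathlib import Path

val_dir = Path("/path/to/imagenet/root/val")

# Hypothetical mapping file, one pair per line, e.g.:
# ILSVRC2012_val_00000001.JPEG n01751748
with open("val_to_wnid.txt") as f:
    for line in f:
        filename, wnid = line.split()
        class_dir = val_dir / wnid
        class_dir.mkdir(exist_ok=True)
        src = val_dir / filename
        if src.exists():
            shutil.move(str(src), str(class_dir / filename))
```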
Note: `--train-data` should point to a `*.csv` file that contains the file list with generated images in the following format, separated by `'\t'`:

```
IMAGE_ID	IMAGE_CAPTION
```

The column headers must match `--csv-img-key` and `--csv-caption-key` (here `Image_ID` and `Caption`). You can find the lists for our in-painted data under `./annotations`.
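For reference, a minimal sketch of writing such a file with Python's `csv` module (the image names and captions below are placeholders; the header names match the `--csv-img-key`/`--csv-caption-key` values used above):

```python
import csv

# Placeholder (image, caption) pairs; in practice these come from
# the in-painting pipeline.
rows = [
    ("image_0001.jpg", "a photo of a dog on a beach"),
    ("image_0002.jpg", "a photo of a cat on a sofa"),
]

with open("image_list.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    # Header must match --csv-img-key / --csv-caption-key.
    writer.writerow(["Image_ID", "Caption"])
    writer.writerows(rows)
```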