	# Running Script Arguments

	```
	Data Splits:
	--train [str,str,...]: use the splits (separated by comma) in training.
	--valid [str,str,...]: use the splits (separated by comma) in validation.
	--test [str,str,...]: use the splits (separated by comma) in testing.
	Model Architecture:
	--llayers [int]: number of layers in language encoder.
	--xlayers [int]: number of layers in cross-modality encoder.
	--rlayers [int]: number of layers in object relationship encoder.
	Load Weights:
	--load [str='path/to/saved_model']: load the fine-tuned model from path/to/saved_model.pth.
	--loadLXMERT [str='path/to/saved_model']: load the pre-trained model without answer heads from path/to/saved_model_LXRT.pth.
	--loadLXMERTQA [str='path/to/saved_model']: load the pre-trained model with the answer head from path/to/saved_model_LXRT.pth.
	--fromScratch: By default, when none of the loading arguments above is set, the pre-trained BERT weights are loaded.
	With this flag, the language encoder is re-initialized instead.
	As we promised the EMNLP reviewers, this one-line argument makes it possible to test the performance without BERT weights.
	Training Hyper Parameters:
	--batchSize [int]: batch size.
	--optim [str]: optimizer to use.
	--lr [float]: peak learning rate.
	--epochs [int]: training epochs.
	Debugging:
	--tiny: Load 512 images for each data split. (Note: the number of images may differ depending on the dataset.)
	--fast: Load 5000 images for each data split. (Note: the number of images may differ depending on the dataset.)
	```
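
The flags above can be sketched as an `argparse` parser. This is a minimal illustration, not the repository's actual parser: the helper names `split_csv`, `build_parser`, and `uses_bert_init` are hypothetical, every default value is a placeholder, and the split names in the usage note are example values only.

```python
import argparse


def split_csv(value):
    """Turn 'a,b,c' into ['a', 'b', 'c'], ignoring empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative run-script arguments")
    # Data splits: each flag takes a comma-separated list of split names.
    parser.add_argument("--train", type=split_csv, default=[])
    parser.add_argument("--valid", type=split_csv, default=[])
    parser.add_argument("--test", type=split_csv, default=[])
    # Model architecture (default values are placeholders, not the real ones).
    parser.add_argument("--llayers", type=int, default=1)
    parser.add_argument("--xlayers", type=int, default=1)
    parser.add_argument("--rlayers", type=int, default=1)
    # Weight loading.
    parser.add_argument("--load", type=str, default=None)
    parser.add_argument("--loadLXMERT", type=str, default=None)
    parser.add_argument("--loadLXMERTQA", type=str, default=None)
    parser.add_argument("--fromScratch", action="store_true")
    # Training hyperparameters (default values are placeholders).
    parser.add_argument("--batchSize", type=int, default=32)
    parser.add_argument("--optim", type=str, default="adam")
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--epochs", type=int, default=1)
    # Debugging switches.
    parser.add_argument("--tiny", action="store_true")
    parser.add_argument("--fast", action="store_true")
    return parser


def uses_bert_init(args):
    """True when no loading flag is set and --fromScratch is absent,
    i.e. the documented default of starting from pre-trained BERT weights."""
    nothing_loaded = not (args.load or args.loadLXMERT or args.loadLXMERTQA)
    return nothing_loaded and not args.fromScratch
```

For example, `build_parser().parse_args(["--train", "train,nominival", "--tiny"])` yields `args.train == ["train", "nominival"]` and `args.tiny == True`, and `uses_bert_init(args)` is true because no loading flag and no `--fromScratch` were given.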

	# Pre-training-Specific Arguments
	```
	Pre-training Tasks:
	--taskMaskLM: use the masked language model task.
	--taskObjPredict: use the masked object prediction task.
	--taskMatched: use the cross-modality matched task.
	--taskQA: use the image QA task.
	Visual Pre-training Losses (Tasks):
	--visualLosses [str,str,...]: The sub-tasks used to pre-train the visual modality. Each one must be one of 'obj', 'attr', 'feat'.
	obj: detected-object-label classification.
	attr: detected-object-attribute classification.
	feat: RoI-feature regression.
	Mask Rate in Pre-training:
	--wordMaskRate [float]: The probability of masking a word.
	--objMaskRate [float]: The probability of masking an object.
	Initialization:
	--fromScratch: By default, the pre-trained BERT weights are loaded into the model.
	As we promised the EMNLP reviewers, this option re-initializes the language encoder instead.
	```
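
How `--visualLosses` and the two mask rates might be interpreted can be sketched as follows. The function names `parse_visual_losses` and `choose_masked_positions` are hypothetical, and the masking shown is a simplified scheme in which each position is selected independently with the given probability. It is not necessarily the procedure the pre-training code uses.

```python
import random

# The three sub-task names listed for --visualLosses.
VISUAL_LOSSES = ("obj", "attr", "feat")


def parse_visual_losses(value):
    """Split a comma-separated --visualLosses value and reject unknown names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in VISUAL_LOSSES]
    if unknown:
        raise ValueError(f"unknown visual losses: {unknown}")
    return names


def choose_masked_positions(length, mask_rate, rng):
    """Select each of `length` positions independently with probability
    `mask_rate` (a simplified stand-in for word or object masking)."""
    if not 0.0 <= mask_rate <= 1.0:
        raise ValueError("mask_rate must lie in [0, 1]")
    return [i for i in range(length) if rng.random() < mask_rate]
```

For example, `parse_visual_losses("obj,attr,feat")` returns all three names, while `choose_masked_positions(20, 0.15, random.Random(0))` returns the indices selected for masking under a fixed seed. Passing a seeded `random.Random` instance keeps the selection reproducible.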