Regarding Fine-tuning
Thanks for your great work!
I noticed that the tokenizer currently uses left padding, and the padding token is set to <|endoftext|>. For fine-tuning, can I keep these settings as they are, or should I switch to right padding with a standard pad token?
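For reference, this is roughly what the two options look like (I'm using the base Qwen2-VL-2B-Instruct checkpoint here only as a stand-in for this model's actual repo ID):

```python
from transformers import AutoProcessor

# Placeholder checkpoint ID; substitute the actual model repo.
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
tokenizer = processor.tokenizer

# Current settings described above: left padding, <|endoftext|> reused as the pad token.
tokenizer.padding_side = "left"
tokenizer.pad_token = "<|endoftext|>"

# Alternative being asked about: right padding. Qwen2 tokenizers ship no dedicated
# pad token, so some special token still has to be assigned to pad_token either way.
# tokenizer.padding_side = "right"
```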
Thank you!
Hi @yifehuang97
I am trying (and learning) to fine-tune this model and was wondering how you did it. Is it lower-level torch/transformers work, or is there a high-level fine-tuning library that supports this?
Would really appreciate your advice. Also, do you know any good materials for learning to fine-tune multimodal embedding models?
Thanks.
Hi @ququwowo
In my case I fine-tune the model just like any standard Hugging Face model:
1. Model wrapper: I subclass Qwen2VLForConditionalGeneration (adding a small projection head for my downstream task).
2. Custom trainer: I extend transformers.Trainer and override compute_loss() (see the sketch below).
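Here is a minimal sketch of how those two pieces can fit together. The projection dimension, the last-token pooling, the in-batch contrastive loss, and the "query"/"positive" batch layout are illustrative assumptions on my part, not the exact code used:

```python
import torch
import torch.nn as nn
from transformers import Qwen2VLForConditionalGeneration, Trainer


class Qwen2VLEmbedder(Qwen2VLForConditionalGeneration):
    """Qwen2-VL backbone with a small projection head for embedding tasks."""

    def __init__(self, config, proj_dim=768):  # proj_dim is an arbitrary example
        super().__init__(config)
        self.proj = nn.Linear(config.hidden_size, proj_dim)

    def encode(self, **inputs):
        # Pool the final hidden state of the last token (assumes left padding,
        # so the last position is always a real token), then project it.
        outputs = super().forward(**inputs, output_hidden_states=True)
        last_hidden = outputs.hidden_states[-1]      # (batch, seq_len, hidden)
        pooled = last_hidden[:, -1, :]               # last-token pooling
        return nn.functional.normalize(self.proj(pooled), dim=-1)


class EmbeddingTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Hypothetical batch layout: each batch carries paired "query" and
        # "positive" processor outputs. In-batch negatives give an InfoNCE loss.
        q = model.encode(**inputs["query"])
        p = model.encode(**inputs["positive"])
        logits = q @ p.T / 0.05                      # temperature is an example value
        labels = torch.arange(q.size(0), device=q.device)
        loss = nn.functional.cross_entropy(logits, labels)
        return (loss, (q, p)) if return_outputs else loss
```

From there, EmbeddingTrainer takes the usual TrainingArguments plus a data collator that builds the query/positive dictionaries, and everything else (checkpointing, mixed precision, logging) comes from Trainer as usual.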