[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner

This model is RWKV-CLIP-B/32 trained on YFCC15M. For more details, see https://github.com/deepglint/RWKV-CLIP.

If you find this model useful, please use the following BibTeX entry for citation.

@misc{gu2024rwkvclip,
      title={RWKV-CLIP: A Robust Vision-Language Representation Learner}, 
      author={Tiancheng Gu and Kaicheng Yang and Xiang An and Ziyong Feng and Dongnan Liu and Weidong Cai and Jiankang Deng},
      year={2024},
      eprint={2406.06973},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}