LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 113
EXAONE-3.5 Collection EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B. • 10 items • Updated Dec 10, 2024 • 92
view post Post 1832 🤗 transformers pipelines now support vision language models for easy local inference 🫰🏻 h/t @yonigozlan for shipping this 🎩👏you can also use inference API to infer hosted vision LMs (via Python, JS and cURL) https://huggingface.co/docs/api-inference/en/tasks/image-text-to-text 🚀 9 9 + Reply