
Chen Jin

lxasqjc

AI & ML interests

None yet

Recent Activity

updated a collection about 4 hours ago
Lavender
updated a collection about 8 hours ago
Lavender
updated a collection about 8 hours ago
Lavender

Organizations

UCL CMIC

Posts 1

⚡ Can Stable Diffusion's visual expertise enhance Llama-3.2?
🚀 Lavender: efficiently fine-tunes advanced vision-language models by aligning their text-vision attention with Stable Diffusion.
Paper: Diffusion Instruction Tuning (2502.06814)
🔑 Key Highlights:
✅ Significant Gains: +30% on 20 tasks, +68% on OOD WorldMedQA
✅ Data-Efficient: Needs only 0.13M samples (~2.5% of typical VLM datasets)
✅ Low Compute: Fine-tunes in ~1 day on 8 NVIDIA A10G GPUs
✅ Model-Agnostic: Works with Llama-3.2-11B, MiniCPM-Llama3-v2.5 & more
✅ Precise Alignment: Transfers strong text-vision alignment from Stable Diffusion
✅ Open-Source: Code, data & fine-tuned models will be available

👥 Discuss live at: https://www.alphaxiv.org/abs/2502.06814
🔗 Project Page: https://astrazeneca.github.io/vlm/
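
To make "aligning text-vision attention" concrete, here is a minimal sketch of what an alignment objective could look like. The post does not say which loss Lavender uses, so the mean-squared-error form, the row normalization, and the names `normalize_rows` and `attention_alignment_loss` are illustrative assumptions and not the released code.

```python
from typing import List

Matrix = List[List[float]]  # rows: text tokens, columns: image patches


def normalize_rows(attn: Matrix) -> Matrix:
    """Rescale each row to sum to 1; rows with no mass become all zeros."""
    out = []
    for row in attn:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else [0.0] * len(row))
    return out


def attention_alignment_loss(vlm_attn: Matrix, sd_attn: Matrix) -> float:
    """Hypothetical alignment loss: MSE between two normalized attention maps.

    vlm_attn stands for the vision-language model's text-to-image attention,
    sd_attn for the Stable Diffusion attention used as the target.
    """
    same_shape = len(vlm_attn) == len(sd_attn) and all(
        len(a) == len(b) for a, b in zip(vlm_attn, sd_attn)
    )
    if not same_shape:
        raise ValueError("attention maps must have the same shape")
    count = sum(len(row) for row in vlm_attn)
    if count == 0:
        raise ValueError("attention maps must not be empty")

    a, b = normalize_rows(vlm_attn), normalize_rows(sd_attn)
    squared = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return squared / count


# Toy usage: two text tokens attending over two image patches.
loss = attention_alignment_loss([[1, 1], [1, 3]], [[1, 0], [0, 1]])
```

In a real fine-tuning loop a term like this would presumably be added to the usual language-modeling loss and computed on framework tensors; the pure-Python version only shows the shape of the computation.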

models

None public yet

datasets

None public yet