---
base_model:
- MAmmoTH-VL/MAmmoTH-VL-8B
datasets:
- TIGER-Lab/VisualWebInstruct
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: image-text-to-text
---
# Introduction
MAmmoTH-VL2 is the model obtained by training MAmmoTH-VL-8B on the VisualWebInstruct dataset.
# Links
[GitHub](https://github.com/TIGER-AI-Lab/VisualWebInstruct)|
[Paper](https://arxiv.org/abs/2503.10582)|
[Website](https://tiger-ai-lab.github.io/VisualWebInstruct/)
# Citation
```
@article{visualwebinstruct,
  title={VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search},
  author={Jia, Yiming and Li, Jiachen and Yue, Xiang and Li, Bo and Nie, Ping and Zou, Kai and Chen, Wenhu},
  journal={arXiv preprint arXiv:2503.10582},
  year={2025}
}
```