Haplo Model Card

This work presents a simple yet efficient method for constructing a baseline for native, end-to-end large multimodal models (LMMs) within a single transformer. The proposed model outperforms other single-transformer LMMs and significantly narrows the performance gap with compositional LMMs.

Model date: Haplo models were trained in September 2024.

Paper or resources for more information: https://haplo-vl.github.io/

Performance

| Model | SEEDB | POPE | RWQA | MMB | MMStar | VQAv2 | GQA | SQA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Haplo-8B-672 | 75.1 | 88.6 | 61.4 | 73.6 | 57.2 | 81.0 | 65.5 | 95.3 |
| Haplo-8B-MI-672 | 75.5 | 88.2 | 62.0 | 75.0 | 57.6 | 80.7 | 65.0 | 94.4 |

Intended use

Primary intended uses: The primary use of Haplo is research on large multimodal models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
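
For researchers getting started, below is a minimal, hypothetical loading sketch. It assumes the rayruiyang/Haplo-8B-MI-672 checkpoint ships custom modeling code loadable through Hugging Face transformers' trust_remote_code mechanism; the processor call signature and prompt format are assumptions, not the documented API, so consult the project page for the actual usage.

```python
# Hypothetical usage sketch: assumes the checkpoint provides remote code
# compatible with transformers' Auto classes. Processor/prompt details
# below are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "rayruiyang/Haplo-8B-MI-672"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # checkpoint weights are stored in BF16
    trust_remote_code=True,
).to("cuda")

# Single image-text query; the text/images keyword interface is assumed.
image = Image.open("example.jpg")
inputs = processor(
    text="Describe this image.", images=image, return_tensors="pt"
).to("cuda")

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```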

Model details

Model size: 8.34B parameters
Tensor type: BF16
Format: Safetensors
