Article: Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?
hbXNov/qwen_2p5_1p5b_instruct_distill_qwen_1p5b_gpt_4o_verify_1e-5_3072_e6-checkpoint-7536-merged Updated 16 days ago • 2.04k
hbXNov/qwen_2p5_1p5b_instruct_distill_qwen_1p5b_gpt_4o_verify_5e-7_3072_merged Updated 16 days ago • 5
hbXNov/llama3.1-8b_train_gpt_4o_verifications_e3_lr5e-7-add-special-true-len3072-19233-merged Updated Dec 27, 2024 • 5