nectec/Pathumma-llm-vision-2.0.0-preview

Visual Question Answering

Thirawarit committed on Jan 31

Commit 1bd40ee (verified)

1 Parent(s): bcb5b96

Update README.md

Files changed (1)

README.md +1 -1

@@ -8,7 +8,7 @@ base_model:
 pipeline_tag: visual-question-answering
 ---
-# Pathumma-llm-vision-1.0.0
+# Pathumma-llm-vision-2.0.0-preview
 ## Model Overview
 Pathumma-llm-vision-2.0.0-preview is a multi-modal language model fine-tuned for Visual Question Answering (VQA) and Image Captioning tasks. It contains 8 billion parameters and leverages both image and text processing to understand and generate multi-modal content.