OpenGVLab/InternVL-Chat-V1-1



czczup committed on Apr 20, 2024

Commit c1d4ea1 · verified · 1 Parent(s): 0f3cf67

Update README.md

Files changed (1)

README.md +2 -3

README.md CHANGED

@@ -21,10 +21,9 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 ## Model Details
 - **Model Type:** multimodal large language model (MLLM)
 - **Model Stats:**
-  - Architecture: [InternViT-6B-448px](https://huggingface.co/OpenGVLab/InternViT-6B-448px) + MLP + LLaMA2-13B (One of our internal SFT versions)
   - Params: 19B
-  - Image size: 448 x 448
-  - Number of visual tokens: 256
 - **Training Strategy:**
   - Pretraining Stage

 ## Model Details
 - **Model Type:** multimodal large language model (MLLM)
 - **Model Stats:**
+  - Architecture: [InternViT-6B-448px](https://huggingface.co/OpenGVLab/InternViT-6B-448px) + MLP + LLaMA2-13B (Our internal SFT versions)
+  - Image size: 448 x 448 (256 tokens)
   - Params: 19B
 - **Training Strategy:**
   - Pretraining Stage