Update README.md

README.md

```
RuntimeError: Error(s) in loading state_dict for CLIPVisionModel:
	size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([729, 1152]) from checkpoint, the shape in current model is torch.Size([730, 1152]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
If you encounter this error, you can fix it by following the guidelines below:

<details>
<summary>Error handling guideline</summary>

This is a logic error that occurs when the vision tower is loaded from a local path: the routing condition in LLaVA-NeXT's `builder.py` sends the checkpoint to the wrong vision-tower class. To fix it, you can prepare the environment in either of the following ways.
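
For intuition about the off-by-one in the error above: a 729 × 1152 position-embedding table matches a SigLIP-style tower (a 27 × 27 patch grid with no class token), while `CLIPVisionModel` reserves one extra row for its class token. A minimal sketch of the arithmetic (the 27 × 27 grid is inferred from the shapes in the error, not read from the checkpoint):

```python
# Illustration of the shape mismatch (the grid size is an inference:
# 27 * 27 = 729 rows, as stored by a SigLIP-style tower without a class
# token; CLIPVisionModel expects one extra position for its class token).
grid = 27                            # assumed patch grid side
checkpoint_rows = grid * grid        # 729 rows in the checkpoint
clip_expected = checkpoint_rows + 1  # 730 rows expected by CLIPVisionModel
print(checkpoint_rows, clip_expected)  # -> 729 730
```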
**Option 1: Install from our fork of LLaVA-NeXT:**

```shell
pip install git+https://github.com/inst-it/LLaVA-NeXT.git
```
**Option 2: Install from LLaVA-NeXT and manually modify its code:**

* step 1: clone the source code
```shell
git clone https://github.com/LLaVA-VL/LLaVA-NeXT.git
```
* step 2: before installing LLaVA-NeXT, modify `line 17` of [llava/model/multimodal_encoder/builder.py](https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/llava/model/multimodal_encoder/builder.py#L17) (a toy sketch of what this change does follows after step 3):
```python
# Before modification:
if is_absolute_path_exists or vision_tower.startswith("openai") or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower:

# After modification:
if "clip" in vision_tower or vision_tower.startswith("openai") or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower:
```
* step 3: install LLaVA-NeXT from source:
```shell
cd LLaVA-NeXT
pip install --upgrade pip # Enable PEP 660 support.
pip install -e ".[train]"
```
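
To see what the step-2 edit changes, here is a toy re-implementation of the two conditions (the function names and example path are made up; only the boolean logic mirrors the real file): a local SigLIP-style path passes the old test and is built as a CLIP tower, producing the 729-vs-730 mismatch, while the new test lets it fall through to the matching branch.

```python
# Toy re-implementation of the builder.py routing condition (names and
# path are hypothetical; only the boolean logic mirrors the real file).
def old_rule(vision_tower: str, is_absolute_path_exists: bool = True) -> bool:
    return (is_absolute_path_exists or vision_tower.startswith("openai")
            or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower)

def new_rule(vision_tower: str) -> bool:
    return ("clip" in vision_tower or vision_tower.startswith("openai")
            or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower)

path = "/checkpoints/siglip-so400m-patch14-384"  # hypothetical local tower path
print(old_rule(path))  # True  -> routed to the CLIP branch, causing the mismatch
print(new_rule(path))  # False -> falls through to the correct non-CLIP branch
```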
We recommend Option 1 because it is simpler.
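
Whichever option you choose, a quick post-install sanity check is to inspect the installed builder (a sketch; it assumes the fork applies the same `builder.py` edit as Option 2):

```python
# Post-install sanity check (a sketch): confirm the routing condition in
# the installed builder.py contains the patched `"clip" in vision_tower` test.
import inspect
from llava.model.multimodal_encoder import builder

src = inspect.getsource(builder)
print('"clip" in vision_tower' in src)  # expect True with the patched condition
```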
</details>

**Load Model**

```python