wjpoom committed · Commit f642936 · verified · 1 Parent(s): 40e0faa

Update README.md

Files changed (1): README.md (+35 -1)
README.md CHANGED
@@ -183,7 +183,41 @@ RuntimeError: Error(s) in loading state_dict for CLIPVisionModel:
  size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([729, 1152]) from checkpoint, the shape in current model is torch.Size([730, 1152]).
  You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
  ```
- If you meet this error, you can fix this error following the guidelines in [this issue](https://github.com/inst-it/inst-it/issues/3).
+ If you encounter this error, you can fix it by following the guidelines below:
+
+ <details>
+ <summary>Error handling guideline</summary>
+
+ This error comes from a logic issue in how the vision tower is resolved from a local path (see the sketch just after this section for what goes wrong). To fix it, you can prepare the environment in either of the following ways.
+
+ **Option 1: Install from our fork of LLaVA-NeXT:**
+
+ ```shell
+ pip install git+https://github.com/inst-it/LLaVA-NeXT.git
+ ```
+
+ **Option 2: Install LLaVA-NeXT from source and manually modify its code:**
+ * step 1: clone the source code
+ ```shell
+ git clone https://github.com/LLaVA-VL/LLaVA-NeXT.git
+ ```
+ * step 2: before installing LLaVA-NeXT, you need to modify `line 17` of [llava/model/multimodal_encoder/builder.py](https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/llava/model/multimodal_encoder/builder.py#L17) as follows.
+ ```python
+ # Before modification:
+ if is_absolute_path_exists or vision_tower.startswith("openai") or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower:
+
+ # After modification:
+ if "clip" in vision_tower or vision_tower.startswith("openai") or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower:
+ ```
+ * step 3: install LLaVA-NeXT from source:
+ ```shell
+ cd LLaVA-NeXT
+ pip install --upgrade pip  # Enable PEP 660 support.
+ pip install -e ".[train]"
+ ```
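+
+ Whichever option you choose, a quick sanity check can confirm that the patched condition is what actually got installed. This check is our suggestion, not part of either install flow; it assumes `llava` is importable and that the builder defines `build_vision_tower`, as in upstream LLaVA-NeXT:
+ ```python
+ import inspect
+ from llava.model.multimodal_encoder import builder
+
+ # Print True if the patched check is present in the installed builder.
+ src = inspect.getsource(builder.build_vision_tower)
+ print('"clip" in vision_tower' in src)
+ ```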
+
+ We recommend the first option because it is simpler.
+ </details>
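+
+ For context, here is a simplified sketch of why the patch works. It is an illustration only, not the actual LLaVA-NeXT code: `pick_tower_branch` is a hypothetical helper, and the real builder returns tower objects rather than strings. The shapes in the error above (729 vs. 730 position embeddings, hidden size 1152) suggest a SigLIP-style checkpoint being loaded as a `CLIPVisionModel`, which the original path-existence check made possible for any local checkpoint:
+
+ ```python
+ import os
+
+ # Hypothetical, simplified version of the branch in
+ # llava/model/multimodal_encoder/builder.py (illustration only).
+ def pick_tower_branch(vision_tower: str, patched: bool = True) -> str:
+     if patched:
+         # After the patch: a local path is treated as CLIP only if its
+         # name says so; a local SigLIP checkpoint falls through below.
+         looks_like_clip = "clip" in vision_tower
+     else:
+         # Before the patch: ANY existing local path was routed to the
+         # CLIP branch, so a local SigLIP checkpoint was loaded as a
+         # CLIPVisionModel, producing the size-mismatch error above.
+         looks_like_clip = os.path.exists(vision_tower)
+     if looks_like_clip or vision_tower.startswith("openai") or vision_tower.startswith("laion") or "ShareGPT4V" in vision_tower:
+         return "CLIP tower"
+     if "siglip" in vision_tower:
+         return "SigLIP tower"
+     return "unknown tower"
+ ```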
 
  **Load Model**
  ```python