Merge branch 'main' of https://github.com/janesjanes/tsbir
- README.md +11 -8
- code/clip/model.py +1 -5
README.md
CHANGED
@@ -1,7 +1,9 @@
 # Image Retrieval with Text and Sketch
-This code is for our 2022 ECCV paper [
+This code is for our 2022 ECCV paper [A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch](https://patsorn.me/projects/tsbir/)
 
-<img src="https://patsorn.me/projects/tsbir/img/teaser_web_mini.jpg" width="
+<img src="https://patsorn.me/projects/tsbir/img/teaser_web_mini.jpg" width="800px"/>
+
+This repo is based on the open_clip implementation from https://github.com/mlfoundations/open_clip
 
 ---------------------
 folder structure
@@ -9,24 +11,25 @@ folder structure
 |---model/ : Contain the trained model*
 |---sketches/ : Contain example query sketch
 |---images/ : Contain 100 randomly sampled images from COCO TBIR benchmark
-|---notebooks/ : Contain the demo ipynb notebook
+|---notebooks/ : Contain the demo ipynb notebook
 |---code/
 |---training/model_configs/ : Contain model config file for the network
 |---clip/ : Contain source code for running the notebook
 
-*
-
-This repo is based on open_clip implementation from https://github.com/mlfoundations/open_clip
+*need to be downloaded first
 
 ## Prerequisites
 - Pytorch
 
 ## Getting Started
 
-Simply
+- Simply open the Jupyter notebook at `notebooks/Retrieval_Demo.ipynb` for an example of how to retrieve images with our model.
+
+- You can use your own set of images and sketches by modifying the `images/` and `sketches/` folders accordingly.
+
+- A Colab version of the notebook is available [[here]](https://colab.research.google.com/)
 
 ## Download Models
-Pre-trained models
 - <a href='https://patsorn.me/projects/tsbir/data/tsbir_model_final.pt' > Pre-trained models </a>
 
 ## Citation
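The Getting Started additions above point at the demo notebook rather than spelling out the retrieval step itself, so here is a minimal sketch of that step for orientation. It is an illustration under assumptions, not the repo's API: the embeddings are random stand-ins for what the model's encoders in `notebooks/Retrieval_Demo.ipynb` would produce, and the embedding width of 512 is an assumed CLIP-style `embed_dim`, not a value confirmed by this diff.

```python
import torch

# Stand-ins for what the demo notebook would produce: one fused (sketch + text)
# query embedding and embeddings for the 100 sampled COCO images. The width of
# 512 is an assumed CLIP-style embed_dim.
query_emb = torch.randn(1, 512)
image_embs = torch.randn(100, 512)

# CLIP-style retrieval: cosine similarity, i.e. a dot product after L2 normalization.
query_emb = query_emb / query_emb.norm(dim=-1, keepdim=True)
image_embs = image_embs / image_embs.norm(dim=-1, keepdim=True)
scores = (query_emb @ image_embs.T).squeeze(0)

# Rank the candidates and keep the best five matches.
top5 = scores.topk(5)
print("top-5 image indices:", top5.indices.tolist())
```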
code/clip/model.py
CHANGED

@@ -237,10 +237,6 @@ class VisualTransformer(nn.Module):
 
         return x
 
-
-from x_transformers.autoregressive_wrapper import AutoregressiveWrapper
-from x_transformers import ViTransformerWrapper, TransformerWrapper, Encoder, Decoder
-
 class CLIP(nn.Module):
     def __init__(self,
                  embed_dim: int,
@@ -503,4 +499,4 @@ def build_model(state_dict: dict, weight_sharing: bool, feature_fusion: str, num
     convert_weights(model)
     #TODO: only do strict=false when loading from state with 'visual2' branch
     model.load_state_dict(state_dict, strict=False)
-    return model.eval()
+    return model.eval()
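For context on the `build_model` hunk, here is a minimal sketch of how the function might be driven, assuming the checkpoint from the README's download link and guessing at the state-dict layout. Only `state_dict`, `weight_sharing`, and `feature_fusion` are visible in the hunk header; the parameter list is truncated after `num`, so the call below is deliberately incomplete and the argument values are placeholders.

```python
import torch

from model import build_model  # i.e. code/clip/model.py

# Checkpoint from the README's download link; whether the weights sit at the
# top level or under a "state_dict" key is an assumption.
checkpoint = torch.load("model/tsbir_model_final.pt", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)

# Placeholder arguments: the hunk header shows at least one further parameter
# ("num...") that is cut off in this diff, so this call is a sketch only.
model = build_model(state_dict, weight_sharing=False, feature_fusion="concat")
# build_model converts the weights, loads them with strict=False (see the TODO
# in the hunk above), and returns the model in eval mode.
```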