JingyiLiu committed on
Commit ab85cff · verified · 1 Parent(s): 02cf6a9

Upload README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
````diff
@@ -25,14 +25,14 @@ tags:
 
 📖 [Paper](https://arxiv.org/abs/2410.17241) | 🏠 [Home](https://github.com/ai4colonoscopy/IntelliScope)
 
- > These are the merged weights of [ColonGPT-v1-phi1.5-siglip-lora](https://drive.google.com/drive/folders/1Emi7o7DpN0zlCPIYqsCfNMr9LTPt3SCT?usp=sharing), including the vision encoder (SigLIP), the language model (Phi-1.5), and the other weights fine-tuned on our ColonINST.
+ > These are the merged weights of [ColonGPT-v1-phi1.5-siglip-lora-stg2](https://drive.google.com/file/d/1xAAaVKu16czWO_jgnf-2jCgj2hf14BwM/view?usp=sharing), including the vision encoder (SigLIP), the language model (Phi-1.5), and the other weights fine-tuned on our ColonINST.
 
 Our ColonGPT is a standard multimodal language model that contains four basic components: a language tokenizer, a visual encoder (🤗 [SigLIP-SO](https://huggingface.co/google/siglip-so400m-patch14-384)), a multimodal connector, and a language model (🤗 [Phi1.5](https://huggingface.co/microsoft/phi-1_5)). On this Hugging Face page, we provide a quick start for the convenience of new users. For further details about ColonGPT, we highly recommend visiting our [homepage](https://github.com/BAAI-DCAI/Bunny). There, you'll find comprehensive usage instructions for our model and the latest advancements in intelligent colonoscopy technology.
 
 
 # Quick start
 
- Here is a code snippet showing how to quickly try out our ColonGPT model with transformers. For convenience, we manually combined some configuration and code files and merged the weights. Please note that this is only a quick-start snippet; we recommend installing [ColonGPT's source code](https://github.com/ai4colonoscopy/IntelliScope/blob/main/docs/guideline-for-ColonGPT.md) to explore more.
+ Here is a code snippet showing how to quickly try out our ColonGPT model with transformers. The model focuses on three downstream tasks: image classification (CLS), referring expression generation (REG), and referring expression comprehension (REC). If you need a caption generator, please refer to [ColonGPT-V1-stg1](https://huggingface.co/ai4colonoscopy/ColonGPT-v1-stg1). For convenience, we manually combined some configuration and code files and merged the weights. Please note that this is only a quick-start snippet; we recommend installing [ColonGPT's source code](https://github.com/ai4colonoscopy/IntelliScope/blob/main/docs/guideline-for-ColonGPT.md) to explore more.
 
 - Before running the snippet, you only need to install the following minimal dependencies.
   ```shell
@@ -83,12 +83,12 @@ Here is a code snippet showing how to quickly try out our ColonGPT model with
  return True
  return False
 
- prompt = "Describe what you see in the image."
+ prompt = "Categorize the object."
 text = f"USER: <image>\n{prompt} ASSISTANT:"
 text_chunks = [tokenizer(chunk).input_ids for chunk in text.split('<image>')]
 input_ids = torch.tensor(text_chunks[0] + [-200] + text_chunks[1], dtype=torch.long).unsqueeze(0).to(device)
 
- image = Image.open('cache/examples/example2.png')
+ image = Image.open('/home/projects/u7248002/Project/ColonGPT-tmp/cache/examples/example2.png')
 image_tensor = model.process_images([image], model.config).to(dtype=model.dtype, device=device)
 
 stop_str = "<|endoftext|>"
````
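For orientation, the sketches below reconstruct the rough shape of the quick-start flow that this commit touches: load the merged weights, assemble the multimodal prompt, and generate an answer. They are minimal sketches built around the fragments visible in the diff, not the README's verbatim snippet. In the loading step, the repo id, the fp16 dtype, and the `trust_remote_code=True` flag are all assumptions; since the merged checkpoint bundles a SigLIP encoder and a multimodal connector on top of Phi-1.5, it is expected to load through transformers' remote-code path rather than a stock architecture.

```python
# Minimal loading sketch (assumed API and repo id).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ai4colonoscopy/ColonGPT-v1-stg2"  # placeholder; use this model card's repo id
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # assumption: fp16 is sufficient for the SigLIP + Phi-1.5 stack
    trust_remote_code=True,      # assumption: pulls in ColonGPT's custom multimodal modeling code
).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
```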
 
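The prompt-assembly step is taken almost directly from the second hunk. The `<image>` placeholder never becomes a real vocabulary token: the text on either side of it is tokenized separately, and the sentinel id `-200` is spliced in between, to be swapped for SigLIP patch features inside the model. The constant name `IMAGE_TOKEN_INDEX` below is an assumed label for that literal, and the REG/REC tasks would use their own instruction strings, which are not shown in this diff.

```python
IMAGE_TOKEN_INDEX = -200  # assumed name for the sentinel id used in the README snippet

prompt = "Categorize the object."             # the CLS-style prompt from the updated README
text = f"USER: <image>\n{prompt} ASSISTANT:"

# tokenize around the <image> placeholder, then splice the sentinel id between the halves
text_chunks = [tokenizer(chunk).input_ids for chunk in text.split("<image>")]
input_ids = torch.tensor(
    text_chunks[0] + [IMAGE_TOKEN_INDEX] + text_chunks[1], dtype=torch.long
).unsqueeze(0).to(device)
```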
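Finally, the generation step. The `return True` / `return False` fragment in the hunk belongs to a keyword-based stopping criterion built around `stop_str = "<|endoftext|>"`; the `KeywordStopper` class below is a generic stand-in for it, not ColonGPT's exact implementation. `model.process_images` is the call shown in the snippet itself, while the `images=` keyword on `generate()` and the prompt-length slicing are assumed from Bunny/LLaVA-style multimodal APIs rather than read from the diff.

```python
from PIL import Image
from transformers import StoppingCriteria, StoppingCriteriaList

class KeywordStopper(StoppingCriteria):
    """Stop generation once the keyword (here "<|endoftext|>") appears in the decoded output."""
    def __init__(self, tokenizer, keyword, prompt_len):
        self.tokenizer = tokenizer
        self.keyword = keyword
        self.prompt_len = prompt_len

    def __call__(self, output_ids, scores, **kwargs):
        decoded = self.tokenizer.decode(output_ids[0, self.prompt_len:], skip_special_tokens=False)
        return self.keyword in decoded

image = Image.open("example.png")  # any colonoscopy frame; the README points at its own example image
image_tensor = model.process_images([image], model.config).to(dtype=model.dtype, device=device)

stop_str = "<|endoftext|>"
stopping = StoppingCriteriaList([KeywordStopper(tokenizer, stop_str, input_ids.shape[1])])

output_ids = model.generate(
    input_ids,
    images=image_tensor,          # assumption: the custom generate() takes the processed image here
    max_new_tokens=512,
    use_cache=True,
    stopping_criteria=stopping,
)
# slice off the prompt ids before decoding the answer (Bunny-style quick starts do the same)
answer = tokenizer.decode(output_ids[0, input_ids.shape[1]:], skip_special_tokens=True).strip()
print(answer)
```

If the actual signature differs, the README's full snippet and the [ColonGPT guideline](https://github.com/ai4colonoscopy/IntelliScope/blob/main/docs/guideline-for-ColonGPT.md) are the places to check.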