---
license: other
license_name: aplux-model-farm-license
license_link: https://aiot.aidlux.com/api/v1/files/license/model_farm_license_en.pdf
pipeline_tag: image-segmentation
tags:
- AIoT
- QNN
---
![](https://aiot.aidlux.com/_next/image?url=%2Fapi%2Fv1%2Ffiles%2Fmodel%2Fcover%2F20250320024823_%25E5%259B%25BE1(6).png&w=640&q=75)
## CLIPSeg: Semantic Segmentation
CLIPSeg is an open-vocabulary image segmentation model that combines CLIP's powerful vision-language alignment with a Transformer-based segmentation decoder. Developed at the University of Göttingen, it lets users segment objects in an image simply by providing a natural language prompt such as "a dog" or "the red car." CLIPSeg excels at zero-shot generalization and requires no category-specific training, making it suitable for diverse and dynamic scenarios. It is widely used in applications such as image editing, human-computer interaction, and robotics, where both language understanding and visual perception are essential.
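In practice, the model produces a per-pixel logit map for each text prompt, which is then thresholded into a binary mask. A minimal post-processing sketch, assuming the deployed model returns a `[1, 352, 352]` logit map as listed under "Source model" below (the threshold value is an illustrative choice, not part of this card):

```python
import numpy as np

def logits_to_mask(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Convert a CLIPSeg per-pixel logit map [1, H, W] into a binary mask [H, W]."""
    probs = 1.0 / (1.0 + np.exp(-logits))        # sigmoid: logits -> probabilities
    return (probs[0] > threshold).astype(np.uint8)  # 1 where the prompt matches

# Dummy logit map shaped like the model's segmentation output (1, 352, 352)
dummy_logits = np.random.randn(1, 352, 352).astype(np.float32)
mask = logits_to_mask(dummy_logits)
print(mask.shape, mask.dtype)  # (352, 352) uint8
```

The resulting 0/1 mask can be resized back to the original image resolution for overlay or editing.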
### Source model
The deployable model is split into three submodels; values below are listed per submodel, in order.
- Input shape: [[1,3,352,352]], [[1,77],[1,77]], [[1,485,768],[1,485,768],[1,485,768],[1,512]]
- Number of parameters: 68.31M, 60.49M, 1.07M
- Model size: 275.98M, 245.77M, 4.99M
- Output shape: [[1,485,768],[1,485,768],[1,485,768]], [[1,512]], [[1,352,352]]
The source model can be found [here](https://github.com/timojl/clipseg).
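The shapes above imply a three-stage data flow: the first submodel encodes the image into three feature maps, the second encodes the tokenized prompt into an embedding, and the third fuses both into a segmentation map. A sketch of that wiring with dummy arrays, using hypothetical stub names (`image_encoder`, `text_encoder`, `decoder`) in place of the real QNN binaries:

```python
import numpy as np

# Hypothetical stand-ins for the three deployable submodels; the real
# QNN runtime calls would replace these stubs.
def image_encoder(image):
    assert image.shape == (1, 3, 352, 352)
    return [np.zeros((1, 485, 768), np.float32) for _ in range(3)]

def text_encoder(input_ids, attention_mask):
    assert input_ids.shape == (1, 77) and attention_mask.shape == (1, 77)
    return np.zeros((1, 512), np.float32)

def decoder(visual_feats, text_embedding):
    assert [f.shape for f in visual_feats] == [(1, 485, 768)] * 3
    assert text_embedding.shape == (1, 512)
    return np.zeros((1, 352, 352), np.float32)

image = np.zeros((1, 3, 352, 352), np.float32)   # preprocessed 352x352 RGB image
ids = np.zeros((1, 77), np.int64)                # CLIP-style 77-token prompt ids
attn = np.ones((1, 77), np.int64)                # attention mask for the prompt

feats = image_encoder(image)
text_emb = text_encoder(ids, attn)
logits = decoder(feats, text_emb)
print(logits.shape)  # (1, 352, 352)
```

This only checks that the listed input and output shapes compose; it performs no real inference.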
## Performance Reference
Please search for the model by name in [Model Farm](https://aiot.aidlux.com/en/models)
## Inference & Model Conversion
Please search for the model by name in [Model Farm](https://aiot.aidlux.com/en/models)
## License
- Source Model: [MIT](https://github.com/timojl/clipseg/blob/master/LICENSE)
- Deployable Model: [APLUX-MODEL-FARM-LICENSE](https://aiot.aidlux.com/api/v1/files/license/model_farm_license_en.pdf)