Swin-Base: Image Classification
Swin-Base is the base-sized variant of the Swin Transformer family, a hierarchical Vision Transformer for image representation learning. It computes self-attention inside local, non-overlapping windows and shifts the window grid between consecutive layers, which keeps computation efficient while still letting information flow across window boundaries, so the model captures both local and global image context. Swin-Base is widely used for image classification and as a backbone for object detection and semantic segmentation. As a mid-sized model, it balances accuracy against inference cost, often generalizes better than conventional CNN-based architectures, and suits a broad range of computer vision applications.
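The shifted window idea described above can be illustrated with a minimal NumPy sketch. It is not the model's actual implementation: the function names `window_partition` and `shifted_window_partition` are hypothetical, the 8x8 feature map with window size 4 and shift 2 is a toy configuration chosen for readability, and the attention mask that a real Swin layer applies to wrapped-around tokens is omitted.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    """
    height, width, channels = x.shape
    if height % window_size or width % window_size:
        raise ValueError("feature map must be divisible by window_size")
    x = x.reshape(height // window_size, window_size,
                  width // window_size, window_size, channels)
    # Bring the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, channels)


def shifted_window_partition(x, window_size, shift):
    """Cyclically shift the map up and left by `shift`, then partition it.

    After the shift, each window mixes tokens that belonged to different
    windows in the unshifted layout, which is how neighbouring windows
    exchange information from one layer to the next.
    """
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Toy feature map: every token stores its own flat index for easy inspection.
feat = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
regular = window_partition(feat, window_size=4)
shifted = shifted_window_partition(feat, window_size=4, shift=2)

print(regular.shape)        # four windows of 4x4 tokens each
print(regular[0, :, :, 0])  # tokens of the top-left regular window
print(shifted[0, :, :, 0])  # same slot after the shift: tokens from the map's interior
```

In the toy example the last shifted window contains tokens wrapped around from the opposite edges of the map; a full implementation masks attention between such tokens so that only spatially adjacent regions attend to each other.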
Source model
- Input shape: 1x3x224x224
- Number of parameters: 83.70M
- Model size: 340.3 MB
- Output shape: 1x1000
The source model can be found here
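The input and output shapes listed above can be made concrete with a short NumPy sketch of pre- and post-processing. Several details are assumptions not stated in this document: RGB channel order, scaling to [0, 1], normalization with the common ImageNet mean and standard deviation, and raw logits as the 1x1000 output. The helper names `to_model_input` and `top_k` are hypothetical, and the inference call itself is left out because the runtime API is not described here.

```python
import numpy as np

# Assumption: standard ImageNet statistics; verify against the source model's config.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_model_input(image_hwc_uint8):
    """Convert a 224x224 RGB uint8 image (H, W, C) to a 1x3x224x224 float32 array."""
    if image_hwc_uint8.shape != (224, 224, 3):
        raise ValueError("expected an image of shape (224, 224, 3)")
    x = image_hwc_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    # HWC -> CHW, then add the batch dimension.
    return x.transpose(2, 0, 1)[np.newaxis, ...]


def top_k(logits, k=5):
    """Return (class_index, probability) pairs for the k highest-scoring classes."""
    scores = np.asarray(logits, dtype=np.float64).reshape(-1)
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in order]


# Stand-ins for a real image and a real model output.
dummy_image = np.zeros((224, 224, 3), dtype=np.uint8)
batch = to_model_input(dummy_image)
print(batch.shape, batch.dtype)

dummy_logits = np.zeros((1, 1000), dtype=np.float32)
dummy_logits[0, 42] = 10.0
predictions = top_k(dummy_logits, k=3)
print(predictions[0][0])  # index of the highest-scoring class
```

Resizing and cropping an arbitrary image to 224x224 is deliberately omitted; any image library can do it before `to_model_input` is called.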
Performance Reference
Please search for the model by name in Model Farm.
Inference & Model Conversion
Please search for the model by name in Model Farm.
License
Source Model: BSD-3-CLAUSE
Deployable Model: APLUX-MODEL-FARM-LICENSE