aplux/Swin-Base

Swin-Base: Image Classification

Swin-Base is the base-size variant of the Swin Transformer family, a hierarchical Vision Transformer designed for image representation tasks. It computes self-attention inside local windows and shifts the window partition between consecutive layers. This keeps computation efficient while letting information flow across window boundaries, so the model captures both local and global image context. Swin-Base is widely used for image classification, object detection, and semantic segmentation. As a mid-sized model, it balances accuracy against inference cost, often generalizes better than conventional CNN-based architectures, and is well suited to a range of computer vision applications.
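The windowing idea can be illustrated with a few lines of plain Python. This is a minimal sketch, not the model's implementation: the patch size of 4, window size of 7, and four stages are assumptions taken from the standard Swin Transformer design and are not stated on this card, and the helper names `stage_grids` and `window_id` are hypothetical.

```python
# Assumed configuration (standard Swin design; not confirmed by this card).
PATCH_SIZE = 4    # side of each non-overlapping input patch, in pixels
WINDOW_SIZE = 7   # side of each attention window, in tokens
NUM_STAGES = 4    # the token grid is halved between consecutive stages


def stage_grids(image_size=224):
    """Return (tokens_per_side, windows_per_side) for every stage."""
    side = image_size // PATCH_SIZE
    grids = []
    for _ in range(NUM_STAGES):
        grids.append((side, side // WINDOW_SIZE))
        side //= 2  # patch merging halves the grid resolution
    return grids


def window_id(row, col, grid_side, shift=0):
    """Index of the window holding token (row, col).

    With shift > 0 the grid is cyclically shifted before it is split
    into windows, which is how neighbouring windows exchange information.
    """
    r = (row - shift) % grid_side
    c = (col - shift) % grid_side
    per_side = grid_side // WINDOW_SIZE
    return (r // WINDOW_SIZE) * per_side + (c // WINDOW_SIZE)


if __name__ == "__main__":
    for stage, (tokens, windows) in enumerate(stage_grids(), start=1):
        print(f"stage {stage}: {tokens}x{tokens} tokens, {windows}x{windows} windows")
    half = WINDOW_SIZE // 2
    print(window_id(6, 6, 56), window_id(7, 7, 56))              # regular partition
    print(window_id(6, 6, 56, half), window_id(7, 7, 56, half))  # shifted partition
```

Under these assumptions, tokens (6, 6) and (7, 7) sit in different windows in the regular partition but share a window once the grid is shifted by half a window, so alternating the two partitions connects regions that a fixed grid would keep apart.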

Source model

Input shape: 1x3x224x224
Number of parameters: 83.70M
Model size: 340.3 MB
Output shape: 1x1000
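The shapes above imply a single 3-channel 224x224 image in and one row of 1000 scores out. The following sketch shows one way to turn such a row into ranked predictions. It assumes the scores are raw logits and that each index corresponds to one class label; the card confirms neither, and the helper names `num_elements`, `softmax`, and `top_k` are hypothetical.

```python
import math

INPUT_SHAPE = (1, 3, 224, 224)  # batch, channels, height, width (from the card)
NUM_CLASSES = 1000              # from the output shape 1x1000


def num_elements(shape):
    """Number of scalar values in a tensor of the given shape."""
    return math.prod(shape)


def softmax(logits):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(batch_logits, k=5):
    """Return the k most likely (class_index, probability) pairs.

    batch_logits must hold exactly one row of NUM_CLASSES scores,
    matching the 1x1000 output shape. Treating the scores as logits
    is an assumption.
    """
    (row,) = batch_logits
    if len(row) != NUM_CLASSES:
        raise ValueError(f"expected {NUM_CLASSES} scores, got {len(row)}")
    probs = softmax(row)
    ranked = sorted(range(NUM_CLASSES), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    print(num_elements(INPUT_SHAPE), "input values per image")
    dummy = [[0.0] * NUM_CLASSES]  # placeholder scores, not real model output
    dummy[0][42] = 10.0
    for index, prob in top_k(dummy, k=3):
        print(index, round(prob, 4))
```

In practice the row of scores would come from whichever runtime executes the deployable model, and the class indices would be mapped to label names with the label file that accompanies it.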

The source model can be found here

Performance Reference

Please search for the model by name in Model Farm.

Inference & Model Conversion

Please search for the model by name in Model Farm.

License

Source Model: BSD-3-CLAUSE
Deployable Model: APLUX-MODEL-FARM-LICENSE