Swin-Base: Image Classification

Swin-Base is the base-sized variant of the Swin Transformer family, a hierarchical Vision Transformer for image representation learning. It computes self-attention within local windows and shifts the window partition between consecutive layers, which keeps computation linear in image size while still letting information flow across window boundaries. Swin-Base is widely used as a backbone for image classification, object detection, and semantic segmentation. As a mid-sized model, it balances accuracy against inference cost and is often reported to generalize better than conventional CNN-based architectures, which makes it suitable for a broad range of computer vision applications.
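To illustrate the shifted window idea, here is a minimal sketch in plain Python. It only labels each position of a feature map with the window it falls in, before and after a cyclic shift; it is not the model's implementation. The function name `window_ids` and the 8x8 grid with window size 4 are assumptions chosen for illustration, and the attention mask that the real model applies to wrapped-around positions is left out.

```python
def window_ids(height, width, window, shift=0):
    """Label every grid position with the index of its attention window.

    With shift > 0 the grid is cyclically rolled toward the top-left by
    `shift` positions before it is cut into windows (hypothetical helper,
    illustration only).
    """
    windows_per_row = width // window
    ids = []
    for r in range(height):
        row = []
        for c in range(width):
            # Position (r, c) lands here after the cyclic roll.
            rr = (r - shift) % height
            cc = (c - shift) % width
            row.append((rr // window) * windows_per_row + cc // window)
        ids.append(row)
    return ids


# Regular partition: an 8x8 map is split into four 4x4 windows.
regular = window_ids(8, 8, window=4)
# Shifted partition: the same map, displaced by half a window.
shifted = window_ids(8, 8, window=4, shift=2)

# Neighbours (3, 3) and (4, 4) sit on opposite sides of a window border
# in the regular partition, but share a window after the shift.
print(regular[3][3], regular[4][4])
print(shifted[3][3], shifted[4][4])
```

Alternating the two partitions across layers is what lets neighbouring positions that are separated by a window border in one layer attend to each other in the next.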

Source model

  • Input shape: 1x3x224x224
  • Number of parameters: 83.70M
  • Model size: 340.3 MB
  • Output shape: 1x1000
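The 1x1000 output can be turned into class predictions with a softmax followed by a top-k selection. The sketch below assumes the output row holds raw, unnormalized scores (logits), one per class; the model card does not state this, and the helper name `top_k` and the dummy scores are hypothetical.

```python
import math


def top_k(logits, k=5):
    """Return the k most likely (class_index, probability) pairs.

    Assumes `logits` is one row of unnormalized scores, e.g. the 1000
    values of a 1x1000 output (an assumption, not confirmed by the source).
    """
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Dummy scores standing in for a real model output.
logits = [0.0] * 1000
logits[283] = 8.0
logits[281] = 6.0

for class_index, probability in top_k(logits, k=3):
    print(class_index, round(probability, 4))
```

Mapping a class index to a human-readable label requires the label file that ships with the source model, which is not reproduced here.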

The source model can be found here

Performance Reference

Please search for this model by name in Model Farm.

Inference & Model Conversion

Please search for this model by name in Model Farm.

License
