facebook/sapiens-pretrain-0.3b-torchscript

rawalkhirodkar committed on Sep 9, 2024

Commit 75f1709 (verified)

1 Parent(s): d89dd46

Update model card for Sapiens with architecture details

Files changed (1)

README.md CHANGED

@@ -3,7 +3,7 @@ language: en
 license: cc-by-nc-4.0
 ---
-# Sapiens-0.3B-torchscript
 ## Model Card for Sapiens
@@ -12,16 +12,18 @@ Sapiens is a family of vision transformers pretrained on 300 million human image
 ## Model Details
 ### Model Description
-Sapiens-0.3B natively support 1K high-resolution inference and are extremely easy to adapt for individual tasks by simply fine-tuning models pretrained on over 300 million in-the-wild human images. The resulting models exhibit remarkable generalization to in-the-wild data, even when labeled data is scarce or entirely synthetic. Our simple model design also brings scalability - model performance across tasks improves as we scale the parameters from 0.3 to 2 billion. Sapiens consistently surpasses existing baselines across various human-centric benchmarks.
 - **Developed by:** Meta
 - **Model type:** Vision Transformers
 - **License:** Creative Commons Attribution-NonCommercial 4.0
-- **Model Size:** 0.3B
 - **Task:** pretrain
 - **Format:** torchscript
 - **File:** sapiens_0.3b_epoch_1600_torchscript.pt2
 ### Model Sources
 - **Repository:** [https://github.com/facebookresearch/sapiens](https://github.com/facebookresearch/sapiens)
@@ -29,5 +31,5 @@ Sapiens-0.3B natively support 1K high-resolution inference and are extremely eas
 ## Uses
-Pretrained 0.3B model can be used for feature extraction, fine-tuning, or as a starting point for training new models.

 license: cc-by-nc-4.0
 ---
+# Sapiens-0.3b-torchscript
 ## Model Card for Sapiens
 ## Model Details
 ### Model Description
+Sapiens-0.3b natively support 1K high-resolution inference and are extremely easy to adapt for individual tasks by simply fine-tuning models pretrained on over 300 million in-the-wild human images. The resulting models exhibit remarkable generalization to in-the-wild data, even when labeled data is scarce or entirely synthetic. Our simple model design also brings scalability - model performance across tasks improves as we scale the parameters from 0.3 to 2 billion. Sapiens consistently surpasses existing baselines across various human-centric benchmarks.
 - **Developed by:** Meta
 - **Model type:** Vision Transformers
 - **License:** Creative Commons Attribution-NonCommercial 4.0
+- **Model Size:** 0.3b
 - **Task:** pretrain
 - **Format:** torchscript
 - **File:** sapiens_0.3b_epoch_1600_torchscript.pt2
 ### Model Sources
 - **Repository:** [https://github.com/facebookresearch/sapiens](https://github.com/facebookresearch/sapiens)
 ## Uses
+Pretrained 0.3b model can be used for feature extraction, fine-tuning, or as a starting point for training new models.