qaihm-bot committed
Commit 0b3aa24 · verified · 1 parent: ba5473e

See https://github.com/quic/ai-hub-models/releases/v0.29.1 for changelog.
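
The README changes in this commit rename the module path from `qai_hub_models.models.stable_diffusion_v2_1_quantized` to `qai_hub_models.models.stable_diffusion_v2_1`. For a script that may run against either an older or a newer install, a minimal sketch could probe for whichever name is importable. The helper `resolve_module` and its candidate list are hypothetical and not part of the package:

```python
import importlib.util

# Module paths taken from the old and new README text in this commit.
CANDIDATES = (
    "qai_hub_models.models.stable_diffusion_v2_1",
    "qai_hub_models.models.stable_diffusion_v2_1_quantized",
)


def resolve_module(candidates=CANDIDATES):
    """Return the first importable dotted module name, or None."""
    for name in candidates:
        try:
            # find_spec imports parent packages to locate the submodule.
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package (e.g. qai_hub_models) is missing
            # or fails to import.
            spec = None
        if spec is not None:
            return name
    return None
```

Calling `resolve_module()` returns the new name when it is available, falls back to the old one, and returns `None` when neither can be found. The caller can then append `.demo` or `.export` and pass the result to `importlib.import_module` or `runpy.run_module`.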

README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: unconditional-image-generation
8
 
9
  ---
10
 
11
- ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/stable_diffusion_v2_1_quantized/web-assets/model_demo.png)
12
 
13
  # Stable-Diffusion-v2.1: Optimized for Mobile Deployment
14
  ## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
@@ -21,7 +21,7 @@ This model is an implementation of Stable-Diffusion-v2.1 found [here](https://gi
21
 
22
  This repository provides scripts to run Stable-Diffusion-v2.1 on Qualcomm® devices.
23
  More details on model performance across various devices can be found
24
- [here](https://aihub.qualcomm.com/models/stable_diffusion_v2_1_quantized).
25
 
26
 
27
  ### Model Details
@@ -36,51 +36,51 @@ More details on model performance across various devices, can be found
36
 
37
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model |
38
  |---|---|---|---|---|---|---|---|---|
39
- | TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 15.87 ms | 0 - 9 MB | NPU | Use Export Script |
40
- | TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 6.665 ms | 0 - 3 MB | NPU | Use Export Script |
41
- | TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 6.814 ms | 0 - 9 MB | NPU | Use Export Script |
42
- | TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 15.87 ms | 0 - 9 MB | NPU | Use Export Script |
43
- | TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 6.881 ms | 0 - 2 MB | NPU | Use Export Script |
44
- | TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 6.673 ms | 0 - 2 MB | NPU | Use Export Script |
45
- | TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 6.814 ms | 0 - 9 MB | NPU | Use Export Script |
46
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 6.687 ms | 0 - 2 MB | NPU | Use Export Script |
47
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 6.911 ms | 0 - 387 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
48
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 4.673 ms | 0 - 18 MB | NPU | Use Export Script |
49
- | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 5.152 ms | 0 - 19 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
50
- | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 4.068 ms | 0 - 14 MB | NPU | Use Export Script |
51
- | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 4.645 ms | 0 - 17 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
52
- | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 6.825 ms | 0 - 0 MB | NPU | Use Export Script |
53
- | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 6.871 ms | 379 - 379 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
54
- | UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 241.356 ms | 0 - 8 MB | NPU | Use Export Script |
55
- | UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 97.392 ms | 0 - 2 MB | NPU | Use Export Script |
56
- | UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 92.092 ms | 0 - 8 MB | NPU | Use Export Script |
57
- | UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 241.356 ms | 0 - 8 MB | NPU | Use Export Script |
58
- | UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 97.131 ms | 0 - 3 MB | NPU | Use Export Script |
59
- | UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 96.898 ms | 0 - 2 MB | NPU | Use Export Script |
60
- | UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 92.092 ms | 0 - 8 MB | NPU | Use Export Script |
61
- | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 97.553 ms | 0 - 2 MB | NPU | Use Export Script |
62
- | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 98.826 ms | 0 - 899 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
63
- | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 68.634 ms | 0 - 18 MB | NPU | Use Export Script |
64
- | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 69.452 ms | 0 - 15 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
65
- | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 54.891 ms | 0 - 14 MB | NPU | Use Export Script |
66
- | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 55.714 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
67
- | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 98.95 ms | 0 - 0 MB | NPU | Use Export Script |
68
- | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 99.028 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
69
- | VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.854 ms | 1 - 10 MB | NPU | Use Export Script |
70
- | VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 277.796 ms | 0 - 3 MB | NPU | Use Export Script |
71
- | VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.265 ms | 0 - 12 MB | NPU | Use Export Script |
72
- | VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.854 ms | 1 - 10 MB | NPU | Use Export Script |
73
- | VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 266.863 ms | 0 - 2 MB | NPU | Use Export Script |
74
- | VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 267.2 ms | 0 - 2 MB | NPU | Use Export Script |
75
- | VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.265 ms | 0 - 12 MB | NPU | Use Export Script |
76
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 273.257 ms | 0 - 2 MB | NPU | Use Export Script |
77
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 274.053 ms | 0 - 68 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
78
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 204.145 ms | 0 - 18 MB | NPU | Use Export Script |
79
- | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 207.419 ms | 3 - 22 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
80
- | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 192.667 ms | 0 - 15 MB | NPU | Use Export Script |
81
- | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 188.928 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
82
- | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.015 ms | 0 - 0 MB | NPU | Use Export Script |
83
- | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 266.931 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
84
 
85
 
86
 
@@ -90,7 +90,7 @@ More details on model performance across various devices, can be found
90
 
91
  Install the package via pip:
92
  ```bash
93
- pip install "qai-hub-models[stable-diffusion-v2-1-quantized]"
94
  ```
95
 
96
 
@@ -114,7 +114,7 @@ The package contains a simple end-to-end demo that downloads pre-trained
114
  weights and runs this model on a sample input.
115
 
116
  ```bash
117
- python -m qai_hub_models.models.stable_diffusion_v2_1_quantized.demo
118
  ```
119
 
120
  The above demo runs a reference implementation of pre-processing, model
@@ -123,7 +123,7 @@ inference, and post processing.
123
  **NOTE**: If you are running in a Jupyter Notebook or Google Colab-like
124
  environment, please add the following to your cell (instead of the above).
125
  ```
126
- %run -m qai_hub_models.models.stable_diffusion_v2_1_quantized.demo
127
  ```
128
 
129
 
@@ -136,7 +136,7 @@ device. This script does the following:
136
  * Accuracy check between PyTorch and on-device outputs.
137
 
138
  ```bash
139
- python -m qai_hub_models.models.stable_diffusion_v2_1_quantized.export
140
  ```
141
  ```
142
  Profiling Results
@@ -162,8 +162,8 @@ Compute Unit(s) : npu (5783 ops) gpu (0 ops) cpu (0 ops)
162
  VaeDecoderQuantizable
163
  Device : cs_8275 (ANDROID 14)
164
  Runtime : QNN
165
- Estimated inference time (ms) : 720.9
166
- Estimated peak memory usage (MB): [1, 10]
167
  Total # Ops : 189
168
  Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
169
  ```
@@ -187,7 +187,7 @@ provides instructions on how to use the `.so` shared library in an Android appl
187
 
188
 
189
  ## View on Qualcomm® AI Hub
190
- Get more details on Stable-Diffusion-v2.1's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v2_1_quantized).
191
  Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
192
 
193
 
 
8
 
9
  ---
10
 
11
+ ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/stable_diffusion_v2_1/web-assets/model_demo.png)
12
 
13
  # Stable-Diffusion-v2.1: Optimized for Mobile Deployment
14
  ## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
 
21
 
22
  This repository provides scripts to run Stable-Diffusion-v2.1 on Qualcomm® devices.
23
  More details on model performance across various devices can be found
24
+ [here](https://aihub.qualcomm.com/models/stable_diffusion_v2_1).
25
 
26
 
27
  ### Model Details
 
36
 
37
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model |
38
  |---|---|---|---|---|---|---|---|---|
39
+ | TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 15.888 ms | 0 - 9 MB | NPU | Use Export Script |
40
+ | TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 6.611 ms | 0 - 2 MB | NPU | Use Export Script |
41
+ | TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 6.807 ms | 0 - 9 MB | NPU | Use Export Script |
42
+ | TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 15.888 ms | 0 - 9 MB | NPU | Use Export Script |
43
+ | TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 6.599 ms | 0 - 3 MB | NPU | Use Export Script |
44
+ | TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 6.635 ms | 0 - 3 MB | NPU | Use Export Script |
45
+ | TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 6.807 ms | 0 - 9 MB | NPU | Use Export Script |
46
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 6.618 ms | 0 - 3 MB | NPU | Use Export Script |
47
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 6.887 ms | 0 - 390 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
48
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 4.831 ms | 0 - 18 MB | NPU | Use Export Script |
49
+ | TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 4.857 ms | 0 - 20 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
50
+ | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 4.185 ms | 0 - 14 MB | NPU | Use Export Script |
51
+ | TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 4.657 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
52
+ | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 6.862 ms | 0 - 0 MB | NPU | Use Export Script |
53
+ | TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 6.914 ms | 378 - 378 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
54
+ | UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 241.358 ms | 0 - 8 MB | NPU | Use Export Script |
55
+ | UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 97.251 ms | 0 - 2 MB | NPU | Use Export Script |
56
+ | UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 92.063 ms | 0 - 8 MB | NPU | Use Export Script |
57
+ | UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 241.358 ms | 0 - 8 MB | NPU | Use Export Script |
58
+ | UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 98.29 ms | 0 - 2 MB | NPU | Use Export Script |
59
+ | UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 97.645 ms | 0 - 2 MB | NPU | Use Export Script |
60
+ | UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 92.063 ms | 0 - 8 MB | NPU | Use Export Script |
61
+ | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 97.33 ms | 0 - 3 MB | NPU | Use Export Script |
62
+ | UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 95.035 ms | 0 - 898 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
63
+ | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 68.883 ms | 0 - 19 MB | NPU | Use Export Script |
64
+ | UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 68.426 ms | 0 - 20 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
65
+ | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 54.715 ms | 0 - 14 MB | NPU | Use Export Script |
66
+ | UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 52.685 ms | 0 - 19 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
67
+ | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 98.906 ms | 0 - 0 MB | NPU | Use Export Script |
68
+ | UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 96.026 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
69
+ | VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.584 ms | 0 - 9 MB | NPU | Use Export Script |
70
+ | VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 268.909 ms | 0 - 3 MB | NPU | Use Export Script |
71
+ | VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.346 ms | 0 - 13 MB | NPU | Use Export Script |
72
+ | VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.584 ms | 0 - 9 MB | NPU | Use Export Script |
73
+ | VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 271.774 ms | 0 - 2 MB | NPU | Use Export Script |
74
+ | VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 269.046 ms | 0 - 2 MB | NPU | Use Export Script |
75
+ | VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.346 ms | 0 - 13 MB | NPU | Use Export Script |
76
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 278.455 ms | 0 - 2 MB | NPU | Use Export Script |
77
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 268.091 ms | 0 - 67 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
78
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 207.671 ms | 0 - 18 MB | NPU | Use Export Script |
79
+ | VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 203.759 ms | 3 - 23 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
80
+ | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 193.089 ms | 0 - 15 MB | NPU | Use Export Script |
81
+ | VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 175.174 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
82
+ | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.223 ms | 0 - 0 MB | NPU | Use Export Script |
83
+ | VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 264.72 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v2.1.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/blob/main/Stable-Diffusion-v2.1_w8a16.onnx) |
84
 
85
 
86
 
 
90
 
91
  Install the package via pip:
92
  ```bash
93
+ pip install "qai-hub-models[stable-diffusion-v2-1]"
94
  ```
95
 
96
 
 
114
  weights and runs this model on a sample input.
115
 
116
  ```bash
117
+ python -m qai_hub_models.models.stable_diffusion_v2_1.demo
118
  ```
119
 
120
  The above demo runs a reference implementation of pre-processing, model
 
123
  **NOTE**: If you are running in a Jupyter Notebook or Google Colab-like
124
  environment, please add the following to your cell (instead of the above).
125
  ```
126
+ %run -m qai_hub_models.models.stable_diffusion_v2_1.demo
127
  ```
128
 
129
 
 
136
  * Accuracy check between PyTorch and on-device outputs.
137
 
138
  ```bash
139
+ python -m qai_hub_models.models.stable_diffusion_v2_1.export
140
  ```
141
  ```
142
  Profiling Results
 
162
  VaeDecoderQuantizable
163
  Device : cs_8275 (ANDROID 14)
164
  Runtime : QNN
165
+ Estimated inference time (ms) : 720.6
166
+ Estimated peak memory usage (MB): [0, 9]
167
  Total # Ops : 189
168
  Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
169
  ```
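
The profiling summary shown above is plain `key : value` text. As a rough sketch, assuming the export script keeps printing that layout, one block can be turned into a dictionary for logging or comparison across runs. The function name `parse_profile_block` is hypothetical, and the sample text is copied from the new side of this diff:

```python
def parse_profile_block(text):
    """Parse `key : value` lines from one profiling block into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a separator, such as the model name
            fields[key.strip()] = value.strip()
    return fields


sample = """\
VaeDecoderQuantizable
Device : cs_8275 (ANDROID 14)
Runtime : QNN
Estimated inference time (ms) : 720.6
Estimated peak memory usage (MB): [0, 9]
Total # Ops : 189
Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
"""

fields = parse_profile_block(sample)
latency_ms = float(fields["Estimated inference time (ms)"])
print(fields["Runtime"], latency_ms)
```

Values are kept as strings and converted only where needed, since the exact field set may differ between releases.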
 
187
 
188
 
189
  ## View on Qualcomm® AI Hub
190
+ Get more details on Stable-Diffusion-v2.1's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v2_1).
191
  Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
192
 
193
 
TextEncoderQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:1d8ac53e1a7da5926376ba87cbb95eb06abe4f49ecc10ec0d11c8c9496735367
3
- size 395956256
 
 
 
 
TextEncoderQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:ced2b2596f2e1b33494543ed66be6ca43f375e19a929e66083556f164ab63b9f
3
- size 395958872
 
 
 
 
TextEncoderQuantizable_w8a16.onnx.zip DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:0f6a789b94bc3c5669b497a4133afed12a296b7f9c55dc44aaad3131b4b0fcf5
3
- size 299348590
 
 
 
 
TextEncoder_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:1dfb9c7386b34ab99bf07d1a5ab4bf5182bfbc522107b0e309c510011996488b
3
- size 396149600
 
 
 
 
UNet_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:b28ded888d500930c9f68e6a7c3b7081a82820221a7cbd9b34d68044745daa05
3
- size 878546608
 
 
 
 
UnetQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:c4a43f307213522f02884bd682ab31fff057203c446fbcaab9343f8a0382ceb5
3
- size 879370456
 
 
 
 
UnetQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:527acfde5491002b1ed7d453ba18bbe742207b54701e517efc5ba7926171e635
3
- size 881393200
 
 
 
 
VAEDecoder_Quantized.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:64a9385e48fcfc36ea21e03015291562b292acddb79c9b04798ddb7dcc7ecfed
3
- size 59518976
 
 
 
 
VaeDecoderQuantizable.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:cf970014ef1a35e8b40dfe331bbe9833c15eab47579b96a24601ccf40d1b5f05
3
- size 64693320
 
 
 
 
VaeDecoderQuantizable.so DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:f09d36631935bd84e334f342e091886feaa9c26a193758e74ff18647f9bc4986
3
- size 50386176
 
 
 
 
VaeDecoderQuantizable_w8a16.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:ce21e93870788062a5f12ee568b36e15e0dacb6bf4d3477fa1b2a7c71231badf
3
- size 64701512
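
Each deleted file above is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch for reading one pointer follows. The function name `parse_lfs_pointer` is hypothetical, and the input is the pointer text without the leading `- ` diff markers:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


# Pointer for VaeDecoderQuantizable_w8a16.bin, copied from this diff.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ce21e93870788062a5f12ee568b36e15e0dacb6bf4d3477fa1b2a7c71231badf
size 64701512
"""

info = parse_lfs_pointer(pointer)
print(f"{info['size'] / 1e6:.1f} MB")
```

Summing the `size` fields over all pointers in the diff gives the total amount of LFS data this commit removes.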