---
license: other
license_name: sla0044
license_link: >-
  https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/LICENSE.md
---
# Fd-MobileNet

## **Use case** : `Image classification`

# Model description
Fd-MobileNet stands for Fast-downsampling MobileNet. It was initially introduced in this [paper](https://arxiv.org/pdf/1802.03750.pdf).
This family of networks, inspired by MobileNet, provides good accuracy on various image classification tasks for very limited computational budgets.
It is therefore an interesting solution for deep learning at the edge.
As stated by the authors, the key idea is to apply a fast downsampling strategy to the MobileNet framework, using only half the layers of the original MobileNet. This design remarkably reduces the computational cost as well as the inference time.

The hyperparameter 'alpha', also called the width multiplier, controls the width of the network: it proportionally adjusts the width of each layer.
Authorized values for 'alpha' are 0.25, 0.5, 0.75 and 1.0.
The model is quantized to int8 using the TensorFlow Lite converter.

The performance of an ST custom model derived from Fd-MobileNet, named ST FdMobileNet v1, is also reported below.
It is inspired by the original FdMobileNet. Instead of a single 'alpha' value dimensioning the width of the whole network, we
use a list of 'alpha' values to give more or less weight to each individual sub-block, as illustrated in the sketch below.
It is slightly more complex than FdMobileNet 0.25 because some sub-blocks have a higher number of channels, but it provides
better accuracy. We believe it is a good compromise between size, complexity and accuracy for this family of networks.

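To illustrate how 'alpha' scales the network width, here is a minimal Keras sketch of a MobileNet-style depthwise-separable block with a width multiplier, and of a per-block 'alpha' list as used in the ST variant. The block layout, filter counts and `alphas` values below are illustrative assumptions only; the actual FdMobileNet and ST FdMobileNet v1 definitions live in the stm32ai-modelzoo training scripts.

```python
# Illustrative sketch only: a depthwise-separable block whose width is scaled
# by an 'alpha' multiplier, assembled with a hypothetical per-block alpha list.
import tensorflow as tf
from tensorflow.keras import layers

def separable_block(x, filters, alpha, stride=1):
    """Depthwise 3x3 followed by pointwise 1x1 convolution, width scaled by alpha."""
    x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(int(filters * alpha), 1, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

# Classic FdMobileNet applies a single 'alpha' (e.g. 0.25) to every block;
# ST FdMobileNet v1 instead assigns one 'alpha' value per sub-block.
alphas  = [0.25, 0.25, 0.5, 0.5, 0.5]   # illustrative per-block widths
filters = [64, 128, 256, 512, 512]      # illustrative base filter counts
strides = [2, 2, 2, 2, 1]

inputs = tf.keras.Input((224, 224, 3))
x = layers.Conv2D(int(32 * alphas[0]), 3, strides=2, padding="same", use_bias=False)(inputs)
x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)
for a, f, s in zip(alphas, filters, strides):
    x = separable_block(x, f, a, stride=s)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(5, activation="softmax")(x)  # e.g. 5 classes for the Flowers dataset
model = tf.keras.Model(inputs, outputs)
model.summary()
```
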
## Network information
| Network Information | Value |
|-------------------------|--------------------------------------|
| Framework | TensorFlow Lite |
| Params alpha=0.25 | 125477 |
| Quantization | int8 |
| Paper | https://arxiv.org/pdf/1802.03750.pdf |

The models are quantized using the TensorFlow Lite converter.

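The pre-trained `.tflite` files referenced below are already quantized. For reference, a minimal post-training int8 quantization pass with the TensorFlow Lite converter is sketched here; the model path and the random calibration images are placeholders (in practice a few hundred real training images are used), and the exact options used to produce the model zoo files may differ.

```python
# Sketch of post-training int8 quantization with the TensorFlow Lite converter.
# "float_fdmobilenet.h5" and the random calibration images are placeholders.
import numpy as np
import tensorflow as tf

float_model = tf.keras.models.load_model("float_fdmobilenet.h5")   # hypothetical path
calibration_images = np.random.randint(0, 256, (100, 224, 224, 3)).astype(np.float32)

def representative_data_gen():
    # Yield one sample at a time so the converter can calibrate activation ranges.
    for image in calibration_images:
        yield [image[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_keras_model(float_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8    # matches the UINT8 input described below
converter.inference_output_type = tf.uint8

with open("fdmobilenet_0.25_224_tfs_int8.tflite", "wb") as f:
    f.write(converter.convert())
```
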
## Network inputs / outputs
For an image resolution of NxM, P classes and an alpha of 0.25:

| Input Shape | Description |
|---------------|----------------------------------------------------------|
| (1, N, M, 3) | Single NxM RGB image with UINT8 values between 0 and 255 |

| Output Shape | Description |
|---------------|----------------------------------------------------------|
| (1, P) | Per-class confidence for P classes |

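As a quick sanity check of these shapes on a host machine, the quantized model can be run with the TensorFlow Lite interpreter. The sketch below assumes a local copy of one of the pre-trained `.tflite` files and uses a random image as a stand-in for real preprocessing.

```python
# Sketch: run one image through the int8 .tflite model with the TFLite
# interpreter to check the (1, N, M, 3) uint8 input and (1, P) output shapes.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="fdmobilenet_0.25_224_tfs_int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
print(inp["shape"], inp["dtype"])   # e.g. [1 224 224 3] uint8
print(out["shape"])                 # e.g. [1 P] for P classes

# Any NxM RGB image resized to the model resolution, values in 0..255.
image = np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8)
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
scores = interpreter.get_tensor(out["index"])[0]

# Dequantize the per-class scores before picking the top-1 class.
scale, zero_point = out["quantization"]
probs = scale * (scores.astype(np.float32) - zero_point)
print("top-1 class:", int(np.argmax(probs)))
```
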
## Recommended platform
| Platform | Supported | Recommended |
|----------|-----------|-------------|
| STM32L0  | []        | []          |
| STM32L4  | [x]       | []          |
| STM32U5  | [x]       | []          |
| STM32H7  | [x]       | [x]         |
| STM32MP1 | [x]       | [x]         |
| STM32MP2 | [x]       | []          |
| STM32N6  | [x]       | []          |

---
# Performances

## Metrics
Measurements are done with the default STM32Cube.AI configuration, with the input / output allocated option enabled.

### Reference **NPU** memory footprint on food-101 dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
|----------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_224_tfs/fdmobilenet_0.25_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6 | 294 | 0.0 | 209.92 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_224_tfs/st_fdmobilenet_v1_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6 | 294 | 0.0 | 236.49 | 10.0.0 | 2.0.0 |
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_128_tfs/fdmobilenet_0.25_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6 | 96 | 0.0 | 209.92 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_128_tfs/st_fdmobilenet_v1_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6 | 96 | 0.0 | 236.49 | 10.0.0 | 2.0.0 |


### Reference **NPU** inference time on food-101 dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
|--------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_224_tfs/fdmobilenet_0.25_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 1.46 | 684.93 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_224_tfs/st_fdmobilenet_v1_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 1.81 | 552.49 | 10.0.0 | 2.0.0 |
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_128_tfs/fdmobilenet_0.25_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6570-DK | NPU/MCU | 0.93 | 1075.27 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_128_tfs/st_fdmobilenet_v1_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6570-DK | NPU/MCU | 1.07 | 934.58 | 10.0.0 | 2.0.0 |

### Reference **MCU** memory footprints based on Flowers dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Series | Activation RAM | Runtime RAM | Weights Flash | Code Flash | Total RAM | Total Flash | STM32Cube.AI version |
|-----------------------|--------|--------------|---------|----------------|-------------|---------------|------------|------------|-------------|----------------------|
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | STM32H7 | 157.03 KiB | 14.25 KiB | 128.32 KiB | 58.66 KiB | 171.28 KiB | 186.98 KiB | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | STM32H7 | 211.64 KiB | 14.25 KiB | 144.93 KiB | 60.17 KiB | 225.89 KiB | 205.1 KiB | 10.0.0 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | STM32H7 | 56.16 KiB | 14.2 KiB | 128.32 KiB | 58.16 KiB | 70.36 KiB | 186.95 KiB | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | STM32H7 | 74.23 KiB | 14.2 KiB | 144.93 KiB | 60.12 KiB | 88.43 KiB | 205.05 KiB | 10.0.0 |


### Reference **MCU** inference time based on Flowers dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
|-----------------------|--------|--------------|------------------|------------------|---------------|---------------------|----------------------|
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 53.52 ms | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 102 ms | 10.0.0 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 17.73 ms | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 32.14 ms | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | STM32F769I-DISCO | 1 CPU | 216 MHz | 176.5 ms | 10.0.0 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | STM32F769I-DISCO | 1 CPU | 216 MHz | 59.29 ms | 10.0.0 |

### Reference **MPU** inference time based on Flowers dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Quantization | Board | Execution Engine | Frequency | Inference time (ms) | %NPU | %GPU | %CPU | X-LINUX-AI version | Framework |
|-----------------------|--------|------------|---------------|-------------------|------------------|-----------|---------------------|-------|-------|------|--------------------|-----------------------|
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 6.60 ms | 12.28 | 87.72 | 0 | v5.1.0 | OpenVX |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 7.84 ms | 10.82 | 89.19 | 0 | v5.1.0 | OpenVX |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 2.17 ms | 15.66 | 84.34 | 0 | v5.1.0 | OpenVX |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz | 2.85 ms | 12.75 | 87.25 | 0 | v5.1.0 | OpenVX |
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 22.76 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 33.93 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 8.08 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | per-channel | STM32MP157F-DK2 | 2 CPU | 800 MHz | 13.16 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 33.50 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 61.00 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 10.86 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | per-channel | STM32MP135F-DK2 | 1 CPU | 1000 MHz | 19.43 ms | NA | NA | 100 | v5.1.0 | TensorFlowLite 2.11.0 |

** **To get the most out of the MP25 NPU hardware acceleration, please use per-tensor quantization.**

### Accuracy with Flowers dataset
Dataset details: http://download.tensorflow.org/example_images/flower_photos.tgz, License CC BY 2.0
Number of classes: 5, number of files: 3670

| Model | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|--------------|----------------------|
| FdMobileNet 0.25 tfs | Float | 224x224x3 | 86.92 |
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | 87.06 |
| ST FdMobileNet v1 tfs | Float | 224x224x3 | 89.51 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | 88.83 |
| FdMobileNet 0.25 tfs | Float | 128x128x3 | 84.6 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | 84.2 |
| ST FdMobileNet v1 tfs | Float | 128x128x3 | 87.87 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | 87.6 |

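The figures above are produced by the model zoo evaluation scripts. The sketch below is a simplified, stand-alone top-1 check of a quantized model on a folder-per-class image directory such as the extracted flower_photos archive; the model path, dataset path, resolution and preprocessing are assumptions, so it may not reproduce the exact numbers.

```python
# Rough top-1 accuracy check of an int8 .tflite classifier on a directory
# organised as one sub-folder per class (e.g. the extracted flower_photos/).
# Paths, resolution and batch handling are simplified placeholders.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="st_fdmobilenet_v1_224_tfs_int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

ds = tf.keras.utils.image_dataset_from_directory(
    "flower_photos", image_size=(224, 224), batch_size=1, shuffle=False)

correct = total = 0
for images, labels in ds:
    x = tf.cast(images, tf.uint8).numpy()          # model expects uint8 values in 0..255
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    pred = int(np.argmax(interpreter.get_tensor(out["index"])[0]))
    correct += int(pred == int(labels[0]))
    total += 1

print(f"top-1 accuracy: {100.0 * correct / total:.2f}% on {total} images")
```
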
### Accuracy with Plant dataset
Dataset details: https://data.mendeley.com/datasets/tywbtsjrjv/1, License CC0 1.0
Number of classes: 39, number of files: 55448

| Model | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|--------------|----------------------|
| FdMobileNet 0.25 tfs | Float | 224x224x3 | 99.9 |
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | 99.8 |
| ST FdMobileNet v1 tfs | Float | 224x224x3 | 99.59 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | 99.4 |
| FdMobileNet 0.25 tfs | Float | 128x128x3 | 99.05 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | 98.55 |
| ST FdMobileNet v1 tfs | Float | 128x128x3 | 99.58 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | 99.8 |


### Accuracy with Food-101 dataset
Dataset details: https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/
Number of classes: 101, number of files: 101000

| Model | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|--------------|----------------------|
| FdMobileNet 0.25 tfs | Float | 224x224x3 | 60.41 |
| FdMobileNet 0.25 tfs | Int8 | 224x224x3 | 58.78 |
| ST FdMobileNet v1 tfs | Float | 224x224x3 | 66.19 |
| ST FdMobileNet v1 tfs | Int8 | 224x224x3 | 64.71 |
| FdMobileNet 0.25 tfs | Float | 128x128x3 | 45.54 |
| FdMobileNet 0.25 tfs | Int8 | 128x128x3 | 44.86 |
| ST FdMobileNet v1 tfs | Float | 128x128x3 | 54.19 |
| ST FdMobileNet v1 tfs | Int8 | 128x128x3 | 53.74 |

## Retraining and integration in a simple example

Please refer to the stm32ai-modelzoo-services GitHub repository [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services).


# References

<a id="1">[1]</a>
"Tf_flowers: TensorFlow Datasets," TensorFlow. [Online]. Available: https://www.tensorflow.org/datasets/catalog/tf_flowers.

<a id="2">[2]</a>
J. Arun Pandian and G. Geetharamani, "Data for: Identification of Plant Leaf Diseases Using a 9-layer Deep Convolutional Neural Network," Mendeley Data, V1, 2019, doi: 10.17632/tywbtsjrjv.1.

<a id="3">[3]</a>
L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 -- Mining Discriminative Components with Random Forests," European Conference on Computer Vision, 2014.