FBAGSTM committed on
Commit 01ba80d · verified · 1 Parent(s): 79187e8

Update README.md

Files changed (1): README.md (+186 -6)
---
license: other
license_name: sla0044
license_link: >-
  https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/LICENSE.md
---
# Fd-MobileNet

## **Use case** : `Image classification`

# Model description
Fd-MobileNet stands for Fast-downsampling MobileNet. It was initially introduced in this [paper](https://arxiv.org/pdf/1802.03750.pdf).
This family of networks, inspired by MobileNet, provides good accuracy on various image classification tasks for very limited computational budgets.
This makes it an interesting solution for deep learning at the edge.
As stated by the authors, the key idea is to apply a fast-downsampling strategy to the MobileNet framework, keeping only half the layers of the original MobileNet. This design remarkably reduces the computational cost as well as the inference time.

The hyperparameter 'alpha', also called the width multiplier, controls the width of the network: it proportionally adjusts the width of each layer.
Allowed values for 'alpha' are 0.25, 0.5, 0.75 and 1.0.
The model is quantized to int8 using the TensorFlow Lite converter.

The performance of an ST custom model derived from Fd-MobileNet, named ST FdMobileNet v1, is also reported below.
It is inspired by the original Fd-MobileNet. Instead of a single 'alpha' value dimensioning the width of the whole network, we
use a list of 'alpha' values in order to give more or less importance to each individual sub-block.
It is slightly more complex than FdMobileNet 0.25 due to the higher number of channels in some sub-blocks, but it provides
better accuracy. We believe it is a good compromise between size, complexity and accuracy for this family of networks.

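To illustrate the role of the width multiplier, below is a minimal Keras sketch of how 'alpha' scales layer widths, either as a single value or as one value per sub-block as in ST FdMobileNet v1. The block layout, base filter counts and function names are simplifying assumptions made for illustration only; they do not reproduce the exact Fd-MobileNet or ST FdMobileNet v1 topology.

```python
# Illustrative sketch only: shows how a width multiplier scales layer widths.
# The block layout and base filter counts are assumptions, not the real architecture.
import tensorflow as tf
from tensorflow.keras import layers

def separable_block(x, base_filters, alpha, stride=1):
    """Depthwise-separable convolution block whose width is scaled by 'alpha'."""
    filters = max(8, int(base_filters * alpha))  # width multiplier applied here
    x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, 1, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def tiny_fd_like_model(input_shape=(128, 128, 3), num_classes=5,
                       alphas=(0.25, 0.25, 0.25, 0.25)):
    """Toy fast-downsampling backbone: strided blocks early, one 'alpha' per block."""
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(max(8, int(32 * alphas[0])), 3, strides=2,
                      padding="same", use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    for base, a in zip((64, 128, 256), alphas[1:]):
        x = separable_block(x, base, a, stride=2)  # aggressive (fast) downsampling
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```
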
## Network information
| Network information  | Value                                |
|----------------------|--------------------------------------|
| Framework            | TensorFlow Lite                      |
| Params (alpha=0.25)  | 125477                               |
| Quantization         | int8                                 |
| Paper                | https://arxiv.org/pdf/1802.03750.pdf |

The models are quantized using the TensorFlow Lite converter, as sketched below.

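The following is a minimal sketch of an int8 post-training quantization flow with the TensorFlow Lite converter. The `keras_model` and `representative_images` names are placeholders; the STM32 model zoo ships its own quantization scripts, so this only illustrates the general procedure.

```python
# Minimal post-training int8 quantization sketch with the TensorFlow Lite converter.
# 'keras_model' and 'representative_images' are placeholders for a trained model
# and a small calibration set of preprocessed images.
import tensorflow as tf

def representative_dataset():
    for image in representative_images:             # e.g. a NumPy array of shape (K, H, W, 3)
        yield [image[None, ...].astype("float32")]  # the converter expects a list of input tensors

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # matches the UINT8 input described below
converter.inference_output_type = tf.uint8

tflite_int8 = converter.convert()
with open("fdmobilenet_int8.tflite", "wb") as f:
    f.write(tflite_int8)
```
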
## Network inputs / outputs
For an image resolution of NxM, P classes and an 'alpha' of 0.25:

| Input Shape  | Description                                               |
|--------------|-----------------------------------------------------------|
| (1, N, M, 3) | Single NxM RGB image with UINT8 values between 0 and 255  |

| Output Shape | Description                        |
|--------------|------------------------------------|
| (1, P)       | Per-class confidence for P classes |

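The sketch below shows how these tensors map onto the TensorFlow Lite interpreter at runtime: a (1, N, M, 3) UINT8 input and a (1, P) output of per-class confidences. The model path and the random input image are placeholders.

```python
# Run a quantized Fd-MobileNet .tflite model with the TensorFlow Lite interpreter.
# The model path and the input image below are placeholders.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="fdmobilenet_0.25_128_tfs_int8.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]    # shape (1, N, M, 3), dtype uint8
output_details = interpreter.get_output_details()[0]  # shape (1, P)

_, height, width, _ = input_details["shape"]
image = np.random.randint(0, 256, size=(1, height, width, 3), dtype=np.uint8)  # stand-in for a real RGB image

interpreter.set_tensor(input_details["index"], image)
interpreter.invoke()

scores = interpreter.get_tensor(output_details["index"])[0]  # per-class confidences, length P
print("Predicted class:", int(np.argmax(scores)))
```
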
## Recommended platform
| Platform | Supported | Recommended |
|----------|-----------|-------------|
| STM32L0  | []        | []          |
| STM32L4  | [x]       | []          |
| STM32U5  | [x]       | []          |
| STM32H7  | [x]       | [x]         |
| STM32MP1 | [x]       | [x]         |
| STM32MP2 | [x]       | []          |
| STM32N6  | [x]       | []          |

---
# Performances

## Metrics
Measurements are done with the default STM32Cube.AI configuration, with the input / output allocated option enabled.


### Reference **NPU** memory footprint on food-101 dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
|-------|--------|------------|--------|--------------------|--------------------|---------------------|----------------------|-----------------------|
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_224_tfs/fdmobilenet_0.25_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6 | 294 | 0.0 | 209.92 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_224_tfs/st_fdmobilenet_v1_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6 | 294 | 0.0 | 236.49 | 10.0.0 | 2.0.0 |
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_128_tfs/fdmobilenet_0.25_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6 | 96 | 0.0 | 209.92 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_128_tfs/st_fdmobilenet_v1_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6 | 96 | 0.0 | 236.49 | 10.0.0 | 2.0.0 |

### Reference **NPU** inference time on food-101 dataset (see Accuracy for details on dataset)
| Model | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
|-------|--------|------------|-------|------------------|---------------------|-----------|----------------------|-----------------------|
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_224_tfs/fdmobilenet_0.25_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 1.46 | 684.93 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_224_tfs/st_fdmobilenet_v1_224_tfs_int8.tflite) | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 1.81 | 552.49 | 10.0.0 | 2.0.0 |
| [FdMobileNet 0.25 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/fdmobilenet_0.25_128_tfs/fdmobilenet_0.25_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6570-DK | NPU/MCU | 0.93 | 1075.27 | 10.0.0 | 2.0.0 |
| [ST FdMobileNet v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/image_classification/fdmobilenet/ST_pretrainedmodel_public_dataset/food-101/st_fdmobilenet_v1_128_tfs/st_fdmobilenet_v1_128_tfs_int8.tflite) | Int8 | 128x128x3 | STM32N6570-DK | NPU/MCU | 1.07 | 934.58 | 10.0.0 | 2.0.0 |

### Reference **MCU** memory footprints based on Flowers dataset (see Accuracy for details on dataset)
| Model                 | Format | Resolution | Series  | Activation RAM | Runtime RAM | Weights Flash | Code Flash | Total RAM  | Total Flash | STM32Cube.AI version |
|-----------------------|--------|------------|---------|----------------|-------------|---------------|------------|------------|-------------|----------------------|
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | STM32H7 | 157.03 KiB     | 14.25 KiB   | 128.32 KiB    | 58.66 KiB  | 171.28 KiB | 186.98 KiB  | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | STM32H7 | 211.64 KiB     | 14.25 KiB   | 144.93 KiB    | 60.17 KiB  | 225.89 KiB | 205.1 KiB   | 10.0.0               |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | STM32H7 | 56.16 KiB      | 14.2 KiB    | 128.32 KiB    | 58.16 KiB  | 70.36 KiB  | 186.95 KiB  | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | STM32H7 | 74.23 KiB      | 14.2 KiB    | 144.93 KiB    | 60.12 KiB  | 88.43 KiB  | 205.05 KiB  | 10.0.0               |

### Reference **MCU** inference time based on Flowers dataset (see Accuracy for details on dataset)
| Model                 | Format | Resolution | Board            | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
|-----------------------|--------|------------|------------------|------------------|-----------|---------------------|----------------------|
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | STM32H747I-DISCO | 1 CPU            | 400 MHz   | 53.52 ms            | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | STM32H747I-DISCO | 1 CPU            | 400 MHz   | 102 ms              | 10.0.0               |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | STM32H747I-DISCO | 1 CPU            | 400 MHz   | 17.73 ms            | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | STM32H747I-DISCO | 1 CPU            | 400 MHz   | 32.14 ms            | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | STM32F769I-DISCO | 1 CPU            | 216 MHz   | 176.5 ms            | 10.0.0               |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | STM32F769I-DISCO | 1 CPU            | 216 MHz   | 59.29 ms            | 10.0.0               |

### Reference **MPU** inference time based on Flowers dataset (see Accuracy for details on dataset)
| Model                 | Format | Resolution | Quantization  | Board           | Execution Engine | Frequency | Inference time (ms) | %NPU  | %GPU  | %CPU | X-LINUX-AI version | Framework             |
|-----------------------|--------|------------|---------------|-----------------|------------------|-----------|---------------------|-------|-------|------|--------------------|-----------------------|
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | per-channel** | STM32MP257F-DK2 | NPU/GPU          | 800 MHz   | 6.60 ms             | 12.28 | 87.72 | 0    | v5.1.0             | OpenVX                |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | per-channel** | STM32MP257F-DK2 | NPU/GPU          | 800 MHz   | 7.84 ms             | 10.82 | 89.19 | 0    | v5.1.0             | OpenVX                |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | per-channel** | STM32MP257F-DK2 | NPU/GPU          | 800 MHz   | 2.17 ms             | 15.66 | 84.34 | 0    | v5.1.0             | OpenVX                |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | per-channel** | STM32MP257F-DK2 | NPU/GPU          | 800 MHz   | 2.85 ms             | 12.75 | 87.25 | 0    | v5.1.0             | OpenVX                |
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | per-channel   | STM32MP157F-DK2 | 2 CPU            | 800 MHz   | 22.76 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | per-channel   | STM32MP157F-DK2 | 2 CPU            | 800 MHz   | 33.93 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | per-channel   | STM32MP157F-DK2 | 2 CPU            | 800 MHz   | 8.08 ms             | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | per-channel   | STM32MP157F-DK2 | 2 CPU            | 800 MHz   | 13.16 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | per-channel   | STM32MP135F-DK2 | 1 CPU            | 1000 MHz  | 33.50 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | per-channel   | STM32MP135F-DK2 | 1 CPU            | 1000 MHz  | 61.00 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | per-channel   | STM32MP135F-DK2 | 1 CPU            | 1000 MHz  | 10.86 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | per-channel   | STM32MP135F-DK2 | 1 CPU            | 1000 MHz  | 19.43 ms            | NA    | NA    | 100  | v5.1.0             | TensorFlowLite 2.11.0 |

** **To get the most out of MP25 NPU hardware acceleration, please use per-tensor quantization**

### Accuracy with Flowers dataset
Dataset details: http://download.tensorflow.org/example_images/flower_photos.tgz, License CC BY 2.0 [[1]](#1)
Number of classes: 5, number of files: 3670

| Model                 | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|------------|--------------------|
| FdMobileNet 0.25 tfs  | Float  | 224x224x3  | 86.92              |
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | 87.06              |
| ST FdMobileNet v1 tfs | Float  | 224x224x3  | 89.51              |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | 88.83              |
| FdMobileNet 0.25 tfs  | Float  | 128x128x3  | 84.6               |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | 84.2               |
| ST FdMobileNet v1 tfs | Float  | 128x128x3  | 87.87              |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | 87.6               |

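For reference, a top-1 accuracy figure such as those reported in these tables can be reproduced with a simple evaluation loop over the quantized model. The sketch below is a generic example assuming a directory-per-class image folder (as obtained by extracting the Flowers archive above); the exact preprocessing used by the model zoo evaluation scripts may differ.

```python
# Generic top-1 accuracy evaluation sketch for a quantized .tflite classifier.
# Assumes a directory-per-class image folder; preprocessing must match what
# was used for training and quantization in the model zoo.
import numpy as np
import tensorflow as tf

def evaluate_top1(tflite_path, data_dir, image_size=(128, 128)):
    ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, image_size=image_size, batch_size=1, shuffle=False)
    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    correct = total = 0
    for images, labels in ds:
        x = tf.cast(images, tf.uint8).numpy()  # the quantized model expects uint8 pixels
        interpreter.set_tensor(inp["index"], x)
        interpreter.invoke()
        pred = int(np.argmax(interpreter.get_tensor(out["index"])[0]))
        correct += int(pred == int(labels[0]))
        total += 1
    return correct / total

# Example usage (paths are placeholders):
# acc = evaluate_top1("fdmobilenet_0.25_128_tfs_int8.tflite", "flower_photos")
# print(f"Top-1 accuracy: {acc:.2%}")
```
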
### Accuracy with Plant dataset
Dataset details: https://data.mendeley.com/datasets/tywbtsjrjv/1, License CC0 1.0 [[2]](#2)
Number of classes: 39, number of files: 55448

| Model                 | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|------------|--------------------|
| FdMobileNet 0.25 tfs  | Float  | 224x224x3  | 99.9               |
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | 99.8               |
| ST FdMobileNet v1 tfs | Float  | 224x224x3  | 99.59              |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | 99.4               |
| FdMobileNet 0.25 tfs  | Float  | 128x128x3  | 99.05              |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | 98.55              |
| ST FdMobileNet v1 tfs | Float  | 128x128x3  | 99.58              |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | 99.8               |

### Accuracy with Food-101 dataset
Dataset details: https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/ [[3]](#3)
Number of classes: 101, number of files: 101000

| Model                 | Format | Resolution | Top 1 Accuracy (%) |
|-----------------------|--------|------------|--------------------|
| FdMobileNet 0.25 tfs  | Float  | 224x224x3  | 60.41              |
| FdMobileNet 0.25 tfs  | Int8   | 224x224x3  | 58.78              |
| ST FdMobileNet v1 tfs | Float  | 224x224x3  | 66.19              |
| ST FdMobileNet v1 tfs | Int8   | 224x224x3  | 64.71              |
| FdMobileNet 0.25 tfs  | Float  | 128x128x3  | 45.54              |
| FdMobileNet 0.25 tfs  | Int8   | 128x128x3  | 44.86              |
| ST FdMobileNet v1 tfs | Float  | 128x128x3  | 54.19              |
| ST FdMobileNet v1 tfs | Int8   | 128x128x3  | 53.74              |

## Retraining and integration in a simple example

Please refer to the stm32ai-modelzoo-services GitHub repository [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services).

# References

<a id="1">[1]</a>
"Tf_flowers: TensorFlow Datasets," TensorFlow. [Online]. Available: https://www.tensorflow.org/datasets/catalog/tf_flowers.

<a id="2">[2]</a>
Arun Pandian, J.; Gopal, Geetharamani (2019), "Data for: Identification of Plant Leaf Diseases Using a 9-layer Deep Convolutional Neural Network," Mendeley Data, V1, doi: 10.17632/tywbtsjrjv.1.

<a id="3">[3]</a>
L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 -- Mining Discriminative Components with Random Forests," European Conference on Computer Vision, 2014.