Commit 3bd4938 by Muthukamalan · 1 parent: 8de6687

model_summary, performance_profiling

  1. README.md +230 -1
README.md CHANGED
@@ -10,4 +10,233 @@ pinned: false
  license: mit
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # The Unsolved MNIST 🔒
+ **M**odified **N**ational **I**nstitute of **S**tandards and **T**echnology Dataset
+
+ # Description
+
+ # Setup
+
+ # Objective
+
+ # Logs
+
+ ## Model Summary
+ ```log
+ ========================================================================================================================
+ Layer (type (var_name))         Input Shape       Output Shape      Param #  Kernel Shape    Mult-Adds
+ ========================================================================================================================
+ LitMNISTModel (LitMNISTModel)   [32, 1, 28, 28]   [32, 10]          --       --              --
+ ├─Net (model)                   [32, 1, 28, 28]   [32, 10]          --       --              --
+ │    └─conv1.0.weight                                               ├─72     [8, 1, 3, 3]
+ │    └─conv1.2.weight                                               ├─8      [8]
+ │    └─conv1.2.bias                                                 ├─8      [8]
+ │    └─conv1.4.weight                                               ├─720    [10, 8, 3, 3]
+ │    └─conv1.6.weight                                               ├─10     [10]
+ │    └─conv1.6.bias                                                 ├─10     [10]
+ │    └─conv1.8.weight                                               ├─900    [10, 10, 3, 3]
+ │    └─conv1.10.weight                                              ├─10     [10]
+ │    └─conv1.10.bias                                                ├─10     [10]
+ │    └─trans1.1.weight                                              ├─80     [8, 10, 1, 1]
+ │    └─conv2.0.weight                                               ├─720    [10, 8, 3, 3]
+ │    └─conv2.1.weight                                               ├─10     [10]
+ │    └─conv2.1.bias                                                 ├─10     [10]
+ │    └─conv2.4.weight                                               ├─1,080  [12, 10, 3, 3]
+ │    └─conv2.5.weight                                               ├─12     [12]
+ │    └─conv2.5.bias                                                 ├─12     [12]
+ │    └─conv2.8.weight                                               ├─1,296  [12, 12, 3, 3]
+ │    └─conv2.9.weight                                               ├─12     [12]
+ │    └─conv2.9.bias                                                 ├─12     [12]
+ │    └─trans2.1.weight                                              ├─96     [8, 12, 1, 1]
+ │    └─trans2.2.weight                                              ├─8      [8]
+ │    └─trans2.2.bias                                                ├─8      [8]
+ │    └─conv3.0.weight                                               ├─720    [10, 8, 3, 3]
+ │    └─conv3.1.weight                                               ├─10     [10]
+ │    └─conv3.1.bias                                                 ├─10     [10]
+ │    └─conv3.4.weight                                               ├─1,080  [12, 10, 3, 3]
+ │    └─conv3.6.weight                                               ├─12     [12]
+ │    └─conv3.6.bias                                                 ├─12     [12]
+ │    └─trans3.0.weight                                              ├─120    [10, 12, 1, 1]
+ │    └─trans3.2.weight                                              ├─10     [10]
+ │    └─trans3.2.bias                                                ├─10     [10]
+ │    └─out4.0.weight                                                └─900    [10, 10, 3, 3]
+ │    └─Sequential (conv1)       [32, 1, 28, 28]   [32, 10, 28, 28]  --       --              --
+ │    │    └─0.weight                                                ├─72     [8, 1, 3, 3]
+ │    │    └─2.weight                                                ├─8      [8]
+ │    │    └─2.bias                                                  ├─8      [8]
+ │    │    └─4.weight                                                ├─720    [10, 8, 3, 3]
+ │    │    └─6.weight                                                ├─10     [10]
+ │    │    └─6.bias                                                  ├─10     [10]
+ │    │    └─8.weight                                                ├─900    [10, 10, 3, 3]
+ │    │    └─10.weight                                               ├─10     [10]
+ │    │    └─10.bias                                                 └─10     [10]
+ │    │    └─Conv2d (0)          [32, 1, 28, 28]   [32, 8, 28, 28]   72       [3, 3]          1,806,336
+ │    │    │    └─weight                                             └─72     [1, 8, 3, 3]
+ │    │    └─ReLU (1)            [32, 8, 28, 28]   [32, 8, 28, 28]   --       --              --
+ │    │    └─BatchNorm2d (2)     [32, 8, 28, 28]   [32, 8, 28, 28]   16       --              512
+ │    │    │    └─weight                                             ├─8      [8]
+ │    │    │    └─bias                                               └─8      [8]
+ │    │    └─Dropout2d (3)       [32, 8, 28, 28]   [32, 8, 28, 28]   --       --              --
+ │    │    └─Conv2d (4)          [32, 8, 28, 28]   [32, 10, 28, 28]  720      [3, 3]          18,063,360
+ │    │    │    └─weight                                             └─720    [8, 10, 3, 3]
+ │    │    └─ReLU (5)            [32, 10, 28, 28]  [32, 10, 28, 28]  --       --              --
+ │    │    └─BatchNorm2d (6)     [32, 10, 28, 28]  [32, 10, 28, 28]  20       --              640
+ │    │    │    └─weight                                             ├─10     [10]
+ │    │    │    └─bias                                               └─10     [10]
+ │    │    └─Dropout2d (7)       [32, 10, 28, 28]  [32, 10, 28, 28]  --       --              --
+ │    │    └─Conv2d (8)          [32, 10, 28, 28]  [32, 10, 28, 28]  900      [3, 3]          22,579,200
+ │    │    │    └─weight                                             └─900    [10, 10, 3, 3]
+ │    │    └─ReLU (9)            [32, 10, 28, 28]  [32, 10, 28, 28]  --       --              --
+ │    │    └─BatchNorm2d (10)    [32, 10, 28, 28]  [32, 10, 28, 28]  20       --              640
+ │    │    │    └─weight                                             ├─10     [10]
+ │    │    │    └─bias                                               └─10     [10]
+ │    │    └─Dropout2d (11)      [32, 10, 28, 28]  [32, 10, 28, 28]  --       --              --
+ │    └─Sequential (trans1)      [32, 10, 28, 28]  [32, 8, 17, 17]   --       --              --
+ │    │    └─1.weight                                                └─80     [8, 10, 1, 1]
+ │    │    └─MaxPool2d (0)       [32, 10, 28, 28]  [32, 10, 15, 15]  --       2               --
+ │    │    └─Conv2d (1)          [32, 10, 15, 15]  [32, 8, 17, 17]   80       [1, 1]          739,840
+ │    │    │    └─weight                                             └─80     [10, 8, 1, 1]
+ │    └─Sequential (conv2)       [32, 8, 17, 17]   [32, 12, 17, 17]  --       --              --
+ │    │    └─0.weight                                                ├─720    [10, 8, 3, 3]
+ │    │    └─1.weight                                                ├─10     [10]
+ │    │    └─1.bias                                                  ├─10     [10]
+ │    │    └─4.weight                                                ├─1,080  [12, 10, 3, 3]
+ │    │    └─5.weight                                                ├─12     [12]
+ │    │    └─5.bias                                                  ├─12     [12]
+ │    │    └─8.weight                                                ├─1,296  [12, 12, 3, 3]
+ │    │    └─9.weight                                                ├─12     [12]
+ │    │    └─9.bias                                                  └─12     [12]
+ │    │    └─Conv2d (0)          [32, 8, 17, 17]   [32, 10, 17, 17]  720      [3, 3]          6,658,560
+ │    │    │    └─weight                                             └─720    [8, 10, 3, 3]
+ │    │    └─BatchNorm2d (1)     [32, 10, 17, 17]  [32, 10, 17, 17]  20       --              640
+ │    │    │    └─weight                                             ├─10     [10]
+ │    │    │    └─bias                                               └─10     [10]
+ │    │    └─ReLU (2)            [32, 10, 17, 17]  [32, 10, 17, 17]  --       --              --
+ │    │    └─Dropout2d (3)       [32, 10, 17, 17]  [32, 10, 17, 17]  --       --              --
+ │    │    └─Conv2d (4)          [32, 10, 17, 17]  [32, 12, 17, 17]  1,080    [3, 3]          9,987,840
+ │    │    │    └─weight                                             └─1,080  [10, 12, 3, 3]
+ │    │    └─BatchNorm2d (5)     [32, 12, 17, 17]  [32, 12, 17, 17]  24       --              768
+ │    │    │    └─weight                                             ├─12     [12]
+ │    │    │    └─bias                                               └─12     [12]
+ │    │    └─ReLU (6)            [32, 12, 17, 17]  [32, 12, 17, 17]  --       --              --
+ │    │    └─Dropout2d (7)       [32, 12, 17, 17]  [32, 12, 17, 17]  --       --              --
+ │    │    └─Conv2d (8)          [32, 12, 17, 17]  [32, 12, 17, 17]  1,296    [3, 3]          11,985,408
+ │    │    │    └─weight                                             └─1,296  [12, 12, 3, 3]
+ │    │    └─BatchNorm2d (9)     [32, 12, 17, 17]  [32, 12, 17, 17]  24       --              768
+ │    │    │    └─weight                                             ├─12     [12]
+ │    │    │    └─bias                                               └─12     [12]
+ │    │    └─ReLU (10)           [32, 12, 17, 17]  [32, 12, 17, 17]  --       --              --
+ │    │    └─Dropout2d (11)      [32, 12, 17, 17]  [32, 12, 17, 17]  --       --              --
+ │    └─Sequential (trans2)      [32, 12, 17, 17]  [32, 8, 9, 9]     --       --              --
+ │    │    └─1.weight                                                ├─96     [8, 12, 1, 1]
+ │    │    └─2.weight                                                ├─8      [8]
+ │    │    └─2.bias                                                  └─8      [8]
+ │    │    └─MaxPool2d (0)       [32, 12, 17, 17]  [32, 12, 9, 9]    --       2               --
+ │    │    └─Conv2d (1)          [32, 12, 9, 9]    [32, 8, 9, 9]     96       [1, 1]          248,832
+ │    │    │    └─weight                                             └─96     [12, 8, 1, 1]
+ │    │    └─BatchNorm2d (2)     [32, 8, 9, 9]     [32, 8, 9, 9]     16       --              512
+ │    │    │    └─weight                                             ├─8      [8]
+ │    │    │    └─bias                                               └─8      [8]
+ │    └─Sequential (conv3)       [32, 8, 9, 9]     [32, 12, 9, 9]    --       --              --
+ │    │    └─0.weight                                                ├─720    [10, 8, 3, 3]
+ │    │    └─1.weight                                                ├─10     [10]
+ │    │    └─1.bias                                                  ├─10     [10]
+ │    │    └─4.weight                                                ├─1,080  [12, 10, 3, 3]
+ │    │    └─6.weight                                                ├─12     [12]
+ │    │    └─6.bias                                                  └─12     [12]
+ │    │    └─Conv2d (0)          [32, 8, 9, 9]     [32, 10, 9, 9]    720      [3, 3]          1,866,240
+ │    │    │    └─weight                                             └─720    [8, 10, 3, 3]
+ │    │    └─BatchNorm2d (1)     [32, 10, 9, 9]    [32, 10, 9, 9]    20       --              640
+ │    │    │    └─weight                                             ├─10     [10]
+ │    │    │    └─bias                                               └─10     [10]
+ │    │    └─ReLU (2)            [32, 10, 9, 9]    [32, 10, 9, 9]    --       --              --
+ │    │    └─Dropout2d (3)       [32, 10, 9, 9]    [32, 10, 9, 9]    --       --              --
+ │    │    └─Conv2d (4)          [32, 10, 9, 9]    [32, 12, 9, 9]    1,080    [3, 3]          2,799,360
+ │    │    │    └─weight                                             └─1,080  [10, 12, 3, 3]
+ │    │    └─ReLU (5)            [32, 12, 9, 9]    [32, 12, 9, 9]    --       --              --
+ │    │    └─BatchNorm2d (6)     [32, 12, 9, 9]    [32, 12, 9, 9]    24       --              768
+ │    │    │    └─weight                                             ├─12     [12]
+ │    │    │    └─bias                                               └─12     [12]
+ │    │    └─Dropout2d (7)       [32, 12, 9, 9]    [32, 12, 9, 9]    --       --              --
+ │    └─Sequential (trans3)      [32, 12, 9, 9]    [32, 10, 4, 4]    --       --              --
+ │    │    └─0.weight                                                ├─120    [10, 12, 1, 1]
+ │    │    └─2.weight                                                ├─10     [10]
+ │    │    └─2.bias                                                  └─10     [10]
+ │    │    └─Conv2d (0)          [32, 12, 9, 9]    [32, 10, 9, 9]    120      [1, 1]          311,040
+ │    │    │    └─weight                                             └─120    [12, 10, 1, 1]
+ │    │    └─MaxPool2d (1)       [32, 10, 9, 9]    [32, 10, 4, 4]    --       2               --
+ │    │    └─BatchNorm2d (2)     [32, 10, 4, 4]    [32, 10, 4, 4]    20       --              640
+ │    │    │    └─weight                                             ├─10     [10]
+ │    │    │    └─bias                                               └─10     [10]
+ │    └─Sequential (out4)        [32, 10, 4, 4]    [32, 10, 1, 1]    --       --              --
+ │    │    └─0.weight                                                └─900    [10, 10, 3, 3]
+ │    │    └─Conv2d (0)          [32, 10, 4, 4]    [32, 10, 4, 4]    900      [3, 3]          460,800
+ │    │    │    └─weight                                             └─900    [10, 10, 3, 3]
+ │    │    └─AvgPool2d (1)       [32, 10, 4, 4]    [32, 10, 1, 1]    --       3               --
+ ========================================================================================================================
+ Total params: 7,988
+ Trainable params: 7,988
+ Non-trainable params: 0
+ Total mult-adds (Units.MEGABYTES): 77.51
+ ========================================================================================================================
+ Input size (MB): 0.10
+ Forward/backward pass size (MB): 18.40
+ Params size (MB): 0.03
+ Estimated Total Size (MB): 18.53
+ ========================================================================================================================
+ ```
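
The totals in the summary can be cross-checked by hand. The sketch below is not part of the repository; it assumes bias-free convolutions (the summary lists only a `weight` per `Conv2d`), so a convolution has `out_ch * in_ch * k * k` parameters, and it assumes mult-adds are counted as parameters times output height times output width times batch size, which reproduces every per-layer row above. The names `CONVS`, `BATCHNORM_CHANNELS`, `conv_params`, and `conv_mult_adds` are illustrative only.

```python
# Cross-check of the model summary: parameter count and mult-adds.
# Assumptions (not from the repository): convolutions have no bias, and
# mult-adds = params * out_h * out_w * batch, which matches the table rows.

BATCH = 32  # batch size shown in the Input Shape column

# (name, in_channels, out_channels, kernel size, output height == width)
CONVS = [
    ("conv1.0", 1, 8, 3, 28),
    ("conv1.4", 8, 10, 3, 28),
    ("conv1.8", 10, 10, 3, 28),
    ("trans1.1", 10, 8, 1, 17),
    ("conv2.0", 8, 10, 3, 17),
    ("conv2.4", 10, 12, 3, 17),
    ("conv2.8", 12, 12, 3, 17),
    ("trans2.1", 12, 8, 1, 9),
    ("conv3.0", 8, 10, 3, 9),
    ("conv3.4", 10, 12, 3, 9),
    ("trans3.0", 12, 10, 1, 9),
    ("out4.0", 10, 10, 3, 4),
]

# Channel widths of the ten BatchNorm2d layers (each has a weight and a bias).
BATCHNORM_CHANNELS = [8, 10, 10, 10, 12, 12, 8, 10, 12, 10]


def conv_params(in_ch, out_ch, k):
    """Weights of a bias-free k x k convolution."""
    return out_ch * in_ch * k * k


def conv_mult_adds(in_ch, out_ch, k, out_hw, batch=BATCH):
    """One multiply-add per weight per output position per sample."""
    return conv_params(in_ch, out_ch, k) * out_hw * out_hw * batch


total_params = sum(conv_params(i, o, k) for _, i, o, k, _ in CONVS)
total_params += sum(2 * c for c in BATCHNORM_CHANNELS)

total_mult_adds = sum(conv_mult_adds(i, o, k, hw) for _, i, o, k, hw in CONVS)
total_mult_adds += sum(2 * c * BATCH for c in BATCHNORM_CHANNELS)

print(f"Total params: {total_params:,}")
print(f"Total mult-adds (millions): {total_mult_adds / 1e6:.2f}")
```

Under these assumptions the script prints 7,988 parameters and 77.51 million mult-adds, matching the summary. This suggests the `Units.MEGABYTES` label on the mult-adds line is a labelling quirk of the summary tool and that the figure is a count in millions, not a size in megabytes.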
+
+ ## Training Logs
+
+ ```sh
+ cd /usr/home/:USER:/UnsolvedMNIST
+ tensorboard --logdir=logs
+
+ ```
+
+ ## Performance Profiling
+ ```log
+ -------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+                                        Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
+ -------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+                     aten::cudnn_convolution        69.86%     444.597ms        69.86%     444.597ms      37.050ms            12
+                          aten::_log_softmax         7.52%      47.831ms         7.52%      47.831ms      47.831ms             1
+                             aten::clamp_min         4.42%      28.104ms         4.42%      28.104ms       3.513ms             8
+                      aten::cudnn_batch_norm         4.13%      26.264ms         4.20%      26.758ms       2.676ms            10
+                                  aten::add_         3.47%      22.086ms         3.47%      22.086ms       2.209ms            10
+                            aten::bernoulli_         2.79%      17.777ms         2.79%      17.777ms       2.222ms             8
+                                  aten::div_         2.53%      16.126ms         2.53%      16.126ms       2.016ms             8
+                                   aten::mul         2.29%      14.584ms         2.29%      14.584ms       1.823ms             8
+                            aten::avg_pool2d         0.63%       4.009ms         0.63%       4.009ms       4.009ms             1
+               aten::max_pool2d_with_indices         0.54%       3.446ms         0.54%       3.446ms       1.149ms             3
+                           aten::convolution         0.39%       2.469ms        70.31%     447.487ms      37.291ms            12
+                                  aten::relu         0.28%       1.804ms         4.70%      29.908ms       3.739ms             8
+                aten::_batch_norm_impl_index         0.22%       1.430ms         4.43%      28.188ms       2.819ms            10
+                            aten::batch_norm         0.16%       1.006ms         4.59%      29.194ms       2.919ms            10
+                                 aten::empty         0.12%     757.000us         0.12%     757.000us      11.828us            64
+                            aten::max_pool2d         0.12%     751.000us         0.66%       4.197ms       1.399ms             3
+                           aten::log_softmax         0.10%     653.000us         7.62%      48.484ms      48.484ms             1
+                                aten::conv2d         0.10%     636.000us        70.41%     448.123ms      37.344ms            12
+                       aten::feature_dropout         0.08%     479.000us         7.71%      49.058ms       6.132ms             8
+                                 aten::copy_         0.07%     447.000us         0.07%     447.000us      63.857us             7
+                          aten::_convolution         0.07%     421.000us        69.92%     445.018ms      37.085ms            12
+                                    aten::to         0.05%     291.000us         0.13%     843.000us       7.205us           117
+                                 aten::zeros         0.04%     270.000us         0.08%     523.000us      87.167us             6
+                         aten::empty_strided         0.01%      60.000us         0.01%      60.000us       8.571us             7
+                              aten::_to_copy         0.01%      45.000us         0.09%     552.000us      78.857us             7
+                                  aten::view         0.01%      39.000us         0.01%      39.000us       3.545us            11
+                            aten::empty_like         0.00%      31.000us         0.06%     384.000us      38.400us            10
+                             aten::new_empty         0.00%      23.000us         0.01%      92.000us      11.500us             8
+     aten::_has_compatible_shallow_copy_type         0.00%       2.000us         0.00%       2.000us       0.031us            64
+                                 aten::zero_         0.00%       1.000us         0.00%       1.000us       0.167us             6
+ -------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
+ Self CPU time total: 636.439ms
+
+ ```
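
To read the profile, it helps to see how the percentage column relates to the raw timings. The sketch below is not from the repository; it assumes that `Self CPU %` is a row's `Self CPU` time divided by `Self CPU time total`, and it recomputes the share for the four most expensive rows. The names `SELF_CPU_MS` and `self_cpu_percent` are illustrative only.

```python
# Recompute "Self CPU %" for the four most expensive rows of the profile.
# Assumption: Self CPU % = Self CPU time / "Self CPU time total".

SELF_CPU_TOTAL_MS = 636.439  # "Self CPU time total" from the log footer

# Self CPU times in milliseconds, copied from the table.
SELF_CPU_MS = {
    "aten::cudnn_convolution": 444.597,
    "aten::_log_softmax": 47.831,
    "aten::clamp_min": 28.104,
    "aten::cudnn_batch_norm": 26.264,
}


def self_cpu_percent(self_ms, total_ms=SELF_CPU_TOTAL_MS):
    """Share of the total self CPU time, in percent."""
    return 100.0 * self_ms / total_ms


shares = {name: round(self_cpu_percent(ms), 2) for name, ms in SELF_CPU_MS.items()}

for name, pct in shares.items():
    print(f"{name:<28}{pct:>7.2f}%")

print(f"{'top four combined':<28}{sum(SELF_CPU_MS.values()) * 100 / SELF_CPU_TOTAL_MS:>7.2f}%")
```

The recomputed shares agree with the table to two decimals, with `aten::cudnn_convolution` alone accounting for roughly 70% of the self CPU time across its 12 calls.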
+
+
+ # Credits:
+ - [pytorch_performance_profiling.md](https://gist.github.com/mingfeima/e08310d7e7bb9ae2a693adecf2d8a916)
+ - [FLOPs calculation](https://medium.com/@dzmitrybahdanau/the-flops-calculus-of-language-model-training-3b19c1f025e4)
+ - [software 2.0](https://karpathy.medium.com/software-2-0-a64152b37c35)
+ - [weight init](https://towardsdatascience.com/weight-initialization-in-neural-networks-a-journey-from-the-basics-to-kaiming-954fb9b47c79)