---
title: UnsolvedMNIST
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.37.2
app_file: app.py
pinned: false
license: mit
---
# The Unsolved MNIST
**M**odified **N**ational **I**nstitute of **S**tandards and **T**echnology Dataset
###### TODO: Implementation
# Description
###### TODO: Implementation
# Setup
###### TODO: Implementation
# Objective
###### TODO: Implementation
# Logs
###### TODO: Implementation
## Model Summary
```log
========================================================================================================================
Layer (type (var_name)) Input Shape Output Shape Param # Kernel Shape Mult-Adds
========================================================================================================================
LitMNISTModel (LitMNISTModel) [32, 1, 28, 28] [32, 10] -- -- --
├─Net (model) [32, 1, 28, 28] [32, 10] -- -- --
│ ├─conv1.0.weight ├─72 [8, 1, 3, 3]
│ ├─conv1.2.weight ├─8 [8]
│ ├─conv1.2.bias ├─8 [8]
│ ├─conv1.4.weight ├─720 [10, 8, 3, 3]
│ ├─conv1.6.weight ├─10 [10]
│ ├─conv1.6.bias ├─10 [10]
│ ├─conv1.8.weight ├─900 [10, 10, 3, 3]
│ ├─conv1.10.weight ├─10 [10]
│ ├─conv1.10.bias ├─10 [10]
│ ├─trans1.1.weight ├─80 [8, 10, 1, 1]
│ ├─conv2.0.weight ├─720 [10, 8, 3, 3]
│ ├─conv2.1.weight ├─10 [10]
│ ├─conv2.1.bias ├─10 [10]
│ ├─conv2.4.weight ├─1,080 [12, 10, 3, 3]
│ ├─conv2.5.weight ├─12 [12]
│ ├─conv2.5.bias ├─12 [12]
│ ├─conv2.8.weight ├─1,296 [12, 12, 3, 3]
│ ├─conv2.9.weight ├─12 [12]
│ ├─conv2.9.bias ├─12 [12]
│ ├─trans2.1.weight ├─96 [8, 12, 1, 1]
│ ├─trans2.2.weight ├─8 [8]
│ ├─trans2.2.bias ├─8 [8]
│ ├─conv3.0.weight ├─720 [10, 8, 3, 3]
│ ├─conv3.1.weight ├─10 [10]
│ ├─conv3.1.bias ├─10 [10]
│ ├─conv3.4.weight ├─1,080 [12, 10, 3, 3]
│ ├─conv3.6.weight ├─12 [12]
│ ├─conv3.6.bias ├─12 [12]
│ ├─trans3.0.weight ├─120 [10, 12, 1, 1]
│ ├─trans3.2.weight ├─10 [10]
│ ├─trans3.2.bias ├─10 [10]
│ ├─out4.0.weight ├─900 [10, 10, 3, 3]
│ ├─Sequential (conv1) [32, 1, 28, 28] [32, 10, 28, 28] -- -- --
│ │ ├─0.weight ├─72 [8, 1, 3, 3]
│ │ ├─2.weight ├─8 [8]
│ │ ├─2.bias ├─8 [8]
│ │ ├─4.weight ├─720 [10, 8, 3, 3]
│ │ ├─6.weight ├─10 [10]
│ │ ├─6.bias ├─10 [10]
│ │ ├─8.weight ├─900 [10, 10, 3, 3]
│ │ ├─10.weight ├─10 [10]
│ │ ├─10.bias ├─10 [10]
│ │ ├─Conv2d (0) [32, 1, 28, 28] [32, 8, 28, 28] 72 [3, 3] 1,806,336
│ │ │ ├─weight ├─72 [1, 8, 3, 3]
│ │ ├─ReLU (1) [32, 8, 28, 28] [32, 8, 28, 28] -- -- --
│ │ ├─BatchNorm2d (2) [32, 8, 28, 28] [32, 8, 28, 28] 16 -- 512
│ │ │ ├─weight ├─8 [8]
│ │ │ ├─bias ├─8 [8]
│ │ ├─Dropout2d (3) [32, 8, 28, 28] [32, 8, 28, 28] -- -- --
│ │ ├─Conv2d (4) [32, 8, 28, 28] [32, 10, 28, 28] 720 [3, 3] 18,063,360
│ │ │ ├─weight ├─720 [8, 10, 3, 3]
│ │ ├─ReLU (5) [32, 10, 28, 28] [32, 10, 28, 28] -- -- --
│ │ ├─BatchNorm2d (6) [32, 10, 28, 28] [32, 10, 28, 28] 20 -- 640
│ │ │ ├─weight ├─10 [10]
│ │ │ ├─bias ├─10 [10]
│ │ ├─Dropout2d (7) [32, 10, 28, 28] [32, 10, 28, 28] -- -- --
│ │ ├─Conv2d (8) [32, 10, 28, 28] [32, 10, 28, 28] 900 [3, 3] 22,579,200
│ │ │ ├─weight ├─900 [10, 10, 3, 3]
│ │ ├─ReLU (9) [32, 10, 28, 28] [32, 10, 28, 28] -- -- --
│ │ ├─BatchNorm2d (10) [32, 10, 28, 28] [32, 10, 28, 28] 20 -- 640
│ │ │ ├─weight ├─10 [10]
│ │ │ ├─bias ├─10 [10]
│ │ ├─Dropout2d (11) [32, 10, 28, 28] [32, 10, 28, 28] -- -- --
│ ├─Sequential (trans1) [32, 10, 28, 28] [32, 8, 17, 17] -- -- --
│ │ ├─1.weight ├─80 [8, 10, 1, 1]
│ │ ├─MaxPool2d (0) [32, 10, 28, 28] [32, 10, 15, 15] -- 2 --
│ │ ├─Conv2d (1) [32, 10, 15, 15] [32, 8, 17, 17] 80 [1, 1] 739,840
│ │ │ ├─weight ├─80 [10, 8, 1, 1]
│ ├─Sequential (conv2) [32, 8, 17, 17] [32, 12, 17, 17] -- -- --
│ │ ├─0.weight ├─720 [10, 8, 3, 3]
│ │ ├─1.weight ├─10 [10]
│ │ ├─1.bias ├─10 [10]
│ │ ├─4.weight ├─1,080 [12, 10, 3, 3]
│ │ ├─5.weight ├─12 [12]
│ │ ├─5.bias ├─12 [12]
│ │ ├─8.weight ├─1,296 [12, 12, 3, 3]
│ │ ├─9.weight ├─12 [12]
│ │ ├─9.bias ├─12 [12]
│ │ ├─Conv2d (0) [32, 8, 17, 17] [32, 10, 17, 17] 720 [3, 3] 6,658,560
│ │ │ ├─weight ├─720 [8, 10, 3, 3]
│ │ ├─BatchNorm2d (1) [32, 10, 17, 17] [32, 10, 17, 17] 20 -- 640
│ │ │ ├─weight ├─10 [10]
│ │ │ ├─bias ├─10 [10]
│ │ ├─ReLU (2) [32, 10, 17, 17] [32, 10, 17, 17] -- -- --
│ │ ├─Dropout2d (3) [32, 10, 17, 17] [32, 10, 17, 17] -- -- --
│ │ ├─Conv2d (4) [32, 10, 17, 17] [32, 12, 17, 17] 1,080 [3, 3] 9,987,840
│ │ │ ├─weight ├─1,080 [10, 12, 3, 3]
│ │ ├─BatchNorm2d (5) [32, 12, 17, 17] [32, 12, 17, 17] 24 -- 768
│ │ │ ├─weight ├─12 [12]
│ │ │ ├─bias ├─12 [12]
│ │ ├─ReLU (6) [32, 12, 17, 17] [32, 12, 17, 17] -- -- --
│ │ ├─Dropout2d (7) [32, 12, 17, 17] [32, 12, 17, 17] -- -- --
│ │ ├─Conv2d (8) [32, 12, 17, 17] [32, 12, 17, 17] 1,296 [3, 3] 11,985,408
│ │ │ ├─weight ├─1,296 [12, 12, 3, 3]
│ │ ├─BatchNorm2d (9) [32, 12, 17, 17] [32, 12, 17, 17] 24 -- 768
│ │ │ ├─weight ├─12 [12]
│ │ │ ├─bias ├─12 [12]
│ │ ├─ReLU (10) [32, 12, 17, 17] [32, 12, 17, 17] -- -- --
│ │ ├─Dropout2d (11) [32, 12, 17, 17] [32, 12, 17, 17] -- -- --
│ ├─Sequential (trans2) [32, 12, 17, 17] [32, 8, 9, 9] -- -- --
│ │ ├─1.weight ├─96 [8, 12, 1, 1]
│ │ ├─2.weight ├─8 [8]
│ │ ├─2.bias ├─8 [8]
│ │ ├─MaxPool2d (0) [32, 12, 17, 17] [32, 12, 9, 9] -- 2 --
│ │ ├─Conv2d (1) [32, 12, 9, 9] [32, 8, 9, 9] 96 [1, 1] 248,832
│ │ │ ├─weight ├─96 [12, 8, 1, 1]
│ │ ├─BatchNorm2d (2) [32, 8, 9, 9] [32, 8, 9, 9] 16 -- 512
│ │ │ ├─weight ├─8 [8]
│ │ │ ├─bias ├─8 [8]
│ ├─Sequential (conv3) [32, 8, 9, 9] [32, 12, 9, 9] -- -- --
│ │ ├─0.weight ├─720 [10, 8, 3, 3]
│ │ ├─1.weight ├─10 [10]
│ │ ├─1.bias ├─10 [10]
│ │ ├─4.weight ├─1,080 [12, 10, 3, 3]
│ │ ├─6.weight ├─12 [12]
│ │ ├─6.bias ├─12 [12]
│ │ ├─Conv2d (0) [32, 8, 9, 9] [32, 10, 9, 9] 720 [3, 3] 1,866,240
│ │ │ ├─weight ├─720 [8, 10, 3, 3]
│ │ ├─BatchNorm2d (1) [32, 10, 9, 9] [32, 10, 9, 9] 20 -- 640
│ │ │ ├─weight ├─10 [10]
│ │ │ ├─bias ├─10 [10]
│ │ ├─ReLU (2) [32, 10, 9, 9] [32, 10, 9, 9] -- -- --
│ │ ├─Dropout2d (3) [32, 10, 9, 9] [32, 10, 9, 9] -- -- --
│ │ ├─Conv2d (4) [32, 10, 9, 9] [32, 12, 9, 9] 1,080 [3, 3] 2,799,360
│ │ │ ├─weight ├─1,080 [10, 12, 3, 3]
│ │ ├─ReLU (5) [32, 12, 9, 9] [32, 12, 9, 9] -- -- --
│ │ ├─BatchNorm2d (6) [32, 12, 9, 9] [32, 12, 9, 9] 24 -- 768
│ │ │ ├─weight ├─12 [12]
│ │ │ ├─bias ├─12 [12]
│ │ ├─Dropout2d (7) [32, 12, 9, 9] [32, 12, 9, 9] -- -- --
│ ├─Sequential (trans3) [32, 12, 9, 9] [32, 10, 4, 4] -- -- --
│ │ ├─0.weight ├─120 [10, 12, 1, 1]
│ │ ├─2.weight ├─10 [10]
│ │ ├─2.bias ├─10 [10]
│ │ ├─Conv2d (0) [32, 12, 9, 9] [32, 10, 9, 9] 120 [1, 1] 311,040
│ │ │ ├─weight ├─120 [12, 10, 1, 1]
│ │ ├─MaxPool2d (1) [32, 10, 9, 9] [32, 10, 4, 4] -- 2 --
│ │ ├─BatchNorm2d (2) [32, 10, 4, 4] [32, 10, 4, 4] 20 -- 640
│ │ │ ├─weight ├─10 [10]
│ │ │ ├─bias ├─10 [10]
│ ├─Sequential (out4) [32, 10, 4, 4] [32, 10, 1, 1] -- -- --
│ │ ├─0.weight ├─900 [10, 10, 3, 3]
│ │ ├─Conv2d (0) [32, 10, 4, 4] [32, 10, 4, 4] 900 [3, 3] 460,800
│ │ │ ├─weight ├─900 [10, 10, 3, 3]
│ │ ├─AvgPool2d (1) [32, 10, 4, 4] [32, 10, 1, 1] -- 3 --
========================================================================================================================
Total params: 7,988
Trainable params: 7,988
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 77.51
========================================================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 18.40
Params size (MB): 0.03
Estimated Total Size (MB): 18.53
========================================================================================================================
```
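The totals in the summary can be cross-checked by hand. The following is a minimal sketch, not the repository's code: the layer list is transcribed from the summary, the helper names are hypothetical, every `Conv2d` is assumed to be bias-free (the summary lists no convolution bias), and batch-norm mult-adds are assumed to be counted as batch size times parameter count, which matches the per-layer figures shown.

```python
# Assumed layer list, transcribed from the summary above (not the repo's code).
# Each conv entry: (name, in_channels, out_channels, kernel_size, output_side)
CONVS = [
    ("conv1.0", 1, 8, 3, 28),
    ("conv1.4", 8, 10, 3, 28),
    ("conv1.8", 10, 10, 3, 28),
    ("trans1.1", 10, 8, 1, 17),
    ("conv2.0", 8, 10, 3, 17),
    ("conv2.4", 10, 12, 3, 17),
    ("conv2.8", 12, 12, 3, 17),
    ("trans2.1", 12, 8, 1, 9),
    ("conv3.0", 8, 10, 3, 9),
    ("conv3.4", 10, 12, 3, 9),
    ("trans3.0", 12, 10, 1, 9),
    ("out4.0", 10, 10, 3, 4),
]
# Channel counts of the ten BatchNorm2d layers (one weight and one bias each).
BATCH_NORM_CHANNELS = [8, 10, 10, 10, 12, 12, 8, 10, 12, 10]
BATCH_SIZE = 32


def conv_params(c_in, c_out, k):
    """Weights of a bias-free k x k convolution."""
    return c_out * c_in * k * k


def conv_mult_adds(c_in, c_out, k, side, batch=BATCH_SIZE):
    """One multiply-add per weight, per output position, per sample."""
    return batch * side * side * conv_params(c_in, c_out, k)


total_params = sum(conv_params(i, o, k) for _, i, o, k, _ in CONVS)
total_params += sum(2 * c for c in BATCH_NORM_CHANNELS)

total_mult_adds = sum(conv_mult_adds(i, o, k, s) for _, i, o, k, s in CONVS)
# Assumption: batch norm is counted as batch size x parameter count.
total_mult_adds += sum(BATCH_SIZE * 2 * c for c in BATCH_NORM_CHANNELS)

print(f"params: {total_params:,}")
print(f"mult-adds: {total_mult_adds / 1e6:.2f} M")
```

Under these assumptions the sketch reproduces the 7,988 parameters and the 77.51 figure reported above, so that figure appears to be a count in millions of mult-adds rather than a size in megabytes, despite the `Units.MEGABYTES` label.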
## Training Logs
###### TODO: Implementation
```sh
cd /usr/home/:USER:/UnsolvedMNIST
tensorboard --logdir=logs
```
## Performance Profiling
###### TODO: Implementation
```log
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg # of Calls
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
aten::cudnn_convolution 69.86% 444.597ms 69.86% 444.597ms 37.050ms 12
aten::_log_softmax 7.52% 47.831ms 7.52% 47.831ms 47.831ms 1
aten::clamp_min 4.42% 28.104ms 4.42% 28.104ms 3.513ms 8
aten::cudnn_batch_norm 4.13% 26.264ms 4.20% 26.758ms 2.676ms 10
aten::add_ 3.47% 22.086ms 3.47% 22.086ms 2.209ms 10
aten::bernoulli_ 2.79% 17.777ms 2.79% 17.777ms 2.222ms 8
aten::div_ 2.53% 16.126ms 2.53% 16.126ms 2.016ms 8
aten::mul 2.29% 14.584ms 2.29% 14.584ms 1.823ms 8
aten::avg_pool2d 0.63% 4.009ms 0.63% 4.009ms 4.009ms 1
aten::max_pool2d_with_indices 0.54% 3.446ms 0.54% 3.446ms 1.149ms 3
aten::convolution 0.39% 2.469ms 70.31% 447.487ms 37.291ms 12
aten::relu 0.28% 1.804ms 4.70% 29.908ms 3.739ms 8
aten::_batch_norm_impl_index 0.22% 1.430ms 4.43% 28.188ms 2.819ms 10
aten::batch_norm 0.16% 1.006ms 4.59% 29.194ms 2.919ms 10
aten::empty 0.12% 757.000us 0.12% 757.000us 11.828us 64
aten::max_pool2d 0.12% 751.000us 0.66% 4.197ms 1.399ms 3
aten::log_softmax 0.10% 653.000us 7.62% 48.484ms 48.484ms 1
aten::conv2d 0.10% 636.000us 70.41% 448.123ms 37.344ms 12
aten::feature_dropout 0.08% 479.000us 7.71% 49.058ms 6.132ms 8
aten::copy_ 0.07% 447.000us 0.07% 447.000us 63.857us 7
aten::_convolution 0.07% 421.000us 69.92% 445.018ms 37.085ms 12
aten::to 0.05% 291.000us 0.13% 843.000us 7.205us 117
aten::zeros 0.04% 270.000us 0.08% 523.000us 87.167us 6
aten::empty_strided 0.01% 60.000us 0.01% 60.000us 8.571us 7
aten::_to_copy 0.01% 45.000us 0.09% 552.000us 78.857us 7
aten::view 0.01% 39.000us 0.01% 39.000us 3.545us 11
aten::empty_like 0.00% 31.000us 0.06% 384.000us 38.400us 10
aten::new_empty 0.00% 23.000us 0.01% 92.000us 11.500us 8
aten::_has_compatible_shallow_copy_type 0.00% 2.000us 0.00% 2.000us 0.031us 64
aten::zero_ 0.00% 1.000us 0.00% 1.000us 0.167us 6
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
Self CPU time total: 636.439ms
```
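The `Self CPU %` column is each operator's self CPU time divided by the `Self CPU time total`. The sketch below recomputes it for a few rows transcribed from the table; the dictionary and helper names are hypothetical and only serve to show the arithmetic, including the mixed `ms`/`us` units.

```python
# Rows transcribed from the profiler table above: operator -> self CPU time.
SELF_CPU = {
    "aten::cudnn_convolution": "444.597ms",
    "aten::_log_softmax": "47.831ms",
    "aten::clamp_min": "28.104ms",
    "aten::cudnn_batch_norm": "26.264ms",
    "aten::empty": "757.000us",
}
SELF_CPU_TOTAL = "636.439ms"


def to_ms(text):
    """Convert a profiler time string such as '757.000us' to milliseconds."""
    # Check the two-letter suffixes first so 'ms' is not read as plain 's'.
    for suffix, scale in (("us", 1e-3), ("ms", 1.0), ("s", 1e3)):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * scale
    raise ValueError(f"unrecognised time unit in {text!r}")


def self_cpu_share(op):
    """Percentage of the total self CPU time spent in one operator."""
    return 100.0 * to_ms(SELF_CPU[op]) / to_ms(SELF_CPU_TOTAL)


for op in SELF_CPU:
    print(f"{op:<26} {self_cpu_share(op):6.2f}%")
```

In this profile the cuDNN convolution calls account for roughly 70% of the self CPU time, which makes them the first place to look when the TODO above is filled in.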
# Contribution
###### TODO: Implementation