---
title: UnsolvedMNIST
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.37.2
app_file: app.py
pinned: false
license: mit
---
# The Unsolved MNIST
**M**odified **N**ational **I**nstitute of **S**tandards and **T**echnology Dataset
###### TODO: Implementation
# Description
###### TODO: Implementation
# Setup
###### TODO: Implementation
# Objective
###### TODO: Implementation
# Logs
###### TODO: Implementation
## Model Summary
```log
========================================================================================================================
Layer (type (var_name))         Input Shape       Output Shape      Param #   Kernel Shape    Mult-Adds
========================================================================================================================
LitMNISTModel (LitMNISTModel)   [32, 1, 28, 28]   [32, 10]          --        --              --
├─Net (model)                   [32, 1, 28, 28]   [32, 10]          --        --              --
│    └─conv1.0.weight                                               ├─72      [8, 1, 3, 3]
│    └─conv1.2.weight                                               ├─8       [8]
│    └─conv1.2.bias                                                 ├─8       [8]
│    └─conv1.4.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv1.6.weight                                               ├─10      [10]
│    └─conv1.6.bias                                                 ├─10      [10]
│    └─conv1.8.weight                                               ├─900     [10, 10, 3, 3]
│    └─conv1.10.weight                                              ├─10      [10]
│    └─conv1.10.bias                                                ├─10      [10]
│    └─trans1.1.weight                                              ├─80      [8, 10, 1, 1]
│    └─conv2.0.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv2.1.weight                                               ├─10      [10]
│    └─conv2.1.bias                                                 ├─10      [10]
│    └─conv2.4.weight                                               ├─1,080   [12, 10, 3, 3]
│    └─conv2.5.weight                                               ├─12      [12]
│    └─conv2.5.bias                                                 ├─12      [12]
│    └─conv2.8.weight                                               ├─1,296   [12, 12, 3, 3]
│    └─conv2.9.weight                                               ├─12      [12]
│    └─conv2.9.bias                                                 ├─12      [12]
│    └─trans2.1.weight                                              ├─96      [8, 12, 1, 1]
│    └─trans2.2.weight                                              ├─8       [8]
│    └─trans2.2.bias                                                ├─8       [8]
│    └─conv3.0.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv3.1.weight                                               ├─10      [10]
│    └─conv3.1.bias                                                 ├─10      [10]
│    └─conv3.4.weight                                               ├─1,080   [12, 10, 3, 3]
│    └─conv3.6.weight                                               ├─12      [12]
│    └─conv3.6.bias                                                 ├─12      [12]
│    └─trans3.0.weight                                              ├─120     [10, 12, 1, 1]
│    └─trans3.2.weight                                              ├─10      [10]
│    └─trans3.2.bias                                                ├─10      [10]
│    └─out4.0.weight                                                └─900     [10, 10, 3, 3]
│    └─Sequential (conv1)       [32, 1, 28, 28]   [32, 10, 28, 28]  --        --              --
│    │    └─0.weight                                                ├─72      [8, 1, 3, 3]
│    │    └─2.weight                                                ├─8       [8]
│    │    └─2.bias                                                  ├─8       [8]
│    │    └─4.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─6.weight                                                ├─10      [10]
│    │    └─6.bias                                                  ├─10      [10]
│    │    └─8.weight                                                ├─900     [10, 10, 3, 3]
│    │    └─10.weight                                               ├─10      [10]
│    │    └─10.bias                                                 └─10      [10]
│    │    └─Conv2d (0)          [32, 1, 28, 28]   [32, 8, 28, 28]   72        [3, 3]          1,806,336
│    │    │    └─weight                                             └─72      [1, 8, 3, 3]
│    │    └─ReLU (1)            [32, 8, 28, 28]   [32, 8, 28, 28]   --        --              --
│    │    └─BatchNorm2d (2)     [32, 8, 28, 28]   [32, 8, 28, 28]   16        --              512
│    │    │    └─weight                                             ├─8       [8]
│    │    │    └─bias                                               └─8       [8]
│    │    └─Dropout2d (3)       [32, 8, 28, 28]   [32, 8, 28, 28]   --        --              --
│    │    └─Conv2d (4)          [32, 8, 28, 28]   [32, 10, 28, 28]  720       [3, 3]          18,063,360
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─ReLU (5)            [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─BatchNorm2d (6)     [32, 10, 28, 28]  [32, 10, 28, 28]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─Dropout2d (7)       [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─Conv2d (8)          [32, 10, 28, 28]  [32, 10, 28, 28]  900       [3, 3]          22,579,200
│    │    │    └─weight                                             └─900     [10, 10, 3, 3]
│    │    └─ReLU (9)            [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─BatchNorm2d (10)    [32, 10, 28, 28]  [32, 10, 28, 28]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─Dropout2d (11)      [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    └─Sequential (trans1)      [32, 10, 28, 28]  [32, 8, 17, 17]   --        --              --
│    │    └─1.weight                                                └─80      [8, 10, 1, 1]
│    │    └─MaxPool2d (0)       [32, 10, 28, 28]  [32, 10, 15, 15]  --        2               --
│    │    └─Conv2d (1)          [32, 10, 15, 15]  [32, 8, 17, 17]   80        [1, 1]          739,840
│    │    │    └─weight                                             └─80      [10, 8, 1, 1]
│    └─Sequential (conv2)       [32, 8, 17, 17]   [32, 12, 17, 17]  --        --              --
│    │    └─0.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─1.weight                                                ├─10      [10]
│    │    └─1.bias                                                  ├─10      [10]
│    │    └─4.weight                                                ├─1,080   [12, 10, 3, 3]
│    │    └─5.weight                                                ├─12      [12]
│    │    └─5.bias                                                  ├─12      [12]
│    │    └─8.weight                                                ├─1,296   [12, 12, 3, 3]
│    │    └─9.weight                                                ├─12      [12]
│    │    └─9.bias                                                  └─12      [12]
│    │    └─Conv2d (0)          [32, 8, 17, 17]   [32, 10, 17, 17]  720       [3, 3]          6,658,560
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)     [32, 10, 17, 17]  [32, 10, 17, 17]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─ReLU (2)            [32, 10, 17, 17]  [32, 10, 17, 17]  --        --              --
│    │    └─Dropout2d (3)       [32, 10, 17, 17]  [32, 10, 17, 17]  --        --              --
│    │    └─Conv2d (4)          [32, 10, 17, 17]  [32, 12, 17, 17]  1,080     [3, 3]          9,987,840
│    │    │    └─weight                                             └─1,080   [10, 12, 3, 3]
│    │    └─BatchNorm2d (5)     [32, 12, 17, 17]  [32, 12, 17, 17]  24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─ReLU (6)            [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Dropout2d (7)       [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Conv2d (8)          [32, 12, 17, 17]  [32, 12, 17, 17]  1,296     [3, 3]          11,985,408
│    │    │    └─weight                                             └─1,296   [12, 12, 3, 3]
│    │    └─BatchNorm2d (9)     [32, 12, 17, 17]  [32, 12, 17, 17]  24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─ReLU (10)           [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Dropout2d (11)      [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    └─Sequential (trans2)      [32, 12, 17, 17]  [32, 8, 9, 9]     --        --              --
│    │    └─1.weight                                                ├─96      [8, 12, 1, 1]
│    │    └─2.weight                                                ├─8       [8]
│    │    └─2.bias                                                  └─8       [8]
│    │    └─MaxPool2d (0)       [32, 12, 17, 17]  [32, 12, 9, 9]    --        2               --
│    │    └─Conv2d (1)          [32, 12, 9, 9]    [32, 8, 9, 9]     96        [1, 1]          248,832
│    │    │    └─weight                                             └─96      [12, 8, 1, 1]
│    │    └─BatchNorm2d (2)     [32, 8, 9, 9]     [32, 8, 9, 9]     16        --              512
│    │    │    └─weight                                             ├─8       [8]
│    │    │    └─bias                                               └─8       [8]
│    └─Sequential (conv3)       [32, 8, 9, 9]     [32, 12, 9, 9]    --        --              --
│    │    └─0.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─1.weight                                                ├─10      [10]
│    │    └─1.bias                                                  ├─10      [10]
│    │    └─4.weight                                                ├─1,080   [12, 10, 3, 3]
│    │    └─6.weight                                                ├─12      [12]
│    │    └─6.bias                                                  └─12      [12]
│    │    └─Conv2d (0)          [32, 8, 9, 9]     [32, 10, 9, 9]    720       [3, 3]          1,866,240
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)     [32, 10, 9, 9]    [32, 10, 9, 9]    20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─ReLU (2)            [32, 10, 9, 9]    [32, 10, 9, 9]    --        --              --
│    │    └─Dropout2d (3)       [32, 10, 9, 9]    [32, 10, 9, 9]    --        --              --
│    │    └─Conv2d (4)          [32, 10, 9, 9]    [32, 12, 9, 9]    1,080     [3, 3]          2,799,360
│    │    │    └─weight                                             └─1,080   [10, 12, 3, 3]
│    │    └─ReLU (5)            [32, 12, 9, 9]    [32, 12, 9, 9]    --        --              --
│    │    └─BatchNorm2d (6)     [32, 12, 9, 9]    [32, 12, 9, 9]    24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─Dropout2d (7)       [32, 12, 9, 9]    [32, 12, 9, 9]    --        --              --
│    └─Sequential (trans3)      [32, 12, 9, 9]    [32, 10, 4, 4]    --        --              --
│    │    └─0.weight                                                ├─120     [10, 12, 1, 1]
│    │    └─2.weight                                                ├─10      [10]
│    │    └─2.bias                                                  └─10      [10]
│    │    └─Conv2d (0)          [32, 12, 9, 9]    [32, 10, 9, 9]    120       [1, 1]          311,040
│    │    │    └─weight                                             └─120     [12, 10, 1, 1]
│    │    └─MaxPool2d (1)       [32, 10, 9, 9]    [32, 10, 4, 4]    --        2               --
│    │    └─BatchNorm2d (2)     [32, 10, 4, 4]    [32, 10, 4, 4]    20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    └─Sequential (out4)        [32, 10, 4, 4]    [32, 10, 1, 1]    --        --              --
│    │    └─0.weight                                                └─900     [10, 10, 3, 3]
│    │    └─Conv2d (0)          [32, 10, 4, 4]    [32, 10, 4, 4]    900       [3, 3]          460,800
│    │    │    └─weight                                             └─900     [10, 10, 3, 3]
│    │    └─AvgPool2d (1)       [32, 10, 4, 4]    [32, 10, 1, 1]    --        3               --
========================================================================================================================
Total params: 7,988
Trainable params: 7,988
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 77.51
========================================================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 18.40
Params size (MB): 0.03
Estimated Total Size (MB): 18.53
========================================================================================================================
```
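The totals in the summary can be cross-checked by hand. The sketch below is not part of the project: it transcribes the convolution and batch-norm shapes from the table and assumes, as the per-layer numbers in the table suggest, that the convolutions carry no bias term and that each batch-norm layer is counted at one multiply-add per parameter per sample. The names `CONVS`, `BATCH_NORMS`, `conv_params`, and `bn_params` are hypothetical.

```python
# Cross-check of the summary totals, using only values read off the table.
# Assumption: convolutions have no bias (the table lists only `weight` rows).

BATCH_SIZE = 32  # first dimension of every shape in the table

# (name, in_channels, out_channels, kernel_size, out_height, out_width)
CONVS = [
    ("conv1.0", 1, 8, 3, 28, 28),
    ("conv1.4", 8, 10, 3, 28, 28),
    ("conv1.8", 10, 10, 3, 28, 28),
    ("trans1.1", 10, 8, 1, 17, 17),
    ("conv2.0", 8, 10, 3, 17, 17),
    ("conv2.4", 10, 12, 3, 17, 17),
    ("conv2.8", 12, 12, 3, 17, 17),
    ("trans2.1", 12, 8, 1, 9, 9),
    ("conv3.0", 8, 10, 3, 9, 9),
    ("conv3.4", 10, 12, 3, 9, 9),
    ("trans3.0", 12, 10, 1, 9, 9),
    ("out4.0", 10, 10, 3, 4, 4),
]

# (name, channels); each batch-norm layer has one weight and one bias per channel
BATCH_NORMS = [
    ("conv1.2", 8), ("conv1.6", 10), ("conv1.10", 10),
    ("conv2.1", 10), ("conv2.5", 12), ("conv2.9", 12),
    ("trans2.2", 8), ("conv3.1", 10), ("conv3.6", 12),
    ("trans3.2", 10),
]


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a bias-free k x k convolution."""
    return c_in * c_out * k * k


def bn_params(channels: int) -> int:
    """Scale and shift parameters of a batch-norm layer."""
    return 2 * channels


total_params = 0
total_mult_adds = 0

for _, c_in, c_out, k, h, w in CONVS:
    params = conv_params(c_in, c_out, k)
    total_params += params
    # Every weight is applied once per output position and per sample.
    total_mult_adds += params * h * w * BATCH_SIZE

for _, channels in BATCH_NORMS:
    params = bn_params(channels)
    total_params += params
    # Matches the table's per-layer values (e.g. 16 params -> 512 mult-adds).
    total_mult_adds += params * BATCH_SIZE

print(f"Total params: {total_params:,}")
print(f"Total mult-adds (millions): {total_mult_adds / 1e6:.2f}")
```

Under these assumptions the script reproduces the 7,988 parameters and 77.51 million multiply-adds reported above.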
## Training Logs
###### TODO: Implementation
```sh
cd /usr/home/:USER:/UnsolvedMNIST
tensorboard --logdir=logs
```
## Performance Profiling
###### TODO: Implementation
```log
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                                       Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                    aten::cudnn_convolution        69.86%     444.597ms        69.86%     444.597ms      37.050ms            12
                         aten::_log_softmax         7.52%      47.831ms         7.52%      47.831ms      47.831ms             1
                            aten::clamp_min         4.42%      28.104ms         4.42%      28.104ms       3.513ms             8
                     aten::cudnn_batch_norm         4.13%      26.264ms         4.20%      26.758ms       2.676ms            10
                                 aten::add_         3.47%      22.086ms         3.47%      22.086ms       2.209ms            10
                           aten::bernoulli_         2.79%      17.777ms         2.79%      17.777ms       2.222ms             8
                                 aten::div_         2.53%      16.126ms         2.53%      16.126ms       2.016ms             8
                                  aten::mul         2.29%      14.584ms         2.29%      14.584ms       1.823ms             8
                           aten::avg_pool2d         0.63%       4.009ms         0.63%       4.009ms       4.009ms             1
              aten::max_pool2d_with_indices         0.54%       3.446ms         0.54%       3.446ms       1.149ms             3
                          aten::convolution         0.39%       2.469ms        70.31%     447.487ms      37.291ms            12
                                 aten::relu         0.28%       1.804ms         4.70%      29.908ms       3.739ms             8
               aten::_batch_norm_impl_index         0.22%       1.430ms         4.43%      28.188ms       2.819ms            10
                           aten::batch_norm         0.16%       1.006ms         4.59%      29.194ms       2.919ms            10
                                aten::empty         0.12%     757.000us         0.12%     757.000us      11.828us            64
                           aten::max_pool2d         0.12%     751.000us         0.66%       4.197ms       1.399ms             3
                          aten::log_softmax         0.10%     653.000us         7.62%      48.484ms      48.484ms             1
                               aten::conv2d         0.10%     636.000us        70.41%     448.123ms      37.344ms            12
                      aten::feature_dropout         0.08%     479.000us         7.71%      49.058ms       6.132ms             8
                                aten::copy_         0.07%     447.000us         0.07%     447.000us      63.857us             7
                         aten::_convolution         0.07%     421.000us        69.92%     445.018ms      37.085ms            12
                                   aten::to         0.05%     291.000us         0.13%     843.000us       7.205us           117
                                aten::zeros         0.04%     270.000us         0.08%     523.000us      87.167us             6
                        aten::empty_strided         0.01%      60.000us         0.01%      60.000us       8.571us             7
                             aten::_to_copy         0.01%      45.000us         0.09%     552.000us      78.857us             7
                                 aten::view         0.01%      39.000us         0.01%      39.000us       3.545us            11
                           aten::empty_like         0.00%      31.000us         0.06%     384.000us      38.400us            10
                            aten::new_empty         0.00%      23.000us         0.01%      92.000us      11.500us             8
    aten::_has_compatible_shallow_copy_type         0.00%       2.000us         0.00%       2.000us       0.031us            64
                                aten::zero_         0.00%       1.000us         0.00%       1.000us       0.167us             6
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 636.439ms
```
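To read the table, note that the `Self CPU %` column is each operator's own CPU time divided by the `Self CPU time total` on the last line. The sketch below is illustrative only: it copies the five largest `Self CPU` entries from the table and recomputes their shares. The names `SELF_CPU_MS` and `self_cpu_share` are hypothetical and not part of the project.

```python
# Recompute the "Self CPU %" column for the five most expensive operators.
# All timings are copied from the profiling table (milliseconds).

TOTAL_SELF_CPU_MS = 636.439  # "Self CPU time total" from the table

SELF_CPU_MS = {
    "aten::cudnn_convolution": 444.597,
    "aten::_log_softmax": 47.831,
    "aten::clamp_min": 28.104,
    "aten::cudnn_batch_norm": 26.264,
    "aten::add_": 22.086,
}


def self_cpu_share(self_ms: float, total_ms: float = TOTAL_SELF_CPU_MS) -> float:
    """Percentage of the total self CPU time spent inside one operator."""
    return 100.0 * self_ms / total_ms


shares = {name: self_cpu_share(ms) for name, ms in SELF_CPU_MS.items()}

for name, share in shares.items():
    print(f"{name:<28}{share:6.2f}%")

# Combined share of the five operators listed above.
top5_share = sum(shares.values())
print(f"{'top 5 combined':<28}{top5_share:6.2f}%")
```

The recomputed shares agree with the table, and these five operators together account for roughly 89% of the self CPU time, most of it in `aten::cudnn_convolution`.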
# Contribution
###### TODO: Implementation