metadata
title: UnsolvedMNIST
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.37.2
app_file: app.py
pinned: false
license: mit
The Unsolved MNIST
Modified National Institute of Standards and Technology Dataset
TODO: Implementation
Description
TODO: Implementation
Setup
TODO: Implementation
Objective
TODO: Implementation
Logs
TODO: Implementation
Model Summary
========================================================================================================================
Layer (type (var_name))         Input Shape       Output Shape      Param #   Kernel Shape    Mult-Adds
========================================================================================================================
LitMNISTModel (LitMNISTModel)   [32, 1, 28, 28]   [32, 10]          --        --              --
├─Net (model)                   [32, 1, 28, 28]   [32, 10]          --        --              --
│    └─conv1.0.weight                                               ├─72      [8, 1, 3, 3]
│    └─conv1.2.weight                                               ├─8       [8]
│    └─conv1.2.bias                                                 ├─8       [8]
│    └─conv1.4.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv1.6.weight                                               ├─10      [10]
│    └─conv1.6.bias                                                 ├─10      [10]
│    └─conv1.8.weight                                               ├─900     [10, 10, 3, 3]
│    └─conv1.10.weight                                              ├─10      [10]
│    └─conv1.10.bias                                                ├─10      [10]
│    └─trans1.1.weight                                              ├─80      [8, 10, 1, 1]
│    └─conv2.0.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv2.1.weight                                               ├─10      [10]
│    └─conv2.1.bias                                                 ├─10      [10]
│    └─conv2.4.weight                                               ├─1,080   [12, 10, 3, 3]
│    └─conv2.5.weight                                               ├─12      [12]
│    └─conv2.5.bias                                                 ├─12      [12]
│    └─conv2.8.weight                                               ├─1,296   [12, 12, 3, 3]
│    └─conv2.9.weight                                               ├─12      [12]
│    └─conv2.9.bias                                                 ├─12      [12]
│    └─trans2.1.weight                                              ├─96      [8, 12, 1, 1]
│    └─trans2.2.weight                                              ├─8       [8]
│    └─trans2.2.bias                                                ├─8       [8]
│    └─conv3.0.weight                                               ├─720     [10, 8, 3, 3]
│    └─conv3.1.weight                                               ├─10      [10]
│    └─conv3.1.bias                                                 ├─10      [10]
│    └─conv3.4.weight                                               ├─1,080   [12, 10, 3, 3]
│    └─conv3.6.weight                                               ├─12      [12]
│    └─conv3.6.bias                                                 ├─12      [12]
│    └─trans3.0.weight                                              ├─120     [10, 12, 1, 1]
│    └─trans3.2.weight                                              ├─10      [10]
│    └─trans3.2.bias                                                ├─10      [10]
│    └─out4.0.weight                                                └─900     [10, 10, 3, 3]
│    └─Sequential (conv1)       [32, 1, 28, 28]   [32, 10, 28, 28]  --        --              --
│    │    └─0.weight                                                ├─72      [8, 1, 3, 3]
│    │    └─2.weight                                                ├─8       [8]
│    │    └─2.bias                                                  ├─8       [8]
│    │    └─4.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─6.weight                                                ├─10      [10]
│    │    └─6.bias                                                  ├─10      [10]
│    │    └─8.weight                                                ├─900     [10, 10, 3, 3]
│    │    └─10.weight                                               ├─10      [10]
│    │    └─10.bias                                                 └─10      [10]
│    │    └─Conv2d (0)          [32, 1, 28, 28]   [32, 8, 28, 28]   72        [3, 3]          1,806,336
│    │    │    └─weight                                             └─72      [1, 8, 3, 3]
│    │    └─ReLU (1)            [32, 8, 28, 28]   [32, 8, 28, 28]   --        --              --
│    │    └─BatchNorm2d (2)     [32, 8, 28, 28]   [32, 8, 28, 28]   16        --              512
│    │    │    └─weight                                             ├─8       [8]
│    │    │    └─bias                                               └─8       [8]
│    │    └─Dropout2d (3)       [32, 8, 28, 28]   [32, 8, 28, 28]   --        --              --
│    │    └─Conv2d (4)          [32, 8, 28, 28]   [32, 10, 28, 28]  720       [3, 3]          18,063,360
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─ReLU (5)            [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─BatchNorm2d (6)     [32, 10, 28, 28]  [32, 10, 28, 28]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─Dropout2d (7)       [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─Conv2d (8)          [32, 10, 28, 28]  [32, 10, 28, 28]  900       [3, 3]          22,579,200
│    │    │    └─weight                                             └─900     [10, 10, 3, 3]
│    │    └─ReLU (9)            [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    │    └─BatchNorm2d (10)    [32, 10, 28, 28]  [32, 10, 28, 28]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─Dropout2d (11)      [32, 10, 28, 28]  [32, 10, 28, 28]  --        --              --
│    └─Sequential (trans1)      [32, 10, 28, 28]  [32, 8, 17, 17]   --        --              --
│    │    └─1.weight                                                └─80      [8, 10, 1, 1]
│    │    └─MaxPool2d (0)       [32, 10, 28, 28]  [32, 10, 15, 15]  --        2               --
│    │    └─Conv2d (1)          [32, 10, 15, 15]  [32, 8, 17, 17]   80        [1, 1]          739,840
│    │    │    └─weight                                             └─80      [10, 8, 1, 1]
│    └─Sequential (conv2)       [32, 8, 17, 17]   [32, 12, 17, 17]  --        --              --
│    │    └─0.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─1.weight                                                ├─10      [10]
│    │    └─1.bias                                                  ├─10      [10]
│    │    └─4.weight                                                ├─1,080   [12, 10, 3, 3]
│    │    └─5.weight                                                ├─12      [12]
│    │    └─5.bias                                                  ├─12      [12]
│    │    └─8.weight                                                ├─1,296   [12, 12, 3, 3]
│    │    └─9.weight                                                ├─12      [12]
│    │    └─9.bias                                                  └─12      [12]
│    │    └─Conv2d (0)          [32, 8, 17, 17]   [32, 10, 17, 17]  720       [3, 3]          6,658,560
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)     [32, 10, 17, 17]  [32, 10, 17, 17]  20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─ReLU (2)            [32, 10, 17, 17]  [32, 10, 17, 17]  --        --              --
│    │    └─Dropout2d (3)       [32, 10, 17, 17]  [32, 10, 17, 17]  --        --              --
│    │    └─Conv2d (4)          [32, 10, 17, 17]  [32, 12, 17, 17]  1,080     [3, 3]          9,987,840
│    │    │    └─weight                                             └─1,080   [10, 12, 3, 3]
│    │    └─BatchNorm2d (5)     [32, 12, 17, 17]  [32, 12, 17, 17]  24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─ReLU (6)            [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Dropout2d (7)       [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Conv2d (8)          [32, 12, 17, 17]  [32, 12, 17, 17]  1,296     [3, 3]          11,985,408
│    │    │    └─weight                                             └─1,296   [12, 12, 3, 3]
│    │    └─BatchNorm2d (9)     [32, 12, 17, 17]  [32, 12, 17, 17]  24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─ReLU (10)           [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    │    └─Dropout2d (11)      [32, 12, 17, 17]  [32, 12, 17, 17]  --        --              --
│    └─Sequential (trans2)      [32, 12, 17, 17]  [32, 8, 9, 9]     --        --              --
│    │    └─1.weight                                                ├─96      [8, 12, 1, 1]
│    │    └─2.weight                                                ├─8       [8]
│    │    └─2.bias                                                  └─8       [8]
│    │    └─MaxPool2d (0)       [32, 12, 17, 17]  [32, 12, 9, 9]    --        2               --
│    │    └─Conv2d (1)          [32, 12, 9, 9]    [32, 8, 9, 9]     96        [1, 1]          248,832
│    │    │    └─weight                                             └─96      [12, 8, 1, 1]
│    │    └─BatchNorm2d (2)     [32, 8, 9, 9]     [32, 8, 9, 9]     16        --              512
│    │    │    └─weight                                             ├─8       [8]
│    │    │    └─bias                                               └─8       [8]
│    └─Sequential (conv3)       [32, 8, 9, 9]     [32, 12, 9, 9]    --        --              --
│    │    └─0.weight                                                ├─720     [10, 8, 3, 3]
│    │    └─1.weight                                                ├─10      [10]
│    │    └─1.bias                                                  ├─10      [10]
│    │    └─4.weight                                                ├─1,080   [12, 10, 3, 3]
│    │    └─6.weight                                                ├─12      [12]
│    │    └─6.bias                                                  └─12      [12]
│    │    └─Conv2d (0)          [32, 8, 9, 9]     [32, 10, 9, 9]    720       [3, 3]          1,866,240
│    │    │    └─weight                                             └─720     [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)     [32, 10, 9, 9]    [32, 10, 9, 9]    20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    │    └─ReLU (2)            [32, 10, 9, 9]    [32, 10, 9, 9]    --        --              --
│    │    └─Dropout2d (3)       [32, 10, 9, 9]    [32, 10, 9, 9]    --        --              --
│    │    └─Conv2d (4)          [32, 10, 9, 9]    [32, 12, 9, 9]    1,080     [3, 3]          2,799,360
│    │    │    └─weight                                             └─1,080   [10, 12, 3, 3]
│    │    └─ReLU (5)            [32, 12, 9, 9]    [32, 12, 9, 9]    --        --              --
│    │    └─BatchNorm2d (6)     [32, 12, 9, 9]    [32, 12, 9, 9]    24        --              768
│    │    │    └─weight                                             ├─12      [12]
│    │    │    └─bias                                               └─12      [12]
│    │    └─Dropout2d (7)       [32, 12, 9, 9]    [32, 12, 9, 9]    --        --              --
│    └─Sequential (trans3)      [32, 12, 9, 9]    [32, 10, 4, 4]    --        --              --
│    │    └─0.weight                                                ├─120     [10, 12, 1, 1]
│    │    └─2.weight                                                ├─10      [10]
│    │    └─2.bias                                                  └─10      [10]
│    │    └─Conv2d (0)          [32, 12, 9, 9]    [32, 10, 9, 9]    120       [1, 1]          311,040
│    │    │    └─weight                                             └─120     [12, 10, 1, 1]
│    │    └─MaxPool2d (1)       [32, 10, 9, 9]    [32, 10, 4, 4]    --        2               --
│    │    └─BatchNorm2d (2)     [32, 10, 4, 4]    [32, 10, 4, 4]    20        --              640
│    │    │    └─weight                                             ├─10      [10]
│    │    │    └─bias                                               └─10      [10]
│    └─Sequential (out4)        [32, 10, 4, 4]    [32, 10, 1, 1]    --        --              --
│    │    └─0.weight                                                └─900     [10, 10, 3, 3]
│    │    └─Conv2d (0)          [32, 10, 4, 4]    [32, 10, 4, 4]    900       [3, 3]          460,800
│    │    │    └─weight                                             └─900     [10, 10, 3, 3]
│    │    └─AvgPool2d (1)       [32, 10, 4, 4]    [32, 10, 1, 1]    --        3               --
========================================================================================================================
Total params: 7,988
Trainable params: 7,988
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 77.51
========================================================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 18.40
Params size (MB): 0.03
Estimated Total Size (MB): 18.53
========================================================================================================================
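The totals in the summary can be cross-checked by hand. The sketch below is not part of the project code: it copies the channel counts, kernel sizes, and output resolutions from the rows above and assumes that every convolution is bias-free (no conv bias rows are listed) and that the Mult-Adds column is the parameter count multiplied by the output positions and the batch size of 32. The names `CONVS`, `BATCH_NORM_CHANNELS`, `conv_params`, and `conv_mult_adds` are illustrative only.

```python
# Convolutions read off the summary: (name, in_ch, out_ch, kernel, out_h, out_w).
# Assumption: every Conv2d is bias-free, since no conv bias rows are listed.
CONVS = [
    ("conv1.0", 1, 8, 3, 28, 28),
    ("conv1.4", 8, 10, 3, 28, 28),
    ("conv1.8", 10, 10, 3, 28, 28),
    ("trans1.1", 10, 8, 1, 17, 17),
    ("conv2.0", 8, 10, 3, 17, 17),
    ("conv2.4", 10, 12, 3, 17, 17),
    ("conv2.8", 12, 12, 3, 17, 17),
    ("trans2.1", 12, 8, 1, 9, 9),
    ("conv3.0", 8, 10, 3, 9, 9),
    ("conv3.4", 10, 12, 3, 9, 9),
    ("trans3.0", 12, 10, 1, 9, 9),
    ("out4.0", 10, 10, 3, 4, 4),
]
# Channel counts of the ten BatchNorm2d layers (one weight and one bias each).
BATCH_NORM_CHANNELS = [8, 10, 10, 10, 12, 12, 8, 10, 12, 10]
BATCH_SIZE = 32  # leading dimension of every shape in the summary


def conv_params(in_ch, out_ch, kernel):
    """Weights of a bias-free square convolution."""
    return in_ch * out_ch * kernel * kernel


def conv_mult_adds(in_ch, out_ch, kernel, out_h, out_w, batch=BATCH_SIZE):
    """One multiply-add per weight, per output position, per sample."""
    return conv_params(in_ch, out_ch, kernel) * out_h * out_w * batch


total_params = sum(conv_params(i, o, k) for _, i, o, k, _, _ in CONVS)
total_params += sum(2 * c for c in BATCH_NORM_CHANNELS)

total_mult_adds = sum(conv_mult_adds(i, o, k, h, w) for _, i, o, k, h, w in CONVS)
# The summary lists params * batch for each BatchNorm2d row (e.g. 16 * 32 = 512).
total_mult_adds += sum(2 * c * BATCH_SIZE for c in BATCH_NORM_CHANNELS)

print(f"Total params: {total_params:,}")                     # Total params: 7,988
print(f"Total mult-adds (M): {total_mult_adds / 1e6:.2f}")   # Total mult-adds (M): 77.51
```

Both figures agree with the summary, which suggests the 7,988 parameters split into 7,784 convolution weights and 204 batch-norm parameters.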
Training Logs
TODO: Implementation
cd /usr/home/:USER:/UnsolvedMNIST
tensorboard --logdir=logs
Performance Profiling
TODO: Implementation
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
Name Self CPU % Self CPU CPU total % CPU total CPU time avg # of Calls
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
aten::cudnn_convolution 69.86% 444.597ms 69.86% 444.597ms 37.050ms 12
aten::_log_softmax 7.52% 47.831ms 7.52% 47.831ms 47.831ms 1
aten::clamp_min 4.42% 28.104ms 4.42% 28.104ms 3.513ms 8
aten::cudnn_batch_norm 4.13% 26.264ms 4.20% 26.758ms 2.676ms 10
aten::add_ 3.47% 22.086ms 3.47% 22.086ms 2.209ms 10
aten::bernoulli_ 2.79% 17.777ms 2.79% 17.777ms 2.222ms 8
aten::div_ 2.53% 16.126ms 2.53% 16.126ms 2.016ms 8
aten::mul 2.29% 14.584ms 2.29% 14.584ms 1.823ms 8
aten::avg_pool2d 0.63% 4.009ms 0.63% 4.009ms 4.009ms 1
aten::max_pool2d_with_indices 0.54% 3.446ms 0.54% 3.446ms 1.149ms 3
aten::convolution 0.39% 2.469ms 70.31% 447.487ms 37.291ms 12
aten::relu 0.28% 1.804ms 4.70% 29.908ms 3.739ms 8
aten::_batch_norm_impl_index 0.22% 1.430ms 4.43% 28.188ms 2.819ms 10
aten::batch_norm 0.16% 1.006ms 4.59% 29.194ms 2.919ms 10
aten::empty 0.12% 757.000us 0.12% 757.000us 11.828us 64
aten::max_pool2d 0.12% 751.000us 0.66% 4.197ms 1.399ms 3
aten::log_softmax 0.10% 653.000us 7.62% 48.484ms 48.484ms 1
aten::conv2d 0.10% 636.000us 70.41% 448.123ms 37.344ms 12
aten::feature_dropout 0.08% 479.000us 7.71% 49.058ms 6.132ms 8
aten::copy_ 0.07% 447.000us 0.07% 447.000us 63.857us 7
aten::_convolution 0.07% 421.000us 69.92% 445.018ms 37.085ms 12
aten::to 0.05% 291.000us 0.13% 843.000us 7.205us 117
aten::zeros 0.04% 270.000us 0.08% 523.000us 87.167us 6
aten::empty_strided 0.01% 60.000us 0.01% 60.000us 8.571us 7
aten::_to_copy 0.01% 45.000us 0.09% 552.000us 78.857us 7
aten::view 0.01% 39.000us 0.01% 39.000us 3.545us 11
aten::empty_like 0.00% 31.000us 0.06% 384.000us 38.400us 10
aten::new_empty 0.00% 23.000us 0.01% 92.000us 11.500us 8
aten::_has_compatible_shallow_copy_type 0.00% 2.000us 0.00% 2.000us 0.031us 64
aten::zero_ 0.00% 1.000us 0.00% 1.000us 0.167us 6
------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
Self CPU time total: 636.439ms
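To read the table, it helps to see how the percentage and average columns relate to the raw timings. The following is a minimal sketch using a handful of values copied from the rows above; the helper names `self_cpu_share` and `avg_ms_per_call` are hypothetical and are not part of the PyTorch profiler API.

```python
# (self CPU time in ms, number of calls) for the five largest entries above.
TOP_OPS = {
    "aten::cudnn_convolution": (444.597, 12),
    "aten::_log_softmax": (47.831, 1),
    "aten::clamp_min": (28.104, 8),
    "aten::cudnn_batch_norm": (26.264, 10),
    "aten::add_": (22.086, 10),
}
TOTAL_SELF_CPU_MS = 636.439  # "Self CPU time total" from the last line


def self_cpu_share(self_ms, total_ms=TOTAL_SELF_CPU_MS):
    """Share of the total self CPU time, in percent (the 'Self CPU %' column)."""
    return 100.0 * self_ms / total_ms


def avg_ms_per_call(total_ms, calls):
    """Average time per call (the 'CPU time avg' column)."""
    return total_ms / calls


for name, (self_ms, calls) in TOP_OPS.items():
    print(f"{name:<26} {self_cpu_share(self_ms):6.2f}%  {calls:>3} calls")

top_five_share = sum(self_cpu_share(ms) for ms, _ in TOP_OPS.values())
print(f"Top five operators: {top_five_share:.1f}% of self CPU time")
```

On these numbers the convolution kernel alone accounts for roughly 70% of the measured self CPU time, so it is the first place to look when optimising the forward pass.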