
title: UnsolvedMNIST
emoji: 🌍
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.37.2
app_file: app.py
pinned: false
license: mit

The Unsolved MNIST 🔒

Modified National Institute of Standards and Technology Dataset

TODO: Implementation

Description

TODO: Implementation

Setup

TODO: Implementation

Objective

TODO: Implementation

Logs

TODO: Implementation

Model Summary

========================================================================================================================
Layer (type (var_name))                  Input Shape      Output Shape     Param #          Kernel Shape     Mult-Adds
========================================================================================================================
LitMNISTModel (LitMNISTModel)            [32, 1, 28, 28]  [32, 10]         --               --               --
├─Net (model)                            [32, 1, 28, 28]  [32, 10]         --               --               --
│    └─conv1.0.weight                                                      ├─72             [8, 1, 3, 3]
│    └─conv1.2.weight                                                      ├─8              [8]
│    └─conv1.2.bias                                                        ├─8              [8]
│    └─conv1.4.weight                                                      ├─720            [10, 8, 3, 3]
│    └─conv1.6.weight                                                      ├─10             [10]
│    └─conv1.6.bias                                                        ├─10             [10]
│    └─conv1.8.weight                                                      ├─900            [10, 10, 3, 3]
│    └─conv1.10.weight                                                     ├─10             [10]
│    └─conv1.10.bias                                                       ├─10             [10]
│    └─trans1.1.weight                                                     ├─80             [8, 10, 1, 1]
│    └─conv2.0.weight                                                      ├─720            [10, 8, 3, 3]
│    └─conv2.1.weight                                                      ├─10             [10]
│    └─conv2.1.bias                                                        ├─10             [10]
│    └─conv2.4.weight                                                      ├─1,080          [12, 10, 3, 3]
│    └─conv2.5.weight                                                      ├─12             [12]
│    └─conv2.5.bias                                                        ├─12             [12]
│    └─conv2.8.weight                                                      ├─1,296          [12, 12, 3, 3]
│    └─conv2.9.weight                                                      ├─12             [12]
│    └─conv2.9.bias                                                        ├─12             [12]
│    └─trans2.1.weight                                                     ├─96             [8, 12, 1, 1]
│    └─trans2.2.weight                                                     ├─8              [8]
│    └─trans2.2.bias                                                       ├─8              [8]
│    └─conv3.0.weight                                                      ├─720            [10, 8, 3, 3]
│    └─conv3.1.weight                                                      ├─10             [10]
│    └─conv3.1.bias                                                        ├─10             [10]
│    └─conv3.4.weight                                                      ├─1,080          [12, 10, 3, 3]
│    └─conv3.6.weight                                                      ├─12             [12]
│    └─conv3.6.bias                                                        ├─12             [12]
│    └─trans3.0.weight                                                     ├─120            [10, 12, 1, 1]
│    └─trans3.2.weight                                                     ├─10             [10]
│    └─trans3.2.bias                                                       ├─10             [10]
│    └─out4.0.weight                                                       └─900            [10, 10, 3, 3]
│    └─Sequential (conv1)                [32, 1, 28, 28]  [32, 10, 28, 28] --               --               --
│    │    └─0.weight                                                       ├─72             [8, 1, 3, 3]
│    │    └─2.weight                                                       ├─8              [8]
│    │    └─2.bias                                                         ├─8              [8]
│    │    └─4.weight                                                       ├─720            [10, 8, 3, 3]
│    │    └─6.weight                                                       ├─10             [10]
│    │    └─6.bias                                                         ├─10             [10]
│    │    └─8.weight                                                       ├─900            [10, 10, 3, 3]
│    │    └─10.weight                                                      ├─10             [10]
│    │    └─10.bias                                                        └─10             [10]
│    │    └─Conv2d (0)                   [32, 1, 28, 28]  [32, 8, 28, 28]  72               [3, 3]           1,806,336
│    │    │    └─weight                                                    └─72             [1, 8, 3, 3]
│    │    └─ReLU (1)                     [32, 8, 28, 28]  [32, 8, 28, 28]  --               --               --
│    │    └─BatchNorm2d (2)              [32, 8, 28, 28]  [32, 8, 28, 28]  16               --               512
│    │    │    └─weight                                                    ├─8              [8]
│    │    │    └─bias                                                      └─8              [8]
│    │    └─Dropout2d (3)                [32, 8, 28, 28]  [32, 8, 28, 28]  --               --               --
│    │    └─Conv2d (4)                   [32, 8, 28, 28]  [32, 10, 28, 28] 720              [3, 3]           18,063,360
│    │    │    └─weight                                                    └─720            [8, 10, 3, 3]
│    │    └─ReLU (5)                     [32, 10, 28, 28] [32, 10, 28, 28] --               --               --
│    │    └─BatchNorm2d (6)              [32, 10, 28, 28] [32, 10, 28, 28] 20               --               640
│    │    │    └─weight                                                    ├─10             [10]
│    │    │    └─bias                                                      └─10             [10]
│    │    └─Dropout2d (7)                [32, 10, 28, 28] [32, 10, 28, 28] --               --               --
│    │    └─Conv2d (8)                   [32, 10, 28, 28] [32, 10, 28, 28] 900              [3, 3]           22,579,200
│    │    │    └─weight                                                    └─900            [10, 10, 3, 3]
│    │    └─ReLU (9)                     [32, 10, 28, 28] [32, 10, 28, 28] --               --               --
│    │    └─BatchNorm2d (10)             [32, 10, 28, 28] [32, 10, 28, 28] 20               --               640
│    │    │    └─weight                                                    ├─10             [10]
│    │    │    └─bias                                                      └─10             [10]
│    │    └─Dropout2d (11)               [32, 10, 28, 28] [32, 10, 28, 28] --               --               --
│    └─Sequential (trans1)               [32, 10, 28, 28] [32, 8, 17, 17]  --               --               --
│    │    └─1.weight                                                       └─80             [8, 10, 1, 1]
│    │    └─MaxPool2d (0)                [32, 10, 28, 28] [32, 10, 15, 15] --               2                --
│    │    └─Conv2d (1)                   [32, 10, 15, 15] [32, 8, 17, 17]  80               [1, 1]           739,840
│    │    │    └─weight                                                    └─80             [10, 8, 1, 1]
│    └─Sequential (conv2)                [32, 8, 17, 17]  [32, 12, 17, 17] --               --               --
│    │    └─0.weight                                                       ├─720            [10, 8, 3, 3]
│    │    └─1.weight                                                       ├─10             [10]
│    │    └─1.bias                                                         ├─10             [10]
│    │    └─4.weight                                                       ├─1,080          [12, 10, 3, 3]
│    │    └─5.weight                                                       ├─12             [12]
│    │    └─5.bias                                                         ├─12             [12]
│    │    └─8.weight                                                       ├─1,296          [12, 12, 3, 3]
│    │    └─9.weight                                                       ├─12             [12]
│    │    └─9.bias                                                         └─12             [12]
│    │    └─Conv2d (0)                   [32, 8, 17, 17]  [32, 10, 17, 17] 720              [3, 3]           6,658,560
│    │    │    └─weight                                                    └─720            [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)              [32, 10, 17, 17] [32, 10, 17, 17] 20               --               640
│    │    │    └─weight                                                    ├─10             [10]
│    │    │    └─bias                                                      └─10             [10]
│    │    └─ReLU (2)                     [32, 10, 17, 17] [32, 10, 17, 17] --               --               --
│    │    └─Dropout2d (3)                [32, 10, 17, 17] [32, 10, 17, 17] --               --               --
│    │    └─Conv2d (4)                   [32, 10, 17, 17] [32, 12, 17, 17] 1,080            [3, 3]           9,987,840
│    │    │    └─weight                                                    └─1,080          [10, 12, 3, 3]
│    │    └─BatchNorm2d (5)              [32, 12, 17, 17] [32, 12, 17, 17] 24               --               768
│    │    │    └─weight                                                    ├─12             [12]
│    │    │    └─bias                                                      └─12             [12]
│    │    └─ReLU (6)                     [32, 12, 17, 17] [32, 12, 17, 17] --               --               --
│    │    └─Dropout2d (7)                [32, 12, 17, 17] [32, 12, 17, 17] --               --               --
│    │    └─Conv2d (8)                   [32, 12, 17, 17] [32, 12, 17, 17] 1,296            [3, 3]           11,985,408
│    │    │    └─weight                                                    └─1,296          [12, 12, 3, 3]
│    │    └─BatchNorm2d (9)              [32, 12, 17, 17] [32, 12, 17, 17] 24               --               768
│    │    │    └─weight                                                    ├─12             [12]
│    │    │    └─bias                                                      └─12             [12]
│    │    └─ReLU (10)                    [32, 12, 17, 17] [32, 12, 17, 17] --               --               --
│    │    └─Dropout2d (11)               [32, 12, 17, 17] [32, 12, 17, 17] --               --               --
│    └─Sequential (trans2)               [32, 12, 17, 17] [32, 8, 9, 9]    --               --               --
│    │    └─1.weight                                                       ├─96             [8, 12, 1, 1]
│    │    └─2.weight                                                       ├─8              [8]
│    │    └─2.bias                                                         └─8              [8]
│    │    └─MaxPool2d (0)                [32, 12, 17, 17] [32, 12, 9, 9]   --               2                --
│    │    └─Conv2d (1)                   [32, 12, 9, 9]   [32, 8, 9, 9]    96               [1, 1]           248,832
│    │    │    └─weight                                                    └─96             [12, 8, 1, 1]
│    │    └─BatchNorm2d (2)              [32, 8, 9, 9]    [32, 8, 9, 9]    16               --               512
│    │    │    └─weight                                                    ├─8              [8]
│    │    │    └─bias                                                      └─8              [8]
│    └─Sequential (conv3)                [32, 8, 9, 9]    [32, 12, 9, 9]   --               --               --
│    │    └─0.weight                                                       ├─720            [10, 8, 3, 3]
│    │    └─1.weight                                                       ├─10             [10]
│    │    └─1.bias                                                         ├─10             [10]
│    │    └─4.weight                                                       ├─1,080          [12, 10, 3, 3]
│    │    └─6.weight                                                       ├─12             [12]
│    │    └─6.bias                                                         └─12             [12]
│    │    └─Conv2d (0)                   [32, 8, 9, 9]    [32, 10, 9, 9]   720              [3, 3]           1,866,240
│    │    │    └─weight                                                    └─720            [8, 10, 3, 3]
│    │    └─BatchNorm2d (1)              [32, 10, 9, 9]   [32, 10, 9, 9]   20               --               640
│    │    │    └─weight                                                    ├─10             [10]
│    │    │    └─bias                                                      └─10             [10]
│    │    └─ReLU (2)                     [32, 10, 9, 9]   [32, 10, 9, 9]   --               --               --
│    │    └─Dropout2d (3)                [32, 10, 9, 9]   [32, 10, 9, 9]   --               --               --
│    │    └─Conv2d (4)                   [32, 10, 9, 9]   [32, 12, 9, 9]   1,080            [3, 3]           2,799,360
│    │    │    └─weight                                                    └─1,080          [10, 12, 3, 3]
│    │    └─ReLU (5)                     [32, 12, 9, 9]   [32, 12, 9, 9]   --               --               --
│    │    └─BatchNorm2d (6)              [32, 12, 9, 9]   [32, 12, 9, 9]   24               --               768
│    │    │    └─weight                                                    ├─12             [12]
│    │    │    └─bias                                                      └─12             [12]
│    │    └─Dropout2d (7)                [32, 12, 9, 9]   [32, 12, 9, 9]   --               --               --
│    └─Sequential (trans3)               [32, 12, 9, 9]   [32, 10, 4, 4]   --               --               --
│    │    └─0.weight                                                       ├─120            [10, 12, 1, 1]
│    │    └─2.weight                                                       ├─10             [10]
│    │    └─2.bias                                                         └─10             [10]
│    │    └─Conv2d (0)                   [32, 12, 9, 9]   [32, 10, 9, 9]   120              [1, 1]           311,040
│    │    │    └─weight                                                    └─120            [12, 10, 1, 1]
│    │    └─MaxPool2d (1)                [32, 10, 9, 9]   [32, 10, 4, 4]   --               2                --
│    │    └─BatchNorm2d (2)              [32, 10, 4, 4]   [32, 10, 4, 4]   20               --               640
│    │    │    └─weight                                                    ├─10             [10]
│    │    │    └─bias                                                      └─10             [10]
│    └─Sequential (out4)                 [32, 10, 4, 4]   [32, 10, 1, 1]   --               --               --
│    │    └─0.weight                                                       └─900            [10, 10, 3, 3]
│    │    └─Conv2d (0)                   [32, 10, 4, 4]   [32, 10, 4, 4]   900              [3, 3]           460,800
│    │    │    └─weight                                                    └─900            [10, 10, 3, 3]
│    │    └─AvgPool2d (1)                [32, 10, 4, 4]   [32, 10, 1, 1]   --               3                --
========================================================================================================================
Total params: 7,988
Trainable params: 7,988
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 77.51
========================================================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 18.40
Params size (MB): 0.03
Estimated Total Size (MB): 18.53
========================================================================================================================
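
The totals in the summary above can be cross-checked by hand. The sketch below is not part of the project code: it copies the weight shapes from the flat parameter list in the "Kernel Shape" column and recomputes the parameter count, the reported sizes, and the mult-adds of the first convolution. It assumes bias-free convolutions (no conv bias rows appear in the summary), float32 parameters, and 1 MB = 10^6 bytes. The dictionary and function names are made up for illustration.

```python
from math import prod

# Conv weight shapes from the summary's flat parameter list: (out_ch, in_ch, kH, kW).
# Assumption: the convolutions carry no bias, since the summary lists none.
CONV_WEIGHT_SHAPES = {
    "conv1.0": (8, 1, 3, 3),
    "conv1.4": (10, 8, 3, 3),
    "conv1.8": (10, 10, 3, 3),
    "trans1.1": (8, 10, 1, 1),
    "conv2.0": (10, 8, 3, 3),
    "conv2.4": (12, 10, 3, 3),
    "conv2.8": (12, 12, 3, 3),
    "trans2.1": (8, 12, 1, 1),
    "conv3.0": (10, 8, 3, 3),
    "conv3.4": (12, 10, 3, 3),
    "trans3.0": (10, 12, 1, 1),
    "out4.0": (10, 10, 3, 3),
}

# BatchNorm2d channel counts; each layer has one weight and one bias per channel.
BATCHNORM_CHANNELS = {
    "conv1.2": 8, "conv1.6": 10, "conv1.10": 10,
    "conv2.1": 10, "conv2.5": 12, "conv2.9": 12,
    "trans2.2": 8,
    "conv3.1": 10, "conv3.6": 12,
    "trans3.2": 10,
}

BYTES_PER_FLOAT32 = 4  # assumption: parameters and inputs are float32


def conv_mult_adds(batch, out_hw, weight_shape):
    """Multiply-accumulates of a bias-free Conv2d: one per weight per output pixel."""
    out_ch, in_ch, kh, kw = weight_shape
    height, width = out_hw
    return batch * out_ch * height * width * in_ch * kh * kw


conv_params = sum(prod(shape) for shape in CONV_WEIGHT_SHAPES.values())
bn_params = sum(2 * channels for channels in BATCHNORM_CHANNELS.values())
total_params = conv_params + bn_params

params_mb = total_params * BYTES_PER_FLOAT32 / 1e6
input_mb = 32 * 1 * 28 * 28 * BYTES_PER_FLOAT32 / 1e6
first_conv_macs = conv_mult_adds(32, (28, 28), CONV_WEIGHT_SHAPES["conv1.0"])

print(f"conv params:      {conv_params}")
print(f"batchnorm params: {bn_params}")
print(f"total params:     {total_params}")
print(f"params size (MB): {params_mb:.2f}")
print(f"input size (MB):  {input_mb:.2f}")
print(f"conv1.0 mult-adds: {first_conv_macs:,}")
```

Under these assumptions the script reproduces the 7,988 total parameters, the 0.03 MB parameter size, the 0.10 MB input size, and the 1,806,336 mult-adds listed for the first Conv2d.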

Training Logs

TODO: Implementation
cd /usr/home/:USER:/UnsolvedMNIST
tensorboard --logdir=logs

Performance Profiling

TODO: Implementation
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  
                                       Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls  
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  
                    aten::cudnn_convolution        69.86%     444.597ms        69.86%     444.597ms      37.050ms            12  
                         aten::_log_softmax         7.52%      47.831ms         7.52%      47.831ms      47.831ms             1  
                            aten::clamp_min         4.42%      28.104ms         4.42%      28.104ms       3.513ms             8  
                     aten::cudnn_batch_norm         4.13%      26.264ms         4.20%      26.758ms       2.676ms            10  
                                 aten::add_         3.47%      22.086ms         3.47%      22.086ms       2.209ms            10  
                           aten::bernoulli_         2.79%      17.777ms         2.79%      17.777ms       2.222ms             8  
                                 aten::div_         2.53%      16.126ms         2.53%      16.126ms       2.016ms             8  
                                  aten::mul         2.29%      14.584ms         2.29%      14.584ms       1.823ms             8  
                           aten::avg_pool2d         0.63%       4.009ms         0.63%       4.009ms       4.009ms             1  
              aten::max_pool2d_with_indices         0.54%       3.446ms         0.54%       3.446ms       1.149ms             3  
                          aten::convolution         0.39%       2.469ms        70.31%     447.487ms      37.291ms            12  
                                 aten::relu         0.28%       1.804ms         4.70%      29.908ms       3.739ms             8  
               aten::_batch_norm_impl_index         0.22%       1.430ms         4.43%      28.188ms       2.819ms            10  
                           aten::batch_norm         0.16%       1.006ms         4.59%      29.194ms       2.919ms            10  
                                aten::empty         0.12%     757.000us         0.12%     757.000us      11.828us            64  
                           aten::max_pool2d         0.12%     751.000us         0.66%       4.197ms       1.399ms             3  
                          aten::log_softmax         0.10%     653.000us         7.62%      48.484ms      48.484ms             1  
                               aten::conv2d         0.10%     636.000us        70.41%     448.123ms      37.344ms            12  
                      aten::feature_dropout         0.08%     479.000us         7.71%      49.058ms       6.132ms             8  
                                aten::copy_         0.07%     447.000us         0.07%     447.000us      63.857us             7  
                         aten::_convolution         0.07%     421.000us        69.92%     445.018ms      37.085ms            12  
                                   aten::to         0.05%     291.000us         0.13%     843.000us       7.205us           117  
                                aten::zeros         0.04%     270.000us         0.08%     523.000us      87.167us             6  
                        aten::empty_strided         0.01%      60.000us         0.01%      60.000us       8.571us             7  
                             aten::_to_copy         0.01%      45.000us         0.09%     552.000us      78.857us             7  
                                 aten::view         0.01%      39.000us         0.01%      39.000us       3.545us            11  
                           aten::empty_like         0.00%      31.000us         0.06%     384.000us      38.400us            10  
                            aten::new_empty         0.00%      23.000us         0.01%      92.000us      11.500us             8  
    aten::_has_compatible_shallow_copy_type         0.00%       2.000us         0.00%       2.000us       0.031us            64  
                                aten::zero_         0.00%       1.000us         0.00%       1.000us       0.167us             6  
-------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  
Self CPU time total: 636.439ms
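
When reading the table above, note that the "Self CPU" column excludes time spent in child operators, so it adds up to the reported total, whereas "CPU total" counts children again (for example, `aten::conv2d` includes `aten::cudnn_convolution`). The sketch below is only an illustration: it copies the Self CPU values from the table in microseconds and checks that they sum to 636.439ms, then ranks the most expensive operators. The variable names are hypothetical.

```python
# Self CPU time per operator, copied from the table above, in microseconds.
SELF_CPU_US = {
    "aten::cudnn_convolution": 444_597,
    "aten::_log_softmax": 47_831,
    "aten::clamp_min": 28_104,
    "aten::cudnn_batch_norm": 26_264,
    "aten::add_": 22_086,
    "aten::bernoulli_": 17_777,
    "aten::div_": 16_126,
    "aten::mul": 14_584,
    "aten::avg_pool2d": 4_009,
    "aten::max_pool2d_with_indices": 3_446,
    "aten::convolution": 2_469,
    "aten::relu": 1_804,
    "aten::_batch_norm_impl_index": 1_430,
    "aten::batch_norm": 1_006,
    "aten::empty": 757,
    "aten::max_pool2d": 751,
    "aten::log_softmax": 653,
    "aten::conv2d": 636,
    "aten::feature_dropout": 479,
    "aten::copy_": 447,
    "aten::_convolution": 421,
    "aten::to": 291,
    "aten::zeros": 270,
    "aten::empty_strided": 60,
    "aten::_to_copy": 45,
    "aten::view": 39,
    "aten::empty_like": 31,
    "aten::new_empty": 23,
    "aten::_has_compatible_shallow_copy_type": 2,
    "aten::zero_": 1,
}

REPORTED_TOTAL_US = 636_439  # "Self CPU time total: 636.439ms"

# Self times are exclusive of children, so they should add up to the total.
total_us = sum(SELF_CPU_US.values())

# Share of the single most expensive operator.
conv_share = 100 * SELF_CPU_US["aten::cudnn_convolution"] / total_us

# Three operators with the largest self time.
top3 = sorted(SELF_CPU_US, key=SELF_CPU_US.get, reverse=True)[:3]

print(f"sum of self CPU: {total_us / 1000:.3f}ms")
print(f"matches reported total: {total_us == REPORTED_TOTAL_US}")
print(f"cudnn_convolution share: {conv_share:.2f}%")
print("top 3:", ", ".join(top3))
```

In this run the convolution kernels account for roughly 70% of the self CPU time, so they are the first place to look when tuning.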

Contribution

TODO: Implementation