Muthukamalan committed · Commit 221a558 · 1 Parent(s): 3bd4938

Add PyTorch profiler and allow_tf32

Files changed (2)
  1. README.md +10 -6
  2. train.py +2 -1
README.md CHANGED
@@ -12,14 +12,19 @@ license: mit
 
 # The Unsolved MNIST 🔢
 **M**odified **N**ational **I**nstitute for **S**tandards and **T**echnology Dataset
+###### TODO: Implementation
 
 # Description
+###### TODO: Implementation
 
 # Setup
+###### TODO: Implementation
 
 # Objective
+###### TODO: Implementation
 
 # Logs
+###### TODO: Implementation
 
 ## Model Summary
 ```log
@@ -187,7 +192,7 @@ Estimated Total Size (MB): 18.53
 ```
 
 ## Training Logs
-
+###### TODO: Implementation
 ```sh
 cd /usr/home/:USER:/UnsolvedMNIST
 tensorboard --logdir=logs
@@ -195,6 +200,7 @@ tensorboard --logdir=logs
 ```
 
 ## Performance Profiling
+###### TODO: Implementation
 ```log
 ------------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------
 Name Self CPU % Self CPU CPU total % CPU total CPU time avg # of Calls
@@ -235,8 +241,6 @@ Self CPU time total: 636.439ms
 ```
 
 
-# Credits:
-- [pytorch_performance_profiling.md](https://gist.github.com/mingfeima/e08310d7e7bb9ae2a693adecf2d8a916)
-- [FLOPs calculation](https://medium.com/@dzmitrybahdanau/the-flops-calculus-of-language-model-training-3b19c1f025e4)
-- [software 2.0](https://karpathy.medium.com/software-2-0-a64152b37c35)
-- [weight init](https://towardsdatascience.com/weight-initialization-in-neural-networks-a-journey-from-the-basics-to-kaiming-954fb9b47c79)
+
+# Contribution
+###### TODO: Implementation
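
The Performance Profiling log in the README has the column layout (Self CPU %, CPU total, # of Calls) of a `torch.profiler` report. A table of that kind can be produced roughly as sketched below. The tiny `model` and random `batch` are hypothetical stand-ins, not the network or data used in this repository, and the sketch calls `torch.profiler` directly rather than going through the Lightning `Trainer`.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function

# Hypothetical stand-in model: a small MLP over MNIST-shaped inputs.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
batch = torch.randn(32, 1, 28, 28)  # random data, not real MNIST

# Profile a single forward pass on the CPU only.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with record_function("forward"):
        logits = model(batch)

# Aggregate events by operator name and render them as a text table.
report = prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)
print(report)
```

Sorting by `cpu_time_total` puts the operators that account for the most CPU time first; the absolute timings depend on the machine and will not match the figures logged in the README.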
 
 
train.py CHANGED
@@ -23,6 +23,7 @@ from utils import TRAIN_TRANSFORMS, TEST_TRANSFORMS
 
 
 # Auxilary utils
+torch.backends.cuda.matmul.allow_tf32=True
 torch.set_float32_matmul_precision('high')
 torch.cuda.amp.autocast(enabled=True,dtype=torch.float32)
 device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
@@ -122,7 +123,7 @@ summary(
 trainer = pl.Trainer(
     max_epochs=CONFIG['training'].get('num_epochs',15),
     logger=logger,
-    profiler=perf_profiler,#'advanced',
+    profiler='pytorch',#perf_profiler,#'advanced',
     callbacks=call_backs,
     precision=32,
     enable_model_summary=False,
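
The first train.py hunk enables TF32 for CUDA matmuls alongside the existing `torch.set_float32_matmul_precision('high')` call. These two settings appear to govern the same matmul precision on CUDA, so setting both is likely redundant but harmless. The `torch.cuda.amp.autocast(...)` line in the context only constructs a context manager and never enters it, so it has no effect as written. A minimal sketch of how the pieces are usually combined follows; the tensor shapes and the `bfloat16` autocast dtype are assumptions for illustration, not values taken from this repository.

```python
import torch

# Precision flags as set in the diff. They only change behaviour on GPUs
# with TF32 support; setting them on a CPU-only machine is harmless.
torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical operands, chosen only to exercise a matmul.
a = torch.randn(64, 128, device=device)
b = torch.randn(128, 32, device=device)

# autocast takes effect only inside a `with` block.
# bfloat16 is an assumption here; the diff passes dtype=torch.float32.
with torch.autocast(device_type=device.type, dtype=torch.bfloat16):
    out = a @ b

print(out.shape)
```

When training runs through `pl.Trainer`, mixed precision is normally requested with the Trainer's `precision` argument rather than a manual autocast block, so with `precision=32` as in the diff the TF32 flags are the only reduced-precision setting in play.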