justheuristic committed
Commit 81885f7 · 1 Parent(s): b55d469

link to colab

Files changed (1):
  1. static/tabs.html +10 -3
static/tabs.html CHANGED
@@ -89,7 +89,6 @@ a:visited {
 is minimal.
 The combination of offloading and 8-bit optimizers means that we conserve GPU memory (0 bytes per parameter)
 and also use only a limited amount of CPU memory (2 bytes per parameter).
-
 </p>
 <p>
 <b>Dataset Streaming</b>
@@ -98,12 +97,20 @@ a:visited {
 This can pose a significant problem, as most desktop and cheap cloud instance simply do not have that much space.
 Furthermore, downloading the dataset over the internet would take up hours before one can even begin training.
 <!--Changing the dataset means downloading a new dataset in full and using additional disk space.-->
-</p><p>
+</p>
+<p>
 To circumvent these problems, we stream the training dataset in the same way as you stream online videos.
 Participants download a small random portion of the training dataset and immediately begin training on it,
 while additional data is loaded in background. As such, we can train a model with virtually no memory
-overhead from the dataset and switching to a new dataset is as simple as changing an argument to the streamer class.
+overhead from the dataset and switching to a new dataset is as simple as changing an argument to the dataset class.
 </p>
+<center>
+Here's a tutorial for using these techniques:<br>
+<a href="https://colab.research.google.com/gist/justheuristic/75f6a2a731f05a213a55cd2c8a458aaf/fine-tune-a-language-model-with-dataset-streaming-and-8-bit-optimizers.ipynb">
+<img src="https://colab.research.google.com/assets/colab-badge.svg" width=360px>
+</a>
+</center>
+
 </div>
 <div role="tabpanel" class="tab-pane" id="tab2">
 <p>In this section, we discuss common concerns related to security of the collaborative training.</p>
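The streaming behavior the new paragraph describes — train on a small downloaded portion while additional data loads in the background, with a bounded memory footprint — can be sketched with standard-library tools. This is a minimal illustration only, not code from the commit or the linked notebook; the shard names and the downloader stand-in below are hypothetical:

```python
import queue
import threading

def stream_shards(shard_ids, prefetch=2):
    """Yield training examples while a background thread keeps fetching shards."""
    buf = queue.Queue(maxsize=prefetch)  # bounded: near-constant memory overhead
    SENTINEL = object()

    def downloader():
        for sid in shard_ids:
            # stand-in for downloading one small random portion of the dataset
            shard = [f"example-{sid}-{i}" for i in range(3)]
            buf.put(shard)  # blocks once `prefetch` shards are already waiting
        buf.put(SENTINEL)

    threading.Thread(target=downloader, daemon=True).start()
    while (shard := buf.get()) is not SENTINEL:
        yield from shard  # the trainer consumes examples as they arrive

seen = list(stream_shards(range(4)))
print(len(seen))  # 4 shards x 3 examples each
```

Because the queue is bounded, the downloader pauses whenever the trainer falls behind, so disk and RAM usage stay roughly constant no matter how large the full dataset is — and switching datasets only means pointing the downloader at different shards.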