Mais Alheraki committed · e22146c
1 Parent(s): bf20bad
Update README.md
README.md CHANGED
@@ -22,16 +22,17 @@ pinned: false
   CALM: Collaborative Arabic Language Model
 </p>
 <p class="mb-2">
-The CALM project is joint effort lead by <u><a target="_blank"
+The CALM project is a joint effort led by <u><a target="_blank" href="https://sdaia.gov.sa/ncai/?Lang=en">NCAI</a></u> in collaboration with
 <u><a target="_blank" href="https://yandex.com/">Yandex</a> and <a href="https://huggingface.co/">HuggingFace</a></u> to train an Arabic language model with
 volunteers from around the globe. The project is an adaptation of the framework proposed at the NeurIPS 2021 demonstration:
-<u><a target="_blank"
+<u><a target="_blank" href="https://huggingface.co/training-transformers-together">Training Transformers Together</a></u>.
 </p>
 <p class="mb-2">
 One of the main obstacles facing many researchers in the Arabic NLP community is the lack of computing resources that are needed for training large models. Models with
-leading performane on Arabic NLP tasks, such as <a target="_blank" href="https://github.com/aub-mind/arabert">AraBERT</a>,
-<a href="https://github.com/CAMeL-Lab/CAMeLBERT" target="_blank">CamelBERT</a>,
-<a href="https://huggingface.co/aubmindlab/araelectra-base-generator" target="_blank">AraELECTRA</a>, and
+leading performance on Arabic NLP tasks, such as <u><a target="_blank" href="https://github.com/aub-mind/arabert">AraBERT</a></u>,
+<u><a href="https://github.com/CAMeL-Lab/CAMeLBERT" target="_blank">CamelBERT</a></u>,
+<u><a href="https://huggingface.co/aubmindlab/araelectra-base-generator" target="_blank">AraELECTRA</a></u>, and
+<u><a href="https://huggingface.co/qarib">QARiB</a></u>,
 took days to train on TPUs. In the spirit of democratization of AI and community enabling, a core value at NCAI, CALM aims to demonstrate the effectiveness
 of collaborative training and form a community of volunteers for ANLP researchers with basic-level cloud GPUs who wish to train their own models collaboratively.
 </p>
@@ -40,7 +41,7 @@ pinned: false
 Each volunteer GPU trains the model locally at its own pace on a portion of the dataset while another portion is being streamed in the background to reduce local
 memory consumption. Computing the gradients and aggregating them is performed in a distributed manner, based on the computing abilities of each participating
 volunteer. Details of the distributed training process are further described in the paper
-<a target="_blank"
+<u><a target="_blank" href="https://papers.nips.cc/paper/2021/hash/41a60377ba920919939d83326ebee5a1-Abstract.html">Deep Learning in Open Collaborations</a></u>.
 </p>

 <p class="mb-2" style="font-size:20px;font-weight:bold">
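The paragraph in the hunk above describes the training mechanics: each volunteer streams a portion of the data, trains locally at its own pace, and exchanges gradients with the swarm. Below is a minimal sketch of what one volunteer's setup might look like, assuming the hivemind library that the cited paper builds on and the Hugging Face datasets streaming API; the peer address, run ID, corpus, model, and batch sizes are illustrative placeholders, not the project's actual configuration.

```python
# A hedged sketch of one volunteer's collaborative-training setup.
# Assumes hivemind (https://github.com/learning-at-home/hivemind) and datasets.
import hivemind
import torch
from datasets import load_dataset

# Stream the corpus so only the portion currently in use is held in local memory.
corpus = load_dataset(
    "oscar", "unshuffled_deduplicated_ar",  # placeholder Arabic corpus
    split="train", streaming=True,
)
for example in corpus.take(2):  # smoke test: pull a couple of streamed examples
    print(example["text"][:60])

model = torch.nn.Linear(128, 2)  # stand-in for the actual language model
local_opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Join the volunteer swarm through a distributed hash table of known peers.
dht = hivemind.DHT(
    initial_peers=["/ip4/203.0.113.1/tcp/31337/p2p/PEER_ID"],  # placeholder
    start=True,
)

# Each peer contributes gradients at its own pace; they are averaged across
# the swarm once target_batch_size samples have been processed globally.
opt = hivemind.Optimizer(
    dht=dht,
    run_id="calm-demo",          # hypothetical run name
    optimizer=local_opt,
    batch_size_per_step=4,       # this volunteer's per-step contribution
    target_batch_size=4096,      # global batch that triggers an averaging round
    verbose=True,
)
```

With this setup, `opt.step()` behaves like an ordinary optimizer step locally, and hivemind commits a shared parameter update only once the swarm has jointly accumulated the target batch.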
@@ -52,15 +53,15 @@ pinned: false
 </p>

 <ul class="mb-2">
-<li>👉 Create an account on <a href="https://huggingface.co">Huggingface</a>.</li>
-<li>👉 Join the <a href="https://huggingface.co/CALM">NCAI-CALM Organization</a> on Huggingface through the invitation link shared with you by email.</li>
+<li>👉 Create an account on <u><a target="_blank" href="https://huggingface.co">Huggingface</a></u>.</li>
+<li>👉 Join the <u><a target="_blank" href="https://huggingface.co/CALM">NCAI-CALM Organization</a></u> on Huggingface through the invitation link shared with you by email.</li>
 <li>👉 Get your Access Token; it's required later in the notebook.
 </li>
 </ul>

 <p class="h2 mb-2" style="font-size:18px;font-weight:bold">How to get my Huggingface Access Token</p>
 <ul class="mb-2">
-<li>👉 Go to your <a href="https://huggingface.co">HF account</a>.</li>
+<li>👉 Go to your <u><a href="https://huggingface.co">HF account</a></u>.</li>
 <li>👉 Go to Settings → Access Tokens.</li>
 <li>👉 Generate a new Access Token and enter any name for "what's this token for".</li>
 <li>👉 Select <code>read</code> role.</li>
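For reference, the token produced by the steps above can be checked from Python before launching the notebook. A minimal sketch assuming the huggingface_hub client library; the token string is a placeholder.

```python
from huggingface_hub import login, whoami

# Authenticate with the `read` token generated under Settings → Access Tokens
# (placeholder value below).
login(token="hf_XXXXXXXXXXXXXXXX")

# Sanity-check that the account is visible before joining the training run.
print(whoami()["name"])
```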