Ontocord.AI committed
Commit d8c28d4 · 1 Parent(s): a81ba11

Update README.md
Files changed (1): README.md (+3 −2)
README.md CHANGED

@@ -6,7 +6,7 @@ colorTo: purple
 sdk: static
 pinned: false
 ---
-# Multi-Domain Expert Layers (MDEL) Training:
+# Multi-Domain Expert Learning (MDEL)**:
 ## How to increase knowledge without breaking the bank?
 
 🍩 Ontocord.AI 🍩 and the open source community.
@@ -18,7 +18,7 @@ Bedrock AI, TurkuNLP, ETH, Redmond.AI, Incite, MICS CentraleSupelec, Centro de E
 
 Open sourcing AI models can lead to increased innovation, accessibility, transparency, and community building. However, we need a mechanism to train more capable models in an efficient and modular way.
 
-The proposed method that we call Multi-Domain Expert Layers (MDEL) training for open source language models involves branching from a base model, training each branch independently on a specific domain for specific layers, and merging the trained models at the end. Additionally, the specific layers are kept as experts, with a classifier used as a router to activate the experts during inference. This approach makes it possible to easily increase expertise of a model, to independently train more "adapters", and to reuse previously trained experts and models without retraining, resulting in a modular and efficient system.
+The proposed method, which we call Multi-Domain Expert Learning (MDEL), for open-source language models involves branching from a base model, training each branch independently on a specific domain for specific layers or other adapters, and merging the trained models at the end. Additionally, the specific layers or adapters are kept as experts, with a classifier used as a router to activate the experts during inference. This approach makes it possible to easily increase a model's expertise, to independently train more "adapters", and to reuse previously trained experts and models without retraining, resulting in a modular and efficient system.
 
 In this effort, we seek international labs and open-source-aligned researchers and companies in various countries to each train a set of domain experts of their choosing, thereby enabling international participation and knowledge sharing. This will also result in lower costs for training and a lower environmental impact due to reuse and lower energy usage. Currently, we have volunteers from four continents and are looking for more.
 
@@ -32,3 +32,4 @@ If you are interested in contributing to this project, please reach out to us an
 
 Let's work together to create open-source models that benefit everyone! 🤝 #AI #MDEL #Supercomputers #Summit #OpenSource #Innovation #VolunteersNeeded #OpenScience #DemocratizeAI
 
+** Why did we change the term "Layer" to "Learning"? Because, in addition to layerwise experts, we are also exploring different architectures such as Flamingo (https://arxiv.org/abs/2204.14198) and EMU (https://arxiv.org/abs/2307.05222), which will allow us to swap out different modal experts to improve the performance of the model.
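The routing idea in the MDEL description above (independently trained per-domain expert layers, with a classifier selecting which expert to activate at inference) can be sketched as follows. This is a minimal illustration only: the domain names, hidden size, random weights, and pre-computed classifier logits are all hypothetical stand-ins, not the project's actual architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8
DOMAINS = ["legal", "medical"]  # hypothetical domain names, for illustration only

# One weight matrix per domain stands in for an independently trained
# expert layer; in MDEL these would be merged back from separate branches.
experts = {d: rng.standard_normal((HIDDEN, HIDDEN)) * 0.1 for d in DOMAINS}

def route(hidden, domain_logits):
    """Activate the expert layer whose domain the classifier scores highest."""
    domain = DOMAINS[int(np.argmax(domain_logits))]
    return hidden @ experts[domain], domain

hidden_state = rng.standard_normal(HIDDEN)  # stand-in for a base-model hidden state
output, chosen = route(hidden_state, domain_logits=np.array([0.2, 0.9]))
print(chosen)  # the higher classifier score selects the "medical" expert
```

Because each entry in `experts` is self-contained, a newly trained domain expert can be added to the dictionary without touching the others, which is the modularity the paragraph above describes.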