Ontocord.AI committed
Commit 07c1961 · 1 Parent(s): d8c28d4

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -22,7 +22,7 @@ The proposed method that we call Multi-Domain Expert Learning (MDEL) for open so
 
 In this effort, we seek international labs and open source aligned researchers and companies in various countries to each train a set of domain experts of their choosing, thereby enabling international participation and knowledge sharing. This will also result in lower costs for training and a lower environmental impact due to reuse and lower energy usage. Currently we have volunteers from four continents and are looking for more.
 
-We will be using a variant of the c-BTM (https://arxiv.org/pdf/2303.14177v1.pdf) method and will be focusing on models around 7-20B parameters.
+We will be using a variant of the c-BTM (https://arxiv.org/pdf/2303.14177v1.pdf) method and will be focusing on models ranging from 7B to 70B parameters.
 
 ## In some of our models, we will also be adding multimodal abilities for both understanding and generation with context lengths of 8K-65K tokens.
 
@@ -32,4 +32,4 @@ If you are interested in contributing to this project, please reach out to us an
 
 Let's work together to create open-source models that benefit everyone! 🤝 #AI #MDEL #Supercomputers #Summit #OpenSource #Innovation #VolunteersNeeded #OpenScience #DemocratizeAI
 
-** Why did we change the term Layer to Learning? Because we are exploring, in addition to layerwise experts, different architectures like Flamingo (https://arxiv.org/abs/2204.14198) and EMU (https://arxiv.org/abs/2307.05222), which will allow us to swap out different modal experts to improve the performance of the model.
+** Why did we change the term "Layer" to "Learning"? Because we are exploring, in addition to layerwise experts, different adapters and architectures like Flamingo (https://arxiv.org/abs/2204.14198) and EMU (https://arxiv.org/abs/2307.05222), as well as a novel multi-node architecture for training LoRAs that we call lora-x, which will allow us to swap out different component experts to improve the performance of the model.
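For readers unfamiliar with c-BTM (the method the updated line references), the core inference-time idea is to ensemble per-domain expert LMs, weighting each expert by how strongly the current context belongs to that expert's training cluster. Below is a minimal sketch of that mixing step, assuming cluster centers and per-expert next-token logits are already computed; all names are illustrative, and the softmax-over-negative-distances weighting is one simple choice, not necessarily MDEL's actual variant:

```python
import torch
import torch.nn.functional as F

def cbtm_next_token_logprobs(context_emb, cluster_centers, expert_logits, temperature=0.1):
    """Mix per-domain expert predictions by cluster membership (c-BTM-style sketch).

    context_emb:     (d,)        embedding of the current context.
    cluster_centers: (k, d)      one center per domain expert.
    expert_logits:   (k, vocab)  next-token logits from each expert.
    """
    # Cluster membership: softer weights for experts whose k-means center
    # lies closer to the current context embedding.
    dists = torch.norm(cluster_centers - context_emb, dim=-1)   # (k,)
    weights = F.softmax(-dists / temperature, dim=0)            # (k,)

    # Ensemble in probability space: p(x) = sum_k w_k * p_k(x).
    expert_probs = F.softmax(expert_logits, dim=-1)             # (k, vocab)
    mixed = (weights.unsqueeze(-1) * expert_probs).sum(dim=0)   # (vocab,)
    return mixed.log()

# Toy usage: 4 experts, 16-dim embeddings, 100-token vocabulary.
logp = cbtm_next_token_logprobs(torch.randn(16), torch.randn(4, 16), torch.randn(4, 100))
print(logp.shape)  # torch.Size([100])
```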
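The footnote's point about swappable component experts can be illustrated with off-the-shelf LoRA adapter swapping via the `peft` library. Note that lora-x itself is not described in this commit, so the sketch below shows only generic per-domain adapter hot-swapping, with hypothetical adapter paths:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load a base model and attach one LoRA "domain expert" adapter.
base = AutoModelForCausalLM.from_pretrained("gpt2")  # any base checkpoint
model = PeftModel.from_pretrained(base, "path/to/medical-lora",  # hypothetical path
                                  adapter_name="medical")

# Attach a second domain adapter and switch between experts at will;
# the base weights stay fixed, only the active low-rank deltas change.
model.load_adapter("path/to/legal-lora", adapter_name="legal")  # hypothetical path
model.set_adapter("legal")    # route inference through the legal expert
model.set_adapter("medical")  # swap back to the medical expert
```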