Ontocord.AI committed on
Commit · 07c1961
1 Parent(s): d8c28d4
Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ The proposed method that we call Multi-Domain Expert Learning (MDEL) for open so

In this effort, we seek international labs and open source aligned researchers and companies in various countries to each train a set of domain experts of their choosing, thereby enabling international participation and knowledge sharing. This will also result in lower costs for training and a lower environmental impact due to reuse and lower energy usage. Currently we have volunteers from four continents and are looking for more.

-We will be using a variant of the c-BTM (https://arxiv.org/pdf/2303.14177v1.pdf) method and will be focusing on models
+We will be using a variant of the c-BTM (https://arxiv.org/pdf/2303.14177v1.pdf) method and will be focusing on models ranging from 7-70B parameters.

## In some of our models, we will also be adding multimodal abilities for both understanding and generation with context lengths of 8K-65K tokens.
|
@@ -32,4 +32,4 @@ If you are interested in contributing to this project, please reach out to us an

Let's work together to create open-source models that benefit everyone! 🤝 #AI #MDEL #Supercomputers #Summit #OpenSource #Innovation #VolunteersNeeded #OpenScience #DemocratizeAI

-** Why did we change the term Layer to Learning? Because we are exploring, in addition to layerwise experts, also working with different architectures like Flamingo (https://arxiv.org/abs/2204.14198)
+** Why did we change the term "Layer" to "Learning"? Because we are exploring, in addition to layerwise experts, also working with different adapters and architectures like Flamingo (https://arxiv.org/abs/2204.14198), EMU (https://arxiv.org/abs/2307.05222), and a novel multi-node architecture for training LoRAs that we call lora-x, which will allow us to swap out different component experts to improve the performance of the model.
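
For readers unfamiliar with c-BTM, a minimal sketch of its inference-time ensembling idea follows: each domain expert's next-token distribution is weighted by how close the current context embedding is to that expert's training cluster, keeping only the top-k closest experts. The embedder, cluster centers, and expert outputs below are hypothetical placeholders, not MDEL's actual pipeline.

```python
# Sketch of c-BTM-style sparse ensembling (arXiv:2303.14177); illustrative only.
import numpy as np

def softmax(scores, temperature=0.1):
    z = scores / temperature
    z = z - z.max()               # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cbtm_mixture(context_emb, cluster_centers, expert_probs, top_k=2):
    """Mix each expert's next-token distribution, weighted by how close the
    context embedding is to that expert's cluster center."""
    # Negative squared distance to each center: larger means closer.
    scores = -((cluster_centers - context_emb) ** 2).sum(axis=1)
    weights = softmax(scores)
    # Keep only the top-k experts and renormalize (sparse ensemble).
    keep = np.argsort(weights)[-top_k:]
    sparse = np.zeros_like(weights)
    sparse[keep] = weights[keep]
    sparse /= sparse.sum()
    # Weighted mixture over the experts' next-token distributions.
    return (sparse[:, None] * expert_probs).sum(axis=0)

# Toy usage: 4 experts, 16-dim embeddings, 32-token vocabulary.
rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 16))      # one cluster center per domain expert
ctx = rng.normal(size=16)               # embedding of the current context
probs = rng.dirichlet(np.ones(32), 4)   # each expert's next-token distribution
mix = cbtm_mixture(ctx, centers, probs)
assert abs(mix.sum() - 1.0) < 1e-6      # result is still a valid distribution
```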
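Since lora-x itself is not yet published, the following is only a hedged sketch of the general "swappable component expert" idea the note describes: a frozen shared base layer plus named low-rank (LoRA) adapters that can be switched per domain. All class and method names here are invented for illustration and are not the actual lora-x API.

```python
# Illustrative sketch, not lora-x: named LoRA adapters over a frozen base layer.
import torch
import torch.nn as nn

class SwappableLoRALinear(nn.Module):
    """A frozen base nn.Linear plus named low-rank adapters ("experts")
    that can be swapped at runtime without touching the base weights."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # the shared dense model stays frozen
        self.rank = rank
        self.scale = alpha / rank
        self.lora_a = nn.ParameterDict()     # expert name -> A: [rank, in_features]
        self.lora_b = nn.ParameterDict()     # expert name -> B: [out_features, rank]
        self.active = None

    def add_expert(self, name: str):
        self.lora_a[name] = nn.Parameter(
            torch.randn(self.rank, self.base.in_features) * 0.01)
        self.lora_b[name] = nn.Parameter(
            torch.zeros(self.base.out_features, self.rank))  # zero init: no-op at start

    def set_expert(self, name):
        self.active = name                   # swap in a different domain expert

    def forward(self, x):
        y = self.base(x)
        if self.active is not None:
            a = self.lora_a[self.active]
            b = self.lora_b[self.active]
            y = y + self.scale * (x @ a.T @ b.T)  # low-rank update: delta W = B @ A
        return y

# Toy usage: two domain experts sharing one frozen base layer.
layer = SwappableLoRALinear(nn.Linear(512, 512))
layer.add_expert("legal")
layer.add_expert("medical")
layer.set_expert("legal")                    # route through the "legal" expert
out = layer(torch.randn(2, 512))
```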