Update README.md
README.md
CHANGED
@@ -107,8 +107,7 @@ model-index:
 
 **DeepAutoAI/Explore_Llama-3.1-8B-Inst** is developed by **deepAuto.ai** by learning the distribution of llama-3.1-8B-instruct.
 Our approach leverages the base model’s pretrained weights and optimizes them for the **Winogrande** and **ARC-Challenge** datasets by
-training a latent diffusion model on the pretrained weights. specifically , this model is based on learning the distrinution of
-the last transformer block, 30, and 24th FFN layers of the original Llama model.
+training a latent diffusion model on the pretrained weights. specifically , this model is based on learning the distrinution of transformer layers from 16 to 31.
 
 Through this process, we learn the distribution of the base model's weight space, enabling us to explore optimal configurations.
 We then sample multiple sets of weights, using the **model-soup averaging technique** to identify the best-performing weights for both datasets.
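The README text in this diff mentions combining sampled weight sets with model-soup averaging but does not say which variant is used. As a minimal sketch, assuming the simplest variant (a uniform soup) and a hypothetical flat layout where each parameter name maps to a list of floats, the averaging step might look like this. The name `uniform_soup` and the `Weights` type are illustrative and do not come from the repository.

```python
from typing import Dict, List

# Hypothetical layout for illustration: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def uniform_soup(candidates: List[Weights]) -> Weights:
    """Average several sampled weight sets parameter by parameter."""
    if not candidates:
        raise ValueError("need at least one candidate")
    reference = candidates[0]
    for cand in candidates:
        # Every candidate must describe the same parameters with the same sizes.
        if cand.keys() != reference.keys():
            raise ValueError("all candidates must share the same parameter names")
        for name in reference:
            if len(cand[name]) != len(reference[name]):
                raise ValueError(f"size mismatch for parameter {name!r}")
    n = len(candidates)
    soup: Weights = {}
    for name in reference:
        # zip(*...) groups the i-th value of this parameter across all candidates.
        columns = zip(*(cand[name] for cand in candidates))
        soup[name] = [sum(col) / n for col in columns]
    return soup
```

In practice the sampled weights would be framework tensors rather than Python lists, and a greedy soup (keeping a candidate only if it improves a held-out score) is another common choice; the diff does not indicate which one the authors used.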