Update README.md
README.md
CHANGED
@@ -1,5 +1,13 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
|
5 |
# Welcome to a new generation of Minecraft
|
@@ -21,4 +29,29 @@ Andy 3.5 is designed to be used with MindCraft
|
|
21 |
|
22 |
## How was the model trained?
|
23 |
|
24 |
-
The
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
+
datasets:
|
4 |
+
- Sweaterdog/Andy-3.5
|
5 |
+
language:
|
6 |
+
- en
|
7 |
+
base_model:
|
8 |
+
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
|
9 |
+
tags:
|
10 |
+
- Minecraft
|
11 |
---
|
12 |
|
13 |
# Welcome to a new generation of Minecraft
|
|
|
29 |
|
30 |
## How was the model trained?
|
31 |
|
32 |
+
The model was trained on a dataset of ~10,000 messages taken directly from MindCraft, which helps keep the data high quality.
|
33 |
+
|
34 |
+
## What are the capabilities and limitations?
|
35 |
+
|
36 |
+
The smaller model *(the preview version, at least)* had 1/3 of its parameters tuned. The larger preview model has not been released yet.
|
37 |
+
Andy-3.5 was trained on EVERYTHING regarding Minecraft and MindCraft, so it knows how to use commands natively without a system prompt.
|
38 |
+
Andy-3.5 also knows how to build and how to use !newAction to perform commands. It was trained on many building examples, as well as on using !newAction for tasks like manually making something or strip mining.
|
39 |
+
|
40 |
+
**Know that this is a PREVIEW model; it is NOT finished!**
|
41 |
+
|
42 |
+
## Why a preview model?
|
43 |
+
|
44 |
+
Andy-3.5-preview was made to test the intelligence of a Minecraft AI with the current dataset. It was meant to show how training is progressing and which areas need work in the future.
|
45 |
+
DO NOT expect this model to do everything perfectly. It only knows as much as the dataset taught it, plus whatever the other 2/3 of untouched parameters allow.
|
46 |
+
The model *may* show bugs, such as not saying your name, confusing previous messages, or other small issues.
|
47 |
+
|
48 |
+
## Important notes and considerations
|
49 |
+
|
50 |
+
The preview model of Andy-3.5-mini *(Andy-3.5-mini-preview)* was trained with a context length of 4096 tokens. This was meant to speed up training and reduce VRAM usage.
|
51 |
+
|
52 |
+
The base model of Andy-3.5-mini-preview was a distilled version of DeepSeek-R1, which is itself a tuned version of Qwen-2.5-1.5b.
|
53 |
+
|
54 |
+
Since a context window of 4096 is not nearly enough for MindCraft, you can go higher. Qwen-2.5-1.5b was trained on a context length of 64,000 and the distilled version of DeepSeek-R1 on a length of 128,000, but the usable length for Andy-3.5-mini-preview may be closer to 32,000 tokens.
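
As a rough illustration of the context-length advice above, here is a minimal Python sketch that clamps a requested context window to the range this README mentions. The constant names and the `pick_context_length` helper are hypothetical and are not part of MindCraft or any model runner. The 4096 and 32,000 values come from the text above, and treating 4096 as a lower bound is an assumption made only for this example.

```python
# Hypothetical helper, not part of MindCraft: clamps a requested context size.
TRAINED_CONTEXT = 4096             # context length used to train the preview model
SUGGESTED_USABLE_CONTEXT = 32_000  # rough usable ceiling suggested in this README


def pick_context_length(requested: int) -> int:
    """Return the requested context length, clamped to [4096, 32000]."""
    return max(TRAINED_CONTEXT, min(requested, SUGGESTED_USABLE_CONTEXT))


if __name__ == "__main__":
    for requested in (2048, 16_000, 128_000):
        print(requested, "->", pick_context_length(requested))
```

The returned value can then be passed to whatever context-size setting your model runner exposes.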
|
55 |
+
|
56 |
+
|
57 |
+
When the full versions of Andy-3.5 and Andy-3.5-preview are released, both will be trained with a context length of 128,000 so that they work properly during play.
|