Sweaterdog/Andy-3.5

Sweaterdog committed on Jan 21

Commit 03f1688 · verified · 1 Parent(s): abbdd5c

Update README.md

Files changed (1)

README.md +34 -1

@@ -1,5 +1,13 @@
 ---
 license: apache-2.0
 ---
 # Welcome to a new generation of Minecraft
@@ -21,4 +29,29 @@ Andy 3.5 is designed to be used with MindCraft
 ## How was model trained?
-The dataset was made from

 ---
 license: apache-2.0
+datasets:
+- Sweaterdog/Andy-3.5
+language:
+- en
+base_model:
+- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
+tags:
+- Minecraft
 ---
 # Welcome to a new generation of Minecraft
 ## How was model trained?
+The model was trained on a dataset of ~10,000 messages taken directly from MindCraft, which helps keep the data quality high.
+## What are the capabilities and limitations?
+The smaller model *(the preview version, at least)* had 1/3 of its parameters tuned. The larger preview model has not been released yet.
+Andy-3.5 was trained on EVERYTHING regarding Minecraft and MindCraft, so it knows how to use commands natively without a system prompt.
+Andy-3.5 also knows how to build and how to use !newAction to perform commands. It was trained on lots of building examples, as well as on using !newAction for tasks like manually making something or strip mining.
+**Know that this is a PREVIEW model. It is NOT finished!**
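
As a rough illustration of how a client might pick a command such as !newAction out of a model reply, here is a minimal Python sketch. The `!name(args)` shape, the regular expression, and the `extract_command` helper are assumptions made for illustration; they are not taken from MindCraft's own parser.

```python
import re

# Assumption: a command looks like "!name", optionally followed by a
# parenthesised argument list, e.g. !newAction("dig a strip mine").
COMMAND_PATTERN = re.compile(r"!(\w+)(?:\(([^)]*)\))?")


def extract_command(reply: str):
    """Return (name, raw_args) for the first command in reply, or None."""
    match = COMMAND_PATTERN.search(reply)
    if match is None:
        return None
    name, raw_args = match.group(1), match.group(2)
    # raw_args is None when the command has no parentheses.
    return name, (raw_args or "")
```

The pattern is deliberately loose: it would also match an exclamation mark glued to an ordinary word, so a real client would need stricter rules.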
+## Why a preview model?
+Andy-3.5-preview was made to test the intelligence of a Minecraft AI with the current dataset. It was meant to show how training is progressing and which areas need work in the future.
+DO NOT expect this model to do everything perfectly. It only knows as much as the dataset taught it, plus whatever the other 2/3 of untouched parameters allow.
+The model *may* experience bugs, such as not saying your name, getting previous messages confused, or other small things.
+## Important notes and considerations
+The preview model of Andy-3.5-mini *(Andy-3.5-mini-preview)* was trained on a context length of 4096. This was meant to speed up training and reduce VRAM usage.
+The base model of Andy-3.5-mini-preview was DeepSeek-R1-Distill-Qwen-1.5B, a distilled version of DeepSeek-R1 that was itself fine-tuned from Qwen-2.5-1.5b.
+A context window of 4096 is not nearly enough for MindCraft, so you can go higher. Qwen-2.5-1.5b was trained on a context length of 64,000 and the distilled version of DeepSeek-R1 on a length of 128,000, but the usable length for Andy-3.5-mini-preview may be closer to 32,000 tokens.
+When the full versions of Andy-3.5 and Andy-3.5-preview are released, they will both be trained on a context length of 128,000 to ensure proper usage during play.
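
To make the context-length note concrete, here is a minimal Python sketch of trimming chat history so that it fits a token budget such as the roughly 32,000 tokens mentioned above. The `estimate_tokens` and `trim_history` helpers and the 4-characters-per-token rule are assumptions for illustration only; a real setup would count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


def trim_history(messages: list[str], budget: int = 32_000) -> list[str]:
    """Keep the most recent messages whose estimated token total fits the budget."""
    kept = []
    used = 0
    # Walk from newest to oldest so recent context is kept first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest messages first is only one possible policy; summarizing older turns is another option this sketch does not cover.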