add Deepseek coder

Browse files

Files changed (1) hide show

README.md +6 -2

README.md CHANGED Viewed

@@ -25,13 +25,17 @@ tags:
 1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
 2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
-Available models:
 - Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
 - Mistral-7B-Instruct-v0.1 with function calling ([Base Model](https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/cN2cNybSdgyncV25kQ)
-- Llama-13B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/9AQ7te3lHdmbdZ68wz)
 - CodeLlama-34B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/cN27teg8t2Hx5sA8wM)
 - Llama-70B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-70b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-70b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/8wMdRC1dzci75sA4gy)
 ## Performance and Tips
 1. Larger models are better at handling function calling. The cross entropy training losses are approximately 0.5 for 7B, 0.4 for 13B, 0.3 for 70B. The absolute numbers don't mean anything but the relative values offer a sense of relative performance.

 1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
 2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
+Most Popular Models:
 - Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
 - Mistral-7B-Instruct-v0.1 with function calling ([Base Model](https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/cN2cNybSdgyncV25kQ)
+- Deepseek-Coder-6.7B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-6.7b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-6.7b-instruct-function-calling-adapters-v2/settings)) - Paid, [purchase here](https://buy.stripe.com/cN27te5tPa9Z6wEdRo)
+- Deepseek-Coder-33B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-33b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-33b-instruct-function-calling-adapters-v2/settings)) - Paid, [purchase here](https://buy.stripe.com/9AQ6pabSd81RcV25kT)
 - CodeLlama-34B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/cN27teg8t2Hx5sA8wM)
 - Llama-70B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-70b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-70b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/8wMdRC1dzci75sA4gy)
+- Other Models:
+- - Llama-13B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/9AQ7te3lHdmbdZ68wz)
 ## Performance and Tips
 1. Larger models are better at handling function calling. The cross entropy training losses are approximately 0.5 for 7B, 0.4 for 13B, 0.3 for 70B. The absolute numbers don't mean anything but the relative values offer a sense of relative performance.