Ollama/llama.cpp support
#10 opened by dpreti
We observed that, at the moment, the model is not served correctly with Ollama and llama.cpp. We are currently investigating the reasons behind this unexpected behavior.
In the meantime, we strongly suggest serving the model with vLLM or the Transformers library, as shown in the model card.
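For reference, below is a minimal sketch of loading the model with the Transformers library. The repository ID Almawave/Velvet-14B and the use of the tokenizer's chat template are assumptions based on this repository, so please refer to the model card for the exact, recommended snippet.

```python
# Minimal sketch (assumptions: repository ID "Almawave/Velvet-14B" and a chat template
# defined in the tokenizer; see the model card for the authoritative example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Almawave/Velvet-14B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 14B weights fit on a single large GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Qual è la capitale d'Italia?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```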
The Velvet-14B model has now been released on the Ollama library. It can be used in its q4_K_M quantized version with the command:
ollama run Almawave/velvet:14b
Other versions, including q8_0 and fp16, are available and can be found here:
https://ollama.com/Almawave/Velvet
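For completeness, once one of these builds has been pulled, it can also be queried programmatically through the official Ollama Python client. This is only a sketch, assuming the q4_K_M tag shown above and that the ollama package (pip install ollama) is installed.

```python
# Sketch only: query the locally served model via the Ollama Python client.
# Assumes the model was already pulled with `ollama run Almawave/velvet:14b`.
import ollama

response = ollama.chat(
    model="Almawave/velvet:14b",
    messages=[{"role": "user", "content": "Qual è la capitale d'Italia?"}],
)
print(response["message"]["content"])
```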
dpreti changed discussion status to closed