macadeliccc/Mistral-7B-v0.2-OpenHermes

macadeliccc committed on Mar 26, 2024

Commit 3ee2ade · verified · 1 Parent(s): a852008

Update README.md

Files changed (1)

README.md +1 -0

@@ -45,6 +45,7 @@ python -m vllm.entrypoints.openai.api_server \
 --model teknium/OpenHermes-2.5-Mistral-7B \
 --gpu-memory-utilization 0.9 \ # can go as low as 0.83-0.85 if you need a little more gpu for your application
 --max-model-len 16000 # 32000 if you can run it. This works on 4090
+--chat-template ./examples/template_chatml.jinja
 ```
 ## Gradio chatbot interface for your endpoint
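
Note on the hunk above: in the lines shown, comments follow the trailing `\` characters, and the `--max-model-len` line has no continuation at all. A backslash only continues a line when it is the last character on that line, so a shell would not pass the newly added `--chat-template` flag to the server as written. The following is a minimal sketch of one way to lay the command out, not the author's version. The variable names `GPU_UTIL` and `MAX_LEN` are introduced here for illustration only; the model, values, and template path are copied from the diff.

```shell
# Comments live above the command, because a trailing "\" must be the
# last character on its line to act as a continuation.
# GPU_UTIL: the README notes 0.83-0.85 also works if the application needs GPU headroom.
# MAX_LEN: the README notes 32000 if the hardware allows; 16000 works on a 4090.
# (Both variable names are illustrative, not part of the README.)
GPU_UTIL=0.9
MAX_LEN=16000

# Collect the full command line as positional parameters.
set -- python -m vllm.entrypoints.openai.api_server \
  --model teknium/OpenHermes-2.5-Mistral-7B \
  --gpu-memory-utilization "$GPU_UTIL" \
  --max-model-len "$MAX_LEN" \
  --chat-template ./examples/template_chatml.jinja

# Dry run: print the command. Replace this line with "$@" to launch the server.
echo "$@"
```

Printing first makes it easy to confirm that all four flags end up on a single command line before starting the server.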