prompt_template: |
  <|im_start|>user
  {prompt}<|im_end|>
  <|im_start|>assistant
quantized_by: mgoin
tags:
- deepsparse
---

# OpenHermes 2.5 Mistral 7B - DeepSparse

This repo contains model files for [Teknium's OpenHermes 2.5 Mistral 7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B), optimized for [DeepSparse](https://github.com/neuralmagic/deepsparse), a sparsity-aware CPU inference runtime.

This model was quantized and pruned with [SparseGPT](https://arxiv.org/abs/2301.00774), using [SparseML](https://github.com/neuralmagic/sparseml).

## Inference

Install [DeepSparse LLM](https://github.com/neuralmagic/deepsparse):

```
pip install deepsparse-nightly[llm]
```
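The `prompt_template` in the front matter above is ChatML: user turns are wrapped in `<|im_start|>`/`<|im_end|>` markers before being handed to the pipeline. A minimal sketch of filling that template with plain Python (the example question is illustrative, not from this card):

```python
# Fill the ChatML prompt template from the front matter above.
# The question below is an illustrative placeholder.
prompt = "How do I make banana bread?"

formatted_prompt = (
    "<|im_start|>user\n"
    f"{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

print(formatted_prompt)
```

Ending the string after `<|im_start|>assistant` leaves the model to generate the assistant turn.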

Run in a [Python pipeline](https://github.com/neuralmagic/deepsparse/blob/main/docs/llms/text-generation-pipeline.md):

```python
from deepsparse import TextGeneration
system_message = ""