---
tags:
- deepsparse
---

# zephyr-7b-alpha for DeepSparse

## Usage

`pip install deepsparse-nightly[llm]`

```python
from deepsparse import TextGeneration

# Load the DeepSparse export of zephyr-7b-alpha from the Hugging Face Hub
model = TextGeneration(model="hf:mgoin/zephyr-7b-alpha-ds")

# Continue the prompt with up to 100 new tokens
out = model("Once upon a time,", max_new_tokens=100)
print(out.generations[0].text)

### there was a young woman named Lily. She was a kind and gentle soul, with a heart full of love and compassion. Lily had always been fascinated by the natural world, and she spent most of her free time exploring the forests and fields around her home.\n\nOne day, as she was wandering through the woods, Lily stumbled upon a small clearing. In the center of the clearing, she saw a beautiful butterfly fluttering its wings. The butterfly was unlike any she had
```

## How to export from [zephyr-7b-alpha](https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha)

Install SparseML from this [PR](https://github.com/neuralmagic/sparseml/pull/1766), then run:

```
# Download the original model
git clone https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha

# Export to ONNX; the output is written to deployment/
sparseml.transformers.export_onnx --model_path ./zephyr-7b-alpha --task text-generation --sequence_length 512 --trust_remote_code

# Keep a copy of the unmodified export
cp deployment/model.onnx deployment/model-orig.onnx

# Inject the KV cache into the exported ONNX graph
python ~/onnx_kv_inject.py --input-file deployment/model-orig.onnx --output-file deployment/model.onnx
```
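
If you would rather drive the export from Python, the same four steps can be sketched as argument lists passed to `subprocess`. This is only a sketch: the helper names `build_export_commands` and `run_export` and the `dry_run` flag are hypothetical, and it assumes SparseML is installed from the PR above and that `onnx_kv_inject.py` sits in your home directory, as in the commands above.

```python
import os
import shlex
import subprocess
from typing import List

MODEL_URL = "https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha"


def build_export_commands(
    model_dir: str = "./zephyr-7b-alpha",
    sequence_length: int = 512,
    kv_inject_script: str = "~/onnx_kv_inject.py",  # assumed location, as in the commands above
    deployment_dir: str = "deployment",
) -> List[List[str]]:
    """Return the export steps as argument lists (hypothetical helper)."""
    final = f"{deployment_dir}/model.onnx"
    original = f"{deployment_dir}/model-orig.onnx"
    return [
        # 1. Download the original model into model_dir
        ["git", "clone", MODEL_URL, model_dir],
        # 2. Export to ONNX with the same flags as the shell command
        [
            "sparseml.transformers.export_onnx",
            "--model_path", model_dir,
            "--task", "text-generation",
            "--sequence_length", str(sequence_length),
            "--trust_remote_code",
        ],
        # 3. Keep a copy of the unmodified export
        ["cp", final, original],
        # 4. Run the KV-cache injection script on the copy
        [
            "python", os.path.expanduser(kv_inject_script),
            "--input-file", original,
            "--output-file", final,
        ],
    ]


def run_export(dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for cmd in build_export_commands():
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_export(dry_run=True)
```

With `dry_run=True` the script only prints the commands, so you can check them before starting the multi-gigabyte download and export.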