---
tags:
- deepsparse
---
## Usage

```python
from deepsparse import TextGeneration

# Wrap the user prompt in the chat template this example uses
prompt = "How to get in a good university?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

# Load the model from its Hugging Face stub and generate up to 200 new tokens
model = TextGeneration(model_path="hf:nm-testing/TinyLlama-1.1B-Chat-v0.4-pruned60-quant")
print(model(formatted_prompt, max_new_tokens=200).generations[0].text)

"""
There are several factors to consider when choosing a university:

1. Location: The university should be located in a region with a high number of students. This will ensure that there are enough students to ensure that there are enough professors.
2. Tuition: The tuition of the university should be low. This will ensure that students have enough money to attend the university.
3. Academic: The university should have a good academic program. This will ensure that students have knowledge of the subject.
4. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
5. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
6. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
"""
```
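The prompt wrapping used above can be factored into a small helper so that every prompt gets the same template. This is only a sketch: `format_chat_prompt` is a hypothetical name, not part of DeepSparse, and it assumes the single-turn `<|im_start|>` / `<|im_end|>` layout shown in the example.

```python
def format_chat_prompt(prompt: str) -> str:
    """Wrap one user message in the chat template from the example above.

    Hypothetical helper for illustration; not a DeepSparse API.
    """
    return (
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # Prints the same string as `formatted_prompt` in the example above
    print(format_chat_prompt("How to get in a good university?"))
```

The returned string can be passed to the pipeline in place of `formatted_prompt`.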

## With Repetition Penalty

```python
from deepsparse import TextGeneration

# Sampling settings; a repetition_penalty above 1.0 discourages repeated tokens
generation_config = {
    "repetition_penalty": 1.1,
    "do_sample": True,
    "max_new_tokens": 500,
}

prompt = "How to get in a good university?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

model = TextGeneration(model="hf:nm-testing/TinyLlama-1.1B-Chat-v0.4-pruned60-quant")
print(model(formatted_prompt, generation_config=generation_config).generations[0].text)

"""
The university is one of the best options for students.
It provides the right atmosphere for studying.
The
"""
```
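To give a sense of what a `repetition_penalty` of 1.1 does, here is a minimal sketch of the common formulation of the penalty: the logit of every token already generated is divided by the penalty when positive and multiplied by it when negative. The function name `apply_repetition_penalty` is hypothetical, and whether DeepSparse applies exactly this rule internally is an assumption.

```python
from typing import List, Sequence


def apply_repetition_penalty(
    logits: Sequence[float],
    generated_ids: Sequence[int],
    penalty: float = 1.1,
) -> List[float]:
    """Return a copy of `logits` with already-generated tokens penalized.

    Illustrative only; assumes the common divide/multiply formulation.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # For penalty > 1, dividing a positive score or multiplying a
        # negative one both make the token less likely to be picked again.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


if __name__ == "__main__":
    # Tokens 0 and 1 were already generated; token 2 is left untouched.
    print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
    # → [1.0, -2.0, 0.5]
```

A penalty of 1.0 leaves the logits unchanged, so values slightly above 1.0, such as the 1.1 used above, nudge the model away from loops like the repeated list items in the first example.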

## One-shot and Export

```bash
git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
python sparseml/src/sparseml/transformers/sparsification/obcq/obcq.py TinyLlama/TinyLlama-1.1B-Chat-v0.4 open_platypus --recipe recipe.yaml --save True
python sparseml/src/sparseml/transformers/sparsification/obcq/export.py --task text-generation --model_path obcq_deployment
cp deployment/model.onnx deployment/model-orig.onnx
python onnx_kv_inject.py --input-file deployment/model-orig.onnx --output-file deployment/model.onnx
```