nm-testing/open-llama-3b-v2-instruct-pruned50-quant-ds

Text Generation

mwitiderrick committed on Jan 10, 2024

Commit cc4c8b3 · 1 Parent(s): e857200

Update README.md

Files changed (1)

README.md +8 -6

@@ -3,9 +3,9 @@ base_model: mediocredev/open-llama-3b-v2-instruct
 inference: false
 model_type: llama
 prompt_template: |
-  Q:
   {prompt}
-  \nA
 quantized_by: mwitiderrick
 tags:
 - deepsparse
@@ -24,8 +24,10 @@ Run in a [Python pipeline](https://github.com/neuralmagic/deepsparse/blob/main/d
 ```python
 from deepsparse import TextGeneration
-prompt = 'Q: What is the largest animal?\nA:'
-formatted_prompt =  f"Q: {prompt}\nA:"
 model = TextGeneration(model_path="hf:nm-testing/open_llama_3b-pruned-quant-50")
 print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
@@ -37,9 +39,9 @@ print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
 ## Prompt template
 ```
-Q:
 {prompt}
-\nA:
 ```
 ## Sparsification

 inference: false
 model_type: llama
 prompt_template: |
+  ### User:\n
   {prompt}
+  ### Assistant:\n
 quantized_by: mwitiderrick
 tags:
 - deepsparse
 ```python
 from deepsparse import TextGeneration
+system_message = 'You are a helpful assistant, who always provide explanation.'
+user_message = 'How many days are there in a leap year?'
+formatted_prompt = f'### System:\n{system_message}<|endoftext|>\n### User:\n{user_message}<|endoftext|>\n### Assistant:\n'
 model = TextGeneration(model_path="hf:nm-testing/open_llama_3b-pruned-quant-50")
 print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
 ## Prompt template
 ```
+### User:\n
 {prompt}
+### Assistant:\n
 ```
 ## Sparsification
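
The new prompt format introduced by this commit can be assembled with a small helper instead of a hand-written f-string. The following is a minimal sketch, not part of the commit: the names `build_prompt` and `END_OF_TEXT` are hypothetical, and leaving out the `### System:` block when no system message is given is an assumption based on the shorter template shown in the README's "Prompt template" section.

```python
from typing import Optional

# End-of-turn marker used in the updated README example.
END_OF_TEXT = "<|endoftext|>"


def build_prompt(user_message: str, system_message: Optional[str] = None) -> str:
    """Assemble a prompt in the '### System / ### User / ### Assistant' layout.

    Hypothetical helper: it reproduces the f-string from the updated README
    when both messages are given, and omits the system block otherwise
    (an assumption, since the README does not cover that case).
    """
    parts = []
    if system_message is not None:
        parts.append(f"### System:\n{system_message}{END_OF_TEXT}\n")
    parts.append(f"### User:\n{user_message}{END_OF_TEXT}\n")
    # The prompt ends with the assistant header so the model continues from it.
    parts.append("### Assistant:\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        "How many days are there in a leap year?",
        "You are a helpful assistant, who always provide explanation.",
    )
    print(prompt)
```

The returned string can be passed as `formatted_prompt` to the `TextGeneration` call shown in the README snippet above.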