ferran-espuna committed
Commit 7c490b6 · verified · 1 Parent(s): 08da042

Update README.md

Files changed (1)
  1. README.md +32 -1
README.md CHANGED
@@ -63,7 +63,38 @@ This model card corresponds to the fp8-quantized version of Salamandra-7b-instru
  The entire Salamandra family is released under a permissive [Apache 2.0 license]((https://www.apache.org/licenses/LICENSE-2.0)).
 
 
- ## Additional information
+ ## How to Use
+
+ The following example code works under ``Python 3.9.16``, ``vllm==0.6.3.post1``, ``torch==2.4.0`` and ``torchvision==0.19.0``, though it should run on
+ any current version of the libraries. This is an example of a conversational chatbot using the model:
+
+ ```
+ from vllm import LLM, SamplingParams
+
+ model_name = "BSC-LT/salamandra-7b-instruct-fp8"
+ llm = LLM(model=model_name)
+
+ messages = []
+
+ while True:
+     user_input = input("user >> ")
+     if user_input.lower() == "exit":
+         print("Chat ended.")
+         break
+
+     messages.append({'role': 'user', 'content': user_input})
+
+     outputs = llm.chat(messages,
+                        sampling_params=SamplingParams(
+                            temperature=0.5,
+                            stop_token_ids=[5],
+                            max_tokens=200)
+                        )[0].outputs
+
+     model_output = outputs[0].text
+     print(f'assistant >> {model_output}')
+     messages.append({'role': 'assistant', 'content': model_output})
+ ```
 
  ### Author
  International Business Machines (IBM).
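
For reference (not part of the commit), below is a minimal non-interactive sketch of the same vLLM usage, e.g. for smoke-testing the fp8 checkpoint after this change. It assumes the environment listed in the added section; the model name and sampling settings are taken from the diff above, while the prompt string and the tokenizer sanity check are illustrative additions.

```
from vllm import LLM, SamplingParams

# Model name and sampling settings mirror the example added in this commit.
model_name = "BSC-LT/salamandra-7b-instruct-fp8"
llm = LLM(model=model_name)

# Optional sanity check: inspect which token the hardcoded stop id 5 maps to.
print(llm.get_tokenizer().convert_ids_to_tokens(5))

sampling_params = SamplingParams(temperature=0.5, stop_token_ids=[5], max_tokens=200)

# Single-turn chat call instead of the interactive loop; the prompt is illustrative.
outputs = llm.chat(
    [{"role": "user", "content": "Briefly introduce yourself."}],
    sampling_params=sampling_params,
)
print(outputs[0].outputs[0].text)
```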