Sarath Shekkizhar committed: Update app.py
app.py CHANGED

@@ -15,7 +15,7 @@ DESCRIPTION = """
 Llama3-TenyxChat-70B is part of the TenyxChat series, models trained to function as useful assistants.
 The model is obtained via direct preference tuning using Tenyx's fine-tuning technology. Model details available at our model page.
 
-**The model is currently loaded in
+**The model is currently loaded in 8-bit**.
 """
 
 
@@ -29,8 +29,8 @@ if not torch.cuda.is_available():
 
 if torch.cuda.is_available():
     model_id = "tenyx/Llama3-TenyxChat-70B"
-    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)
-
+    # model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)
+    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)
     tokenizer = AutoTokenizer.from_pretrained(model_id)
     tokenizer.use_default_system_prompt = False
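For context (an explanatory note, not part of the commit): load_in_8bit=True asks transformers to quantize the model weights to int8 through bitsandbytes at load time, roughly halving GPU memory use compared with the previous torch_dtype=torch.bfloat16 load. Below is a minimal sketch of the same load written against the BitsAndBytesConfig / quantization_config API, assuming transformers, accelerate, and bitsandbytes are installed; the config-object form is an equivalent spelling, not what the commit itself uses.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tenyx/Llama3-TenyxChat-70B"

if torch.cuda.is_available():
    # 8-bit weight quantization via bitsandbytes (same effect as load_in_8bit=True above)
    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",                 # shard layers across the available GPUs
        quantization_config=quant_config,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.use_default_system_prompt = False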