g-ronimo commited on
Commit
50fa39a
·
1 Parent(s): d0ebde6

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +53 -0
README.md ADDED
@@ -0,0 +1,53 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ meta-llama/Llama-2-7b trained on ~350 episodes of the Lex Fridman podcast (Lex=Assistant), QLoRA, ChatML
6
+
7
+ ```
8
+ from transformers import AutoModelForCausalLM, AutoTokenizer
9
+ import torch
10
+
11
+ model_path="models/llama-friedman"
12
+
13
+ model = AutoModelForCausalLM.from_pretrained(
14
+ model_path,
15
+ torch_dtype=torch.bfloat16,
16
+ ).to("cuda")
17
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, legacy=False) # fast tokenizer
18
+
19
+ # sampling parameters: llama-precise
20
+ gen_config = {
21
+ "temperature": 0.7,
22
+ "top_p": 0.1,
23
+ "repetition_penalty": 1.18,
24
+ "top_k": 40,
25
+ "do_sample": True,
26
+ "max_new_tokens": 300,
27
+ }
28
+
29
+ messages = [
30
+ {"role": "user", "content": "Good morning, I am Mark Zuckerberg"},
31
+ {"role": "assistant", "content": "The founder of Meta"},
32
+ {"role": "user", "content": "Yes exactly! And the future of AI"}
33
+ ]
34
+
35
+ prompt=tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
36
+ prompt_tokenized=tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
37
+
38
+ output_ids = model.generate(**prompt_tokenized, **gen_config)
39
+ response=tokenizer.decode(output_ids[0])
40
+
41
+ print(response)
42
+ ```
43
+
44
+ ```
45
+ <|im_start|>user
46
+ Good morning, I am Mark Zuckerberg<|im_end|>
47
+ <|im_start|>assistant
48
+ The founder of Meta<|im_end|>
49
+ <|im_start|>user
50
+ Yes exactly! And the future of AI<|im_end|>
51
+ <|im_start|>assistant
52
+ Today we are here to talk about the metaverse. What is it? How do you see it evolving in the next decades? Let's start with some basics. What is the metaverse?<|im_end|>
53
+ ```