Commit 740a5c3 by sky-2002 · verified · 1 Parent(s): 215c58f

update readme with example

Files changed (1)
  1. README.md +16 -6
README.md CHANGED
@@ -14,18 +14,28 @@ licence: license
 
  # Model Card for tiny-starcoder-ft
 
- This model is a fine-tuned version of [bigcode/tiny_starcoder_py](https://huggingface.co/bigcode/tiny_starcoder_py).
+ This model is a fine-tuned version of [bigcode/tiny_starcoder_py](https://huggingface.co/bigcode/tiny_starcoder_py) using samples from the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset.
  It has been trained using [TRL](https://github.com/huggingface/trl).
 
  ## Quick start
 
  ```python
- from transformers import pipeline
+ model_name = "sky-2002/tiny-starcoder-ft"
+ model = AutoModelForCausalLM.from_pretrained(
+     pretrained_model_name_or_path=model_name
+ ).to(device)
+ tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name)
 
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="sky-2002/tiny-starcoder-ft", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
+ prompt = "Write python code to calculate sum of a list"
+
+ # Format with template
+ messages = [{"role": "user", "content": prompt}]
+ formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False)
+
+ inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)
+
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
 
  ## Training procedure
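
Note on the new Quick start snippet: as committed, it uses `AutoModelForCausalLM`, `AutoTokenizer`, and `device` without importing or defining them, so it will not run when copied on its own. A minimal self-contained sketch follows. The helper names `pick_device` and `generate` are hypothetical, and the device selection (GPU if PyTorch reports one, otherwise CPU) is an assumption, since the commit does not show how `device` is set. It also assumes the tokenizer ships a chat template, as the committed snippet implies.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so this helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def generate(prompt: str,
             model_name: str = "sky-2002/tiny-starcoder-ft",
             max_new_tokens: int = 100) -> str:
    """Run the Quick start steps from the diff and return the decoded text."""
    # Requires `transformers` and `torch`; the checkpoint is downloaded on first use.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = pick_device()  # assumption: the commit leaves `device` undefined
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

    # Format the prompt with the tokenizer's chat template, as in the diff.
    messages = [{"role": "user", "content": prompt}]
    formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False)

    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example call (downloads the model, so it is left commented out):
# print(generate("Write python code to calculate sum of a list"))
```

Wrapping the steps in a function keeps the model download out of import time, so the snippet can be pasted into a script and called only when needed.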