---
license: apache-2.0
datasets:
- BEE-spoke-data/pypi_clean-deduped
source_model: BEE-spoke-data/smol_llama-101M-GQA
language:
- en
tags:
- python
- codegen
- markdown
- smol_llama
metrics:
- accuracy
inference:
  parameters:
    max_new_tokens: 64
    min_new_tokens: 8
    num_beams: 4
    early_stopping: true
    no_repeat_ngram_size: 7
    repetition_penalty: 1.05
    renormalize_logits: true
widget:
- text: |
    def add_numbers(a, b):
        return
  example_title: Add Numbers Function
- text: |
    class Car:
        def __init__(self, make, model):
            self.make = make
            self.model = model

        def display_car(self):
  example_title: Car Class
- text: |
    import pandas as pd
    data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
    df = pd.DataFrame(data).convert_dtypes()  # eda
  example_title: Pandas DataFrame
- text: |
    def factorial(n):
        if n == 0:
            return 1
        else:
  example_title: Factorial Function
- text: |
    def fibonacci(n):
        if n <= 0:
            raise ValueError("Incorrect input")
        elif n == 1:
            return 0
        elif n == 2:
            return 1
        else:
  example_title: Fibonacci Function
- text: |
    import matplotlib.pyplot as plt
    import numpy as np
    x = np.linspace(0, 10, 100)
    # simple plot
  example_title: Matplotlib Plot
- text: |
    def reverse_string(s: str) -> str:
        return
  example_title: Reverse String Function
- text: |
    def is_palindrome(word: str) -> bool:
        return
  example_title: Palindrome Function
- text: |
    def bubble_sort(lst: list):
        n = len(lst)
        for i in range(n):
            for j in range(0, n - i - 1):
  example_title: Bubble Sort Function
- text: |
    def binary_search(arr, low, high, x):
        if high >= low:
            mid = (high + low) // 2
            if arr[mid] == x:
                return mid
            elif arr[mid] > x:
  example_title: Binary Search Function
---

# smol_llama-101M-GQA: python

> These are some quick notes; more details will be added over the next few days.

This is the general pre-trained checkpoint `BEE-spoke-data/smol_llama-101M-GQA`, trained further on a deduped version of `pypi` for one epoch.

- It has the same architecture as the base model; the only change is the addition of (_and training on_) new Python-related tokens.
- The model appears capable of generating basic Python code and README-style markdown.
- This experiment tests how well a model of this size handles code generation, probing **both** its capabilities and its limitations.

Please use with caution & understand that there may still be some bugs 🐛 to work out.

## Usage

Please consider the following points before using the model:

1. The model was trained exclusively with the "slow" llama2 tokenizer. Set `use_fast=False` when loading the tokenizer.
2. Use `transformers==4.33.3`; a known problem in `4.34.1` can prevent the model from loading.

Note: it is unclear which tokenizer the hosted API widget uses, so widget outputs may contain additional whitespace.

Here's how to install the necessary packages and load the model:

```python
# pip install transformers==4.33.3 accelerate sentencepiece

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    use_fast=False,  # required: the model was trained with the slow tokenizer
)
model = AutoModelForCausalLM.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    device_map="auto",
)

# use as any other decoder
```

For code generation, beam search or similar deterministic decoding methods are recommended over sampling. A more detailed example is provided in the section below.

### longer code-gen example

Below is a quick script that can be used as a reference/starting point for writing your own, better one :)
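
One possible sketch of such a script follows. The generation settings mirror the inference parameters in the front matter above; the helper name and the example prompt are illustrative, not part of the original card.

```python
# pip install transformers==4.33.3 accelerate sentencepiece
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "BEE-spoke-data/smol_llama-101M-GQA-python"

# Mirrors the inference parameters in the card's front matter.
GEN_KWARGS = {
    "max_new_tokens": 64,
    "min_new_tokens": 8,
    "num_beams": 4,
    "early_stopping": True,
    "no_repeat_ngram_size": 7,
    "repetition_penalty": 1.05,
    "renormalize_logits": True,
}


def complete(prompt: str, model, tokenizer) -> str:
    """Complete a code prompt with beam-search decoding."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint on first run.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    prompt = "def add_numbers(a, b):\n    return"
    print(complete(prompt, model, tokenizer))
```

Keeping the generation kwargs in one dict makes it easy to experiment with decoding settings (e.g., more beams) without touching the rest of the script.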
+