Commit bc258f9 · Parent(s): 727081c
Delete architectures

- architectures/codegen.md +0 -25
- architectures/codeparrot.md +0 -33
- architectures/incoder.md +0 -31
- architectures/intro.md +0 -8
- architectures/polycoder.md +0 -14
architectures/codegen.md
DELETED
@@ -1,25 +0,0 @@
The CodeGen architecture follows a standard transformer decoder with left-to-right causal masking, rotary position embeddings for the positional encoding [(Su et al., 2021)](https://arxiv.org/abs/2104.09864), and a context length of 2048. CodeGen models are trained in various sizes.

<div align="center">

|Model | # parameters |
| - | - |
| [Salesforce/codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono) | 350M |
| [Salesforce/codegen-2B-mono](https://huggingface.co/Salesforce/codegen-2B-mono) | 2.7B |
| [Salesforce/codegen-6B-mono](https://huggingface.co/Salesforce/codegen-6B-mono) | 6.1B |
| [Salesforce/codegen-16B-mono](https://huggingface.co/Salesforce/codegen-16B-mono) | 16.1B |

</div>

You can load the model and tokenizer directly from 🤗 [`transformers`](https://huggingface.co/docs/transformers/index):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('Salesforce/codegen-16B-mono')
model = AutoModelForCausalLM.from_pretrained('Salesforce/codegen-16B-mono')

# A single forward pass over the prompt (returns logits, not generated text)
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
```
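The forward pass above only returns logits for the prompt tokens; to actually produce a completion you would typically call `generate`. Below is a minimal sketch, assuming the smaller `Salesforce/codegen-350M-mono` checkpoint to keep memory modest; the `max_new_tokens` and sampling values are illustrative choices, not settings recommended by the CodeGen authors:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Smaller checkpoint chosen only to keep this sketch lightweight
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

inputs = tokenizer("def hello_world():", return_tensors="pt")

# generate() performs autoregressive decoding; the values below are
# illustrative, not tuned recommendations
outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.2,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```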
architectures/codeparrot.md
DELETED
@@ -1,33 +0,0 @@
[CodeParrot](https://huggingface.co/codeparrot/codeparrot) uses the GPT-2 architecture with a BPE tokenizer trained on Python code from the training split of the data, and a context length of 1024. The model was released as an educational tool for training large language models from scratch on code, with detailed tutorials and descriptions of the training process. It makes use of 🤗 [`accelerate`](https://huggingface.co/docs/accelerate/index) for distributed training and mixed precision. See this [blog](https://huggingface.co/blog/codeparrot) and [repo](https://github.com/huggingface/transformers/tree/main/examples/research_projects/codeparrot) for more details.

<div align="center">

|Model | # parameters |
| - | - |
| [codeparrot-small](https://huggingface.co/codeparrot/codeparrot-small) | 110M |
| [codeparrot](https://huggingface.co/codeparrot/codeparrot) | 1.5B |

</div>

You can load the model and tokenizer directly from 🤗 [`transformers`](https://huggingface.co/docs/transformers/index):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM is the current idiomatic class for decoder-only LMs
# (AutoModelWithLMHead is deprecated)
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot")

# Forward pass over the prompt (returns logits)
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
```

You can also use `pipeline` to generate code:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="codeparrot/codeparrot")
outputs = pipe("def hello_world():")
```
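Generation keyword arguments passed to the pipeline are forwarded to `generate`, so you can control output length and sampling. A small sketch follows; the `codeparrot-small` checkpoint is used here only to keep the example light, and the sampling values are illustrative rather than settings recommended by the CodeParrot authors:

```python
from transformers import pipeline

# Smaller checkpoint chosen only to keep this sketch lightweight
pipe = pipeline("text-generation", model="codeparrot/codeparrot-small")

# Generation kwargs are forwarded to model.generate(); the values below
# are illustrative, not tuned recommendations
outputs = pipe(
    "def hello_world():",
    max_new_tokens=48,
    do_sample=True,
    temperature=0.2,
    num_return_sequences=2,
)
for out in outputs:
    print(out["generated_text"])
```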
architectures/incoder.md
DELETED
@@ -1,31 +0,0 @@
[InCoder](https://huggingface.co/facebook/incoder-6B) uses a decoder-only Transformer trained with a Causal Masking objective, so that a left-to-right language model also learns to fill in masked token segments; it has a context length of 2048.

<div align="center">

|Model | # parameters |
| - | - |
| [facebook/incoder-1B](https://huggingface.co/facebook/incoder-1B) | 1.3B |
| [facebook/incoder-6B](https://huggingface.co/facebook/incoder-6B) | 6.7B |

</div>

The [Causal Masking objective](https://arxiv.org/abs/2201.07520) is a hybrid of causal and masked language modeling: "it combines the benefit of per-token generation with optional bi-directionality specifically tailored to prompting".
During the training of InCoder, spans of code were randomly masked and moved to the end of each file, which allows for bidirectional context. The figure below from the InCoder [paper](https://arxiv.org/pdf/2204.05999.pdf) illustrates the training process.

<p align="center">
<img src="https://huggingface.co/datasets/loubnabnl/repo-images/raw/main/incoder.png" alt="drawing" width="750"/>
</p>

So in addition to program synthesis (via left-to-right generation), InCoder can also perform editing (via infilling). The model gives promising results on some zero-shot code infilling tasks such as type prediction, variable renaming, and comment generation.

You can load the model and tokenizer directly from 🤗 [`transformers`](https://huggingface.co/docs/transformers/index):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM is the current idiomatic class for decoder-only LMs
# (AutoModelWithLMHead is deprecated)
tokenizer = AutoTokenizer.from_pretrained("facebook/incoder-6B")
model = AutoModelForCausalLM.from_pretrained("facebook/incoder-6B")

# Forward pass over the prompt (returns logits)
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
```
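To sketch the infilling use mentioned above: InCoder marks a masked span with a sentinel token and then generates the missing code after a final sentinel, stopping at an end-of-mask token. The exact token strings `<|mask:0|>` and `<|endofmask|>` and the prompt layout below are assumptions taken from the authors' example code and model card, not part of a generic 🤗 `transformers` API, so treat this as an illustrative sketch only:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Smaller checkpoint chosen only to keep this sketch lightweight
tokenizer = AutoTokenizer.from_pretrained("facebook/incoder-1B")
model = AutoModelForCausalLM.from_pretrained("facebook/incoder-1B")

# Code with a hole: we want the model to write the docstring between
# prefix and suffix. The sentinel strings are assumptions based on the
# InCoder authors' example code.
prefix = 'def count_words(filename):\n    """'
suffix = '"""\n    with open(filename) as f:\n        return len(f.read().split())\n'

# Replace the span to infill with a sentinel, then append the same sentinel
# so the model generates the masked span at the end of the document
prompt = prefix + "<|mask:0|>" + suffix + "<|mask:0|>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Decode only the newly generated tokens and cut at the end-of-mask marker
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
infill = tokenizer.decode(new_tokens).split("<|endofmask|>")[0]
print(prefix + infill + suffix)
```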
architectures/intro.md
DELETED
@@ -1,8 +0,0 @@
Various architectures are used in code generation models, but most of them use the auto-regressive, left-to-right setting, such as GPT. However, InCoder used a decoder-only Transformer with a Causal Masking objective that combines next-token prediction with bidirectional context through masking. AlphaCode used an encoder-decoder architecture.

<p align="center">
<img src="https://huggingface.co/datasets/loubnabnl/repo-images/resolve/main/model_size.png" alt="drawing" width="440"/>
</p>

For model-specific information about each architecture, please select a model below:
architectures/polycoder.md
DELETED
@@ -1,14 +0,0 @@
[PolyCoder](https://github.com/VHellendoorn/Code-LMs) uses the GPT-2 architecture, with a BPE tokenizer trained on a random 5% subset of the data (all languages), and a context length of 2048. To study the effect of scaling model size, the model was trained in 3 different sizes.

<div align="center">

|Model | # parameters |
| - | - |
| GPT2 | 160M |
| GPT2 | 400M |
| GPT2 | 2.7B |

</div>

PolyCoder is currently being integrated in 🤗 `transformers`. Meanwhile it can be loaded following the instructions in the original GitHub [repo](https://github.com/vhellendoorn/code-lms#models).