mayank-mishra committed
Commit c7d6e2f · verified · 1 Parent(s): b18c03f

Update README.md

Files changed (1)
  1. README.md +147 -3
README.md CHANGED
@@ -1,3 +1,147 @@
- ---
- license: apache-2.0
- ---
+ ---
+ pipeline_tag: text-generation
+ inference: false
+ license: apache-2.0
+ library_name: transformers
+ model-index:
+ - name: ibm-granite/granite-8B-Code-instruct-128k
+   results:
+   - task:
+       type: text-generation
+     dataset:
+       type: lm-eval-harness
+       name: ARC
+     metrics:
+     - name: accuracy-norm
+       type: accuracy-norm
+       value: 57.2
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: bigcode/humanevalpack
+       name: HumanEvalSynthesis (Average)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 51.4
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: bigcode/humanevalpack
+       name: HumanEvalExplain (Average)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 38.9
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: bigcode/humanevalpack
+       name: HumanEvalFix (Average)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 38.3
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: repoqa
+       name: RepoQA (Python@16K)
+     metrics:
+     - name: pass@1 (thresh=0.5)
+       type: pass@1 (thresh=0.5)
+       value: 73.0
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: repoqa
+       name: RepoQA (C++@16K)
+     metrics:
+     - name: pass@1 (thresh=0.5)
+       type: pass@1 (thresh=0.5)
+       value: 37.0
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: repoqa
+       name: RepoQA (Java@16K)
+     metrics:
+     - name: pass@1 (thresh=0.5)
+       type: pass@1 (thresh=0.5)
+       value: 73.0
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: repoqa
+       name: RepoQA (TypeScript@16K)
+     metrics:
+     - name: pass@1 (thresh=0.5)
+       type: pass@1 (thresh=0.5)
+       value: 62.0
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: repoqa
+       name: RepoQA (Rust@16K)
+     metrics:
+     - name: pass@1 (thresh=0.5)
+       type: pass@1 (thresh=0.5)
+       value: 63.0
+       verified: false
+ ---
+
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)
+
+ # Granite-8B-Code-Instruct-128K
+
+ ## Model Summary
+ **Granite-8B-Code-Instruct-128K** is an 8B parameter long-context instruct model fine-tuned from *Granite-8B-Code-Base-128K* on a combination of **permissively licensed** data used in training the original Granite code instruct models, together with synthetically generated code instruction datasets tailored for solving long-context problems. By exposing the model to both short- and long-context data, we aim to enhance its long-context capability without sacrificing code generation performance on short inputs.
+
+ - **Developers:** IBM Research
+ - **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
+ - **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2407.13739)
+ - **Release Date:** July 18th, 2024
+ - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ## Usage
+ ### Intended use
+ The model is designed to respond to coding-related instructions over long-context inputs and can be used to build coding assistants.
+
+ ### Generation
+ This is a simple example of how to use the **Granite-8B-Code-Instruct-128K** model.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ device = "cuda"  # or "cpu"
+ model_path = "ibm-granite/granite-8B-Code-instruct-128k"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+ # change input text as desired
+ chat = [
+     { "role": "user", "content": "Write a code to find the maximum value in a list of numbers." },
+ ]
+ chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ # tokenize the text
+ input_tokens = tokenizer(chat, return_tensors="pt")
+ # transfer tokenized inputs to the device
+ for i in input_tokens:
+     input_tokens[i] = input_tokens[i].to(device)
+ # generate output tokens
+ output = model.generate(**input_tokens, max_new_tokens=100)
+ # decode output tokens into text
+ output = tokenizer.batch_decode(output)
+ # loop over the batch to print, in this example the batch size is 1
+ for i in output:
+     print(i)
+ ```
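+
+ Because the model supports input contexts of up to 128K tokens, much longer prompts can be passed the same way. Below is a minimal sketch that reuses the `model` and `tokenizer` loaded above to ask a question about an entire source file; the file path and the question are hypothetical placeholders.
+
+ ```python
+ # long-context sketch: ask about a whole source file
+ # (reuses model, tokenizer, and device from the example above)
+ # NOTE: "my_project/utils.py" is a hypothetical placeholder path
+ with open("my_project/utils.py") as f:
+     source = f.read()
+ chat = [
+     { "role": "user", "content": f"Here is a Python file:\n\n{source}\n\nExplain what this file does." },
+ ]
+ prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ # tokenize and move inputs to the same device as the model
+ input_tokens = tokenizer(prompt, return_tensors="pt").to(device)
+ output = model.generate(**input_tokens, max_new_tokens=200)
+ # decode, dropping special tokens from the printed text
+ print(tokenizer.batch_decode(output, skip_special_tokens=True)[0])
+ ```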