Update README.md
README.md CHANGED
@@ -1,3 +1,147 @@
---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
model-index:
- name: ibm/PowerLM-3b
  results:
  - task:
      type: text-generation
    dataset:
      type: lm-eval-harness
      name: ARC
    metrics:
    - name: accuracy-norm
      type: accuracy-norm
      value: 57.2
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 51.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 38.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 38.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Python@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 73.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (C++@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 37.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Java@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 73.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (TypeScript@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 62.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Rust@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 63.0
      verified: false
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)

# Granite-8B-Code-Instruct-128K

## Model Summary
**Granite-8B-Code-Instruct-128K** is an 8B parameter long-context instruct model fine-tuned from *Granite-8B-Code-Base-128K* on a combination of **permissively licensed** data used in training the original Granite code instruct models, in addition to synthetically generated code instruction datasets tailored for solving long-context problems. By exposing the model to both short and long context data, we aim to enhance its long-context capability without sacrificing code generation performance at short input context.
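
As a purely illustrative sketch of what such a short/long data mix could look like, the snippet below interleaves two instruction datasets with the Hugging Face `datasets` library. The file names and the 0.7/0.3 mixing ratio are assumptions for illustration only, not the published training recipe:

```python
from datasets import interleave_datasets, load_dataset

# Hypothetical placeholder files -- the actual instruction datasets are not named in this card.
short_ctx = load_dataset("json", data_files="short_context_instructions.jsonl", split="train")
long_ctx = load_dataset("json", data_files="long_context_instructions.jsonl", split="train")

# Interleave so training batches contain both short- and long-context samples.
# The 0.7/0.3 ratio is an illustrative assumption, not the published recipe.
mixed = interleave_datasets([short_ctx, long_ctx], probabilities=[0.7, 0.3], seed=42)
print(mixed)
```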

- **Developers:** IBM Research
- **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
- **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2407.13739)
- **Release Date**: July 18th, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Usage
### Intended use

### Generation
This is a simple example of how to use the **Granite-8B-Code-Instruct-128K** model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # or "cpu"
model_path = "ibm-granite/granite-8B-Code-instruct-128k"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    {"role": "user", "content": "Write a code to find the maximum value in a list of numbers."},
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")
# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)
```
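
The decoded output above still contains the chat-template special tokens. A small follow-up sketch, using only standard `transformers` tokenizer and generation calls, strips them and takes advantage of the 128K context window by placing a whole source file in the user turn. The file path and `max_new_tokens` value are illustrative assumptions:

```python
# Decode without chat-template special tokens for cleaner output.
text = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
print(text)

# The 128K context window also allows long inputs, e.g. asking about a large source file.
# "my_module.py" is a placeholder path used purely for illustration.
with open("my_module.py") as f:
    source = f.read()
long_chat = [
    {"role": "user", "content": f"Explain what the following code does:\n\n{source}"},
]
long_prompt = tokenizer.apply_chat_template(long_chat, tokenize=False, add_generation_prompt=True)
long_inputs = tokenizer(long_prompt, return_tensors="pt").to(device)
long_output = model.generate(**long_inputs, max_new_tokens=256)
# Print only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(long_output[0][long_inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```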