---
license: mit
train: false
inference: false
pipeline_tag: text-generation
---

*aanaphi2-v0.1* is a fine-tuned (SFT + DPO) chat model based on <a href="https://huggingface.co/microsoft/phi-2">Microsoft's Phi-2 base model</a> (2.8B parameters).

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/636b945ef575d3705149e982/pIeboaaroFY5fpomUADrS.gif)
## Performance

| Benchmark           | phi-2        | aanaphi2-v0.1 |
|---------------------|--------------|---------------|
| ARC (25-shot)       | 61.09        | <b>63.74</b>  |
| HellaSwag (10-shot) | 75.11        | <b>78.30</b>  |
| MMLU (5-shot)       | <b>58.11</b> | 57.70         |
| TruthfulQA-MC2      | 44.47        | <b>51.56</b>  |
| Winogrande (5-shot) | <b>74.35</b> | 73.40         |
| GSM8K (5-shot)      | 54.81        | <b>58.61</b>  |
| Average             | 61.33        | <b>63.89</b>  |
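
These are leaderboard-style few-shot benchmarks; the exact evaluation setup is not documented here. As an illustration only, a run with EleutherAI's `lm-evaluation-harness` along the following lines should produce comparable numbers for a single task. The harness choice and every setting below are assumptions, not the authors' recipe:

```Python
# Hypothetical reproduction sketch (assumed harness and settings, NOT the authors' exact setup).
# Requires: pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mobiuslabsgmbh/aanaphi2-v0.1,dtype=float16",
    tasks=["arc_challenge"],   # ARC, matching the 25-shot row above
    num_fewshot=25,
)
print(results["results"])
```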
## Installation

Make sure you have the latest version of the transformers library:

```
pip install pip --upgrade && pip install transformers --upgrade
```
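
If you want to verify that the installed build is recent enough, a quick check from Python:

```Python
# Sanity check: Phi-2 is supported natively in recent transformers
# releases (roughly v4.37 and later), without trust_remote_code.
import transformers
print(transformers.__version__)
```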
## Basic Usage

```Python
# Load model
import transformers, torch

# GPU runtime
device = 'cuda'
compute_dtype = torch.float16

## CPU runtime
# device = 'cpu'
# compute_dtype = torch.float32

# cache_path=None uses the default Hugging Face cache directory
cache_path = None
model_id = "mobiuslabsgmbh/aanaphi2-v0.1"
model = transformers.AutoModelForCausalLM.from_pretrained(model_id,
                                                          torch_dtype=compute_dtype,
                                                          cache_dir=cache_path,
                                                          device_map=device)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, cache_dir=cache_path)
model.eval()

# Set the prompt format the model was fine-tuned with
instruction_template = "### Human: "
response_template = "### Assistant: "
def prompt_format(prompt):
    return instruction_template + prompt + '\n' + response_template

@torch.no_grad()
def generate(prompt, max_length=1024):
    prompt_chat = prompt_format(prompt)
    inputs = tokenizer(prompt_chat, return_tensors="pt", return_attention_mask=True).to(device)
    outputs = model.generate(**inputs, max_length=max_length, eos_token_id=tokenizer.eos_token_id)
    # Drop the final EOS token before decoding
    text = tokenizer.batch_decode(outputs[:, :-1])[0]
    return text

# Generate
print(generate('If A+B=C and B=C, what would be the value of A?'))
```
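
For interactive use, you can stream tokens to the console as they are produced instead of waiting for the full completion. This is a minimal sketch using transformers' built-in `TextStreamer`; it reuses the `model`, `tokenizer`, `device`, and `prompt_format` defined above, and the example prompt is only an illustration:

```Python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    inputs = tokenizer(prompt_format('Explain gravity in one sentence.'),
                       return_tensors="pt", return_attention_mask=True).to(device)
    model.generate(**inputs, max_length=1024, eos_token_id=tokenizer.eos_token_id, streamer=streamer)
```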