phi-2-universal-NER

This model is a fine-tuned version of microsoft/phi-2 on the Universal-NER/Pile-NER-type dataset.

Model description

This model demonstrates the power of small language models: phi-2 can be fine-tuned on the free tier of Google Colab with a simple workflow. A full fine-tune does not fit in the free Colab resources, so PEFT was used to train an adapter instead.
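
As a rough illustration of that setup, the sketch below loads phi-2 in 4-bit and attaches a LoRA adapter with PEFT. This is an outline under assumptions, not the exact notebook code: the quantization settings, LoRA rank/alpha, and target_modules names may differ from what was actually used and depend on the phi-2 implementation in your transformers version.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit so it fits in free-Colab GPU memory (assumption).
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Attach a small LoRA adapter; only the adapter weights are trained.
# target_modules names are assumptions and can vary between phi-2 revisions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model's parameters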

Intended uses & limitations

This model is fine-tuned from Phi-2 on the UniversalNER dataset.

The Phi-2 license has changed to MIT, but the UniversalNER dataset is still released under a research-only license, so this model may be used for research purposes only.

Training and evaluation data

Fine-tuning was run for just 5 epochs on the Universal-NER/Pile-NER-type dataset.
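
For reference, the training conversations can be pulled straight from the Hub. The split and column names below follow the usual Pile-NER-type layout and should be treated as assumptions.

from datasets import load_dataset

# Multi-turn NER conversations in ShareGPT style ("from"/"value" pairs) -- assumed layout.
dataset = load_dataset("Universal-NER/Pile-NER-type", split="train")
print(dataset[0]["conversations"][:2])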

Training procedure notebook

https://github.com/mit1280/fined-tuning/blob/main/phi_2_fine_tune_using_PEFT%2Binference.ipynb

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • training_steps: 1000
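
A minimal sketch of these settings as transformers TrainingArguments is shown below; the output directory, precision flag, and anything not listed above are assumptions rather than values taken from the notebook.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-2-universal-NER",    # assumed; not specified above
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    max_steps=1000,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the optimizer defaults.
    fp16=True,                           # assumption: mixed precision on a Colab GPU
)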

Inference Code

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria

# Load the phi-2 base model, then apply the fine-tuned NER adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "Mit1208/phi-2-universal-NER", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Mit1208/phi-2-universal-NER", trust_remote_code=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A short ChatML-style conversation; the final turn asks the NER question.
conversations = [
    {"from": "human", "value": "Text: Mit Patel here from India"},
    {"from": "gpt", "value": "I've read this text."},
    {"from": "human", "value": "what is a name of the person in the text?"},
]
# Apply the chat template and append the opening of the assistant's next turn.
inference_text = tokenizer.apply_chat_template(conversations, tokenize=False) + '<|im_start|>gpt:\n'
inputs = tokenizer(inference_text, return_tensors="pt", return_attention_mask=False).to(device)

# Stop generation as soon as the model emits the <|im_end|> end-of-turn marker.
class EosListStoppingCriteria(StoppingCriteria):
    def __init__(self, eos_sequence=tokenizer.encode("<|im_end|>")):
        self.eos_sequence = eos_sequence

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Compare the most recently generated tokens against the end-of-turn sequence.
        last_ids = input_ids[:, -len(self.eos_sequence):].tolist()
        return self.eos_sequence in last_ids

outputs = model.generate(**inputs, max_length=512, pad_token_id=tokenizer.eos_token_id,
                         stopping_criteria=[EosListStoppingCriteria()])

text = tokenizer.batch_decode(outputs)[0]

print(text)

# Output
'''
<|im_start|>human
Text: Mit Patel here from India<|im_end|>
<|im_start|>gpt
I've read this text.<|im_end|>
<|im_start|>human
what is a name of the person in the text?<|im_end|>
<|im_start|>gpt:
["Mit Patel"]<|im_end|>
'''
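
The generated text includes the full prompt. If only the model's answer is needed, a small post-processing step (an addition for illustration, not part of the original notebook) strips the chat scaffolding:

# Keep only the text after the final assistant prompt and before its end-of-turn marker.
answer = text.split('<|im_start|>gpt:\n')[-1].split('<|im_end|>')[0].strip()
print(answer)  # ["Mit Patel"]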

Framework versions

  • PEFT 0.7.1
  • Transformers 4.36.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0