metadata
license: apache-2.0
datasets:
- wikitext
- ptb_text_only
language:
- en
metrics:
- perplexity
pipeline_tag: text-generation
model-index:
- name: distilgpt2
  results:
  - task:
      type: text-generation
    dataset:
      name: penn_treebank
      type: ptb_text_only
    metrics:
    - name: perplexity@BASELINE
      type: dmx-perlexity
      value: 63.45857238769531
    - name: perplexity@FALLBACK
      type: dmx-perlexity
      value: 64.36720275878906
  - task:
      type: text-generation
    dataset:
      name: wikitext2
      type: wikitext-2-raw-v1
    metrics:
    - name: perplexity@BASELINE
      type: dmx-perlexity
      value: 46.05925369262695
    - name: perplexity@FALLBACK
      type: dmx-perlexity
      value: 46.570838928222656
This is a d-Matrix functional reference of the GPT2 model family, of the following revisions:
The reference provides the following functional configurations:
| Configuration | Explanation |
|---|---|
| `BASELINE` | a reference functionally equivalent to the original model |
| `BASIC` | all linear algebraic operands quantized to `BFP16-64`, and all other operations transformed to approximated kernel simulations |
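
To give a feel for what quantizing operands to a block floating point format involves, here is a minimal, generic sketch. It is not the d-Matrix implementation: the source does not spell out the bit layout of `BFP16-64`, so the block size of 64 is only assumed to be what the `-64` suffix refers to, and `mantissa_bits=8` is a placeholder chosen for illustration. The function name `bfp_quantize` is hypothetical.

```python
import math


def bfp_quantize(values, block_size=64, mantissa_bits=8):
    """Round a flat list of floats block by block, one shared exponent per block.

    Generic block-floating-point illustration only. block_size and
    mantissa_bits are assumptions, not the documented BFP16-64 layout.
    """
    q_max = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa magnitude
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        if max_abs == 0.0:
            # An all-zero block needs no exponent; keep it as zeros.
            out.extend(0.0 for _ in block)
            continue
        # frexp gives the exponent e such that max_abs < 2**e.
        shared_exp = math.frexp(max_abs)[1]
        # Step size between representable values inside this block.
        step = 2.0 ** (shared_exp - (mantissa_bits - 1))
        for v in block:
            # Round to the nearest integer mantissa, then clamp to range.
            mantissa = max(-q_max, min(q_max, round(v / step)))
            out.append(mantissa * step)
    return out


if __name__ == "__main__":
    print(bfp_quantize([1.0, 0.5, 0.003, -0.75]))
```

Because every element in a block shares one exponent, values much smaller than the block maximum lose precision or round to zero, which is the kind of effect the non-baseline configurations are meant to let you measure.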
Usage
Prerequisites:
- Install d-Matrix ML Tools: `pip install dmx-mltools`
- Clone this repo, then `cd` to the cloned repo.
>>> import os
>>> from mltools import dmx
>>> from transformers import pipeline
>>> import evaluate
>>> from datasets import load_dataset
>>> # Load the chosen model revision in the chosen functional configuration
>>> pipe = pipeline(
...     "text-generation",
...     model="d-matrix/gpt2",
...     revision="gpt2-xl",
...     dmx_config="BASELINE",
...     use_auth_token=os.environ.get("HUGGING_FACE_HUB_TOKEN"),
...     trust_remote_code=True,
...     # device_map="auto",  # enabling model parallel on multi-GPU nodes
... )
>>> # Wrap the loaded model in a dmx.Model
>>> pipe.model = dmx.Model(
...     pipe.model, monkey_patched=False, hf=True, input_names=["input_ids", "labels"]
... )
>>> # Evaluate perplexity on the Penn Treebank test split
>>> perplexity = evaluate.load("d-matrix/dmx_perplexity", module_type="metric")
>>> input_texts = load_dataset("ptb_text_only", "penn_treebank", split="test")["sentence"]
>>> results = perplexity.compute(model=pipe.model.body, references=input_texts)
>>> print(results)
{'loss': 3.0451672077178955, 'perplexity': 21.0135440826416}
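
The two numbers in the printed result are related: perplexity is the exponential of the mean negative log-likelihood. The sketch below checks that against the output above, assuming the reported `loss` is a natural-log mean token loss (the printed values are consistent with that). It also computes the relative perplexity gap between the `BASELINE` and `FALLBACK` figures listed in the metadata. The helper names `perplexity_from_loss` and `relative_increase` are hypothetical and not part of any d-Matrix API.

```python
import math


def perplexity_from_loss(mean_nll):
    """Perplexity is exp of the mean negative log-likelihood (natural log assumed)."""
    return math.exp(mean_nll)


def relative_increase(baseline, other):
    """Fractional increase of `other` over `baseline`."""
    return (other - baseline) / baseline


# Result printed by the usage example above.
reported = {"loss": 3.0451672077178955, "perplexity": 21.0135440826416}
recomputed = perplexity_from_loss(reported["loss"])
print(f"recomputed perplexity: {recomputed:.4f}")

# Perplexities taken from the model-index metadata of this card.
metadata = {
    "penn_treebank": {"BASELINE": 63.45857238769531, "FALLBACK": 64.36720275878906},
    "wikitext2": {"BASELINE": 46.05925369262695, "FALLBACK": 46.570838928222656},
}
for dataset, ppl in metadata.items():
    gap = relative_increase(ppl["BASELINE"], ppl["FALLBACK"])
    print(f"{dataset}: FALLBACK is {gap:.2%} above BASELINE")
```

A lower perplexity is better, so the gap gives a quick read on how much a non-baseline configuration costs relative to the reference.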