
Quantization made by Richard Erkhov.

Github

Discord

Request more models

gemma-2b-zephyr-dpo - AWQ
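
For reference, a minimal loading sketch: AWQ checkpoints generally load straight through transformers once the autoawq package is installed. The repo id below is a placeholder for this repository's Hugging Face id, and the prompt and generation settings are illustrative.

```python
# Minimal sketch for loading AWQ-quantized weights via transformers
# (requires the `autoawq` package and a CUDA GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-repo-id>"  # placeholder: substitute this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # AWQ kernels compute in fp16
    device_map="auto",
)

prompt = "Explain DPO in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```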

Original model description:

```yaml
license: other
tags:
- alignment-handbook
- trl
- dpo
- generated_from_trainer
datasets:
- argilla/dpo-mix-7k
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
base_model: Columbia-NLP/gemma-2b-zephyr-sft
model-index:
- name: gemma-2b-zephyr-dpo
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 52.22
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 73.11
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 42.55
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 42.64
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.4
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 19.94
      name: accuracy
```

Model Card for Gemma 2B Zephyr DPO

We trained google/gemma-2b with DPO on data from argilla/dpo-mix-7k. We carefully selected the hyper-parameters to achieve the best DPO performance.
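
For illustration only, a minimal sketch of such a DPO stage with trl's DPOTrainer, starting from the SFT checkpoint and argilla/dpo-mix-7k named above. The beta, learning rate, and other settings below are placeholders, not the hyper-parameters used for this model, and argument names vary across trl versions.

```python
# Illustrative DPO recipe sketch with trl; hyper-parameters are placeholders,
# not the settings used to train gemma-2b-zephyr-dpo.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Columbia-NLP/gemma-2b-zephyr-sft"  # SFT checkpoint named on this card
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Preference pairs; column handling may need adapting to your trl version.
dataset = load_dataset("argilla/dpo-mix-7k", split="train")

config = DPOConfig(
    output_dir="gemma-2b-zephyr-dpo",
    beta=0.1,                       # placeholder preference-strength coefficient
    learning_rate=5e-7,             # placeholder
    per_device_train_batch_size=2,  # placeholder
    num_train_epochs=1,             # placeholder
)

trainer = DPOTrainer(
    model=model,                 # a frozen reference copy is created internally
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer` in older trl releases
)
trainer.train()
```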

Model description

  • Model type: A 2.5B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets (a minimal usage sketch follows this list).
  • Language(s) (NLP): Primarily English
  • License: Gemma Terms of Use
  • Finetuned from model: google/gemma-2b
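
The usage sketch below runs the unquantized checkpoint and relies on whatever chat template ships with the tokenizer rather than hand-rolling the prompt format; the sampling settings are illustrative.

```python
# Minimal chat-inference sketch; uses the chat template bundled with the
# tokenizer, so the prompt format does not need to be hard-coded.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Columbia-NLP/gemma-2b-zephyr-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a haiku about preference tuning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids, max_new_tokens=128, do_sample=True, temperature=0.7  # illustrative
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```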

License

This model has the same license as the original Gemma model collection.

OpenLLM Leaderboard Performance

| Model | Avg. | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8k |
|---|---|---|---|---|---|---|---|
| google/gemma-2b | 46.37 | 48.38 | 71.77 | 41.77 | 33.08 | 66.77 | 16.91 |
| google/gemma-2b-it | 42.75 | 43.94 | 62.70 | 37.65 | 45.82 | 60.93 | 5.46 |
| wandb/gemma-2b-zephyr-sft | 47.18 | 49.74 | 72.38 | 41.37 | 34.42 | 66.93 | 18.27 |
| wandb/gemma-2b-zephyr-dpo | 46.92 | 49.66 | 72.23 | 41.13 | 34.47 | 66.54 | 17.51 |
| Columbia-NLP/gemma-2b-zephyr-sft | 48.75 | 51.80 | 72.63 | 42.20 | 41.96 | 63.85 | 20.09 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 49.14 | 52.22 | 73.11 | 42.55 | 42.64 | 64.40 | 19.94 |

MT-Bench

We evaluate our model with GPT-4-0125-preview as the judge.

| Model | Total | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
|---|---|---|---|---|---|---|---|---|---|
| google/gemma-2b-it | 4.71 | 2.95 | 4.35 | 6.15 | 2.90 | 3.50 | 5.60 | 5.50 | 6.70 |
| wandb/gemma-2b-zephyr-sft | 4.03 | 3.10 | 3.15 | 5.00 | 2.70 | 2.65 | 5.10 | 4.80 | 5.75 |
| wandb/gemma-2b-zephyr-dpo | 4.06 | 2.80 | 2.90 | 5.55 | 2.65 | 2.70 | 5.20 | 4.80 | 5.85 |
| anakin87/gemma-2b-orpo | 4.14 | 3.00 | 3.70 | 6.30 | 2.70 | 2.35 | 5.68 | 4.75 | 4.75 |
| Columbia-NLP/gemma-2b-zephyr-sft | 4.34 | 3.10 | 3.70 | 6.25 | 2.65 | 2.70 | 5.55 | 5.25 | 5.50 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 4.75 | 3.50 | 4.05 | 6.75 | 3.30 | 3.70 | 5.85 | 5.40 | 5.53 |