---
language:
- multilingual
license: gemma
library_name: transformers
tags:
- nlp
- code
base_model: google/gemma-2-2b-jpn-it
datasets:
- mlabonne/orpo-dpo-mix-40k
license_link: https://ai.google.dev/gemma/terms
pipeline_tag: text-generation
quantized_by: ymcki
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
model-index:
- name: gemma-2-2b-jpn-it-abliterated-17-ORPO
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 49.48
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 14.92
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 2.87
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.24
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.67
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 13.18
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO
      name: Open LLM Leaderboard
---

Original model: https://huggingface.co/google/gemma-2-2b-jpn-it

## Prompt format

```
<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model

```

Note that this model does not support a system prompt.
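
As a minimal sketch of the format above, the prompt for a single user turn can be assembled by hand. The helper name `build_gemma_prompt` is hypothetical; in practice `tokenizer.apply_chat_template` does this for you and may also prepend a beginning-of-sequence token, which this sketch leaves out.

```python
def build_gemma_prompt(user_message: str) -> str:
    """Wrap one user message in Gemma turn markers and open the model turn.

    Hypothetical helper for illustration only. There is no system role,
    so the prompt consists of a user turn followed by the model turn opener.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


print(build_gemma_prompt("Write a hello world program"))
```

Anything you would normally put in a system prompt has to go at the start of the user message instead.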

This is an abliterated version of [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it), created using the [method](https://medium.com/@mlabonne/uncensor-any-llm-with-abliteration-d30148b7d43e) described by mlabonne.

Layer 17 of the original model was chosen for abliteration. I also created another abliterated model based on layer 18 for comparison.
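
For readers unfamiliar with the technique, the core operation in the linked article is weight orthogonalization: a "refusal direction" is estimated from activations at a chosen layer and then projected out of the weight matrices. The toy sketch below shows only that projection step on a small list-of-lists matrix. It is not the code used to produce this model; the function name, the example matrix, and the direction are all made up for illustration.

```python
import math


def orthogonalize(weight, direction):
    """Return W - r r^T W, where r is `direction` scaled to unit length.

    `weight` is a rows x cols matrix (list of lists) whose columns live in
    the same space as `direction`. After the projection, no column has a
    component along `direction`.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    r = [x / norm for x in direction]
    rows, cols = len(weight), len(weight[0])
    result = [row[:] for row in weight]
    for j in range(cols):
        # Component of column j along the unit direction r.
        dot = sum(r[i] * weight[i][j] for i in range(rows))
        for i in range(rows):
            result[i][j] -= r[i] * dot
    return result


# Toy example (assumed values): remove the first axis from a 2x2 matrix.
toy_weight = [[1.0, 2.0], [3.0, 4.0]]
toy_direction = [1.0, 0.0]
print(orthogonalize(toy_weight, toy_direction))
```

A real implementation would operate on the model's tensors (for example with PyTorch) rather than Python lists.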

ORPO fine-tuning was performed for four epochs.

| Epoch | loss | eval_loss |
| ----- | ---- | --------- |
| 1 | 1.20152769684791564 | 1.0501047372817993 |
| 2 | 1.25755584239959716 | 1.0144596099853516 |
| 3 | 0.93099724054336543 | 0.9957754611968994 |
| 4 | 0.88664623498916623 | 0.9857067465782166 |

The fine-tuned model is uploaded here for evaluation on the Open LLM Leaderboard, to see whether the slightly brain-damaged non-ORPO model can be healed. The fine-tuning method is again based on the one described by [mlabonne](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada), but the input model was loaded into VRAM with [unsloth](https://github.com/unslothai/unsloth) so that the full 40k dataset could be used on a single 3090.

## Benchmark (100.0*raw scores only)

Click on a model name to go to the raw score JSON generated by the Open LLM Leaderboard.

| Model | Average | IFEval | BBH | Math Lv5 | GPQA | MUSR | MMLU-PRO |
| ----- | ------- | ------ | --- | -------- | ---- | ---- | -------- |
| [gemma-2-2b-jpn-it](https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/google/gemma-2-2b-jpn-it/results_2024-10-15T15-21-39.173019.json) | 30.82 | 54.11 | 41.43 | 0.0 | 27.52 | 37.17 | 24.67 |
| [gemma-2-2b-jpn-it-abliterated-17-ORPO](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-20T02-46-59.069357.json) | 29.99 | 50.94 | 38.59 | 2.87 | 27.43 | 38.23 | 21.86 |
| [gemma-2-2b-jpn-it-abliterated-17](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17/results_2024-10-18T15-18-46.821674.json) | 30.29 | 52.65 | 40.46 | 0.0 | 27.18 | 36.90 | 24.55 |
| [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18/results_2024-10-18T15-41-42.399571.json) | 30.61 | 53.02 | 40.96 | 0.0 | 27.35 | 37.30 | 25.05 |
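
The ORPO model's average (29.99) is still below both the original model (30.82) and the non-ORPO abliterated-17 model (30.29), so this amount of fine-tuning is probably not enough. More epochs may be needed.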
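
The Average column in the table above matches the plain mean of the six benchmark scores, rounded to two decimals. The short check below recomputes it from the table values; the dictionary layout and helper name are just for illustration.

```python
# Scores copied from the benchmark table, in the order:
# IFEval, BBH, Math Lv5, GPQA, MUSR, MMLU-PRO.
scores = {
    "gemma-2-2b-jpn-it": [54.11, 41.43, 0.0, 27.52, 37.17, 24.67],
    "gemma-2-2b-jpn-it-abliterated-17-ORPO": [50.94, 38.59, 2.87, 27.43, 38.23, 21.86],
    "gemma-2-2b-jpn-it-abliterated-17": [52.65, 40.46, 0.0, 27.18, 36.90, 24.55],
    "gemma-2-2b-jpn-it-abliterated-18": [53.02, 40.96, 0.0, 27.35, 37.30, 25.05],
}


def average(values):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {name: average(values) for name, values in scores.items()}

# Print models from highest to lowest average.
for name, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {avg:.2f}")
```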
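
## How to run this model

```py
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO"
dtype = torch.bfloat16

# Load the tokenizer and the model weights onto the GPU in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

chat = [
    {"role": "user", "content": "Write a hello world program"},
]
# Render the conversation with the model's chat template and open the model turn.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# The rendered prompt already contains the special tokens, so do not add them again.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
print(tokenizer.decode(outputs[0]))
```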

## Downloading using huggingface-cli

First, make sure you have huggingface-cli installed:

```
pip install -U "huggingface_hub[cli]"
```

Then, download the model files (narrow the `--include` pattern if you only want specific files):

```
huggingface-cli download ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO --include "*" --local-dir ./
```
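
The same download can also be done from Python with `snapshot_download` from the `huggingface_hub` package. This is a sketch of an equivalent call rather than part of the original instructions; the wrapper name `download_model` is hypothetical, and `allow_patterns` plays the role of `--include`.

```python
REPO_ID = "ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO"


def download_model(local_dir="./", patterns=("*",)):
    """Download files matching `patterns` from the model repo into `local_dir`.

    Hypothetical wrapper. huggingface_hub is imported lazily so that
    defining this helper does not require the package to be installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=list(patterns),
        local_dir=local_dir,
    )


# Usage (needs network access and the huggingface_hub package):
# path = download_model("./")
```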

## Credits

Thank you mlabonne for describing his fine-tuning method.

Thanks FullOf_Bad_Ideas from LocalLlama for the suggestion of using unsloth to save VRAM.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ymcki__gemma-2-2b-jpn-it-abliterated-17-ORPO)

| Metric            |Value|
|-------------------|----:|
|Avg.               |14.89|
|IFEval (0-Shot)    |49.48|
|BBH (3-Shot)       |14.92|
|MATH Lvl 5 (4-Shot)| 2.87|
|GPQA (0-shot)      | 3.24|
|MuSR (0-shot)      | 5.67|
|MMLU-PRO (5-shot)  |13.18|