---
base_model:
- google/gemma-2-2b-it
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
license: gemma
language:
- en
- fi
- sv
---

This example uses the [European AI Act regulation text](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689) in three languages (English, Finnish, and Swedish) as training data.
The dataset comprises 9,175 data points for training and 2,456 for evaluation.

Required Python libraries:

```bash
pip install -U transformers
pip install torch torchvision torchaudio
pip install 'accelerate>=0.26.0'
```

The training arguments used are as follows:

```python
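from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

# Placeholder: the checkpoint directory used for the original run is not given in this card.
output_dir = "outputs"

training_args = TrainingArguments(
    per_device_train_batch_size=32,
    gradient_accumulation_steps=32,  # 32 x 32 = 1,024 examples per optimizer step, per device
    warmup_steps=20,
    max_steps=400,
    learning_rate=1.5e-5,
    # Use bf16 where the GPU supports it, otherwise fall back to fp16.
    fp16=not is_bfloat16_supported(),
    bf16=is_bfloat16_supported(),
    logging_steps=1,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    seed=3407,
    output_dir=output_dir,
    report_to="none",
    # Evaluate every 10 steps and keep the checkpoint with the lowest eval loss.
    eval_strategy="steps",
    eval_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    save_total_limit=2,
)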
```
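
To make these settings more concrete, the following sketch works out the effective batch size, the number of evaluation runs, and the approximate learning rate at a given step. It is an illustration only: `NUM_DEVICES = 1` and the one-sequence-per-example accounting are assumptions not stated in this card, and `lr_at` is a hypothetical helper that mirrors the usual linear-warmup plus half-cosine decay rather than calling the scheduler in `transformers`.

```python
import math

# Values copied from the TrainingArguments above.
PER_DEVICE_BATCH = 32
GRAD_ACCUM_STEPS = 32
WARMUP_STEPS = 20
MAX_STEPS = 400
PEAK_LR = 1.5e-5
EVAL_STEPS = 10
TRAIN_EXAMPLES = 9175  # training set size stated in this card

NUM_DEVICES = 1  # assumption: single GPU; the card does not say


def lr_at(step: int) -> float:
    """Approximate learning rate after `step` optimizer updates (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Half-cosine decay from the peak learning rate down to 0 at MAX_STEPS.
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
examples_seen = effective_batch * MAX_STEPS
approx_epochs = examples_seen / TRAIN_EXAMPLES
num_evals = MAX_STEPS // EVAL_STEPS

print(f"effective batch size: {effective_batch}")
print(f"examples processed:   {examples_seen}")
print(f"approximate epochs:   {approx_epochs:.1f}")
print(f"evaluation runs:      {num_evals}")
for step in (0, 10, 20, 210, 400):
    print(f"lr at step {step:>3}: {lr_at(step):.2e}")
```

Under these assumptions, each optimizer step covers 1,024 examples, so 400 steps amount to many passes over the 9,175 training data points. With more devices or with sequence packing, the numbers would differ.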

Inference uses the standard Transformers workflow for Gemma:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "mlconvexai/gemma-2-2b-it-finetuned-EU-Act-v2"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=dtype,
)

# Finnish for "What is the EU AI Act?"
chat = [
    {"role": "user", "content": "Mikä on EU:n tekoälyasetus?"},
]
# Render the chat into Gemma's prompt format as a plain string.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The rendered prompt already contains the special tokens, so do not add them again.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(
    input_ids=inputs.to(model.device),
    max_new_tokens=1024,
    repetition_penalty=1.1,
    no_repeat_ngram_size=4,
)
# Prints the prompt followed by the generated answer, including special tokens.
print(tokenizer.decode(outputs[0]))
```
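
The inference snippet passes `add_special_tokens=False` because the chat template already emits the beginning-of-sequence token. The sketch below illustrates this with a hand-written approximation of the Gemma prompt layout for a single user turn, plus a small helper for dropping the prompt tokens from a generated sequence. Both `build_gemma_prompt` and `new_token_ids` are hypothetical helpers written for this illustration; the tokenizer's own `apply_chat_template` remains the authoritative source of the format.

```python
def build_gemma_prompt(user_message: str) -> str:
    """Approximate the Gemma chat layout for one user turn (illustrative only)."""
    return (
        "<bos><start_of_turn>user\n"
        f"{user_message.strip()}<end_of_turn>\n"
        "<start_of_turn>model\n"  # generation prompt: the model answers from here
    )


def new_token_ids(output_ids, prompt_length):
    """Return only the ids generated after the prompt."""
    return output_ids[prompt_length:]


print(build_gemma_prompt("Mikä on EU:n tekoälyasetus?"))

# Toy ids: the first three stand in for the prompt, the rest for the answer.
print(new_token_ids([2, 106, 1645, 17, 42], prompt_length=3))
```

Applied to the inference snippet, the same slicing idea would be `tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)`, which prints the answer without the echoed prompt and control tokens.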

More detailed information about fine-tuning can be found on [Medium](https://medium.com/@timo.au.laine/eu-ai-act-fine-tune-multilingual-local-llm-2c0657cc47f8).

# Uploaded model

- **Developed by:** mlconvexai
- **License:** Gemma
- **Finetuned from model:** google/gemma-2-2b-it

This Gemma 2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)