metadata

license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-base-openbsd-faq
    results: []

flan-t5-base-openbsd-faq

This model is a version of google/flan-t5-base fine-tuned on the ajsbsd/openbsd-faq dataset.

The questions are taken from https://www.openbsd.org/faq/faq1.html and are intended for use on ajsbsd.net.

It achieves the following results on the evaluation set:

Loss: 2.2385
Rouge1: 0.3935
Rouge2: 0.3383
Rougel: 0.3906
Rougelsum: 0.3844
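To give a feel for what a Rouge1 score measures, here is a minimal sketch of a unigram-overlap F-measure. It is a simplified stand-in, not the `rouge_score` implementation used for the numbers above: it splits on whitespace and skips stemming, and the function name `rouge1_f` and the example sentences are made up for illustration.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap, no stemming (illustrative only)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from the dataset.
score = rouge1_f(
    "openbsd is a free system",
    "openbsd is a free unix-like operating system",
)
```

A prediction that reproduces only part of the reference keeps high precision but loses recall, which is one way a model can land at scores around 0.4.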

Model description

This model is a fine-tuned version of google/flan-t5-base

Intended uses & limitations

The model is intended for use as an OpenBSD Q/A chatbot.
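A minimal inference sketch follows. It assumes the model is published on the Hub as `ajsbsd/flan-t5-base-openbsd-faq` (the repo id is a guess based on the model name and dataset owner; adjust as needed) and reuses the training prefix so that prompts match what the model saw during fine-tuning. The helper names `build_prompt` and `answer` are hypothetical.

```python
PREFIX = "Please answer this question: "  # same prefix as in the training code
MODEL_ID = "ajsbsd/flan-t5-base-openbsd-faq"  # assumed Hub repo id, not confirmed


def build_prompt(question: str) -> str:
    """Prepend the training prefix to a user question."""
    return PREFIX + question.strip()


def answer(question: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Generate an answer; downloads the model on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    # Truncate inputs to 128 tokens, matching the preprocessing used in training.
    inputs = tokenizer(
        build_prompt(question), return_tensors="pt", max_length=128, truncation=True
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires network access and the transformers package):
# print(answer("What is OpenBSD?"))
```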

Training and evaluation data

The dataset consists of question/answer pairs created from https://www.openbsd.org/faq/faq1.html, formatted for text2text generation.

Training procedure

Trained on Google Colab with the following code.

!pip install -q transformers[torch] tokenizers datasets evaluate rouge_score sentencepiece huggingface_hub --upgrade

from huggingface_hub import notebook_login
notebook_login()

import nltk
from datasets import load_dataset
import evaluate
import numpy as np
from transformers import T5Tokenizer, DataCollatorForSeq2Seq
from transformers import T5ForConditionalGeneration, Seq2SeqTrainingArguments, Seq2SeqTrainer

Load and split the dataset

dataset = load_dataset("ajsbsd/openbsd-faq")
dataset = dataset["train"].train_test_split(test_size=0.2)
#dataset = load_dataset("csv", data_files="./JEOPARDY_CSV.csv")
#dataset = dataset["train"].train_test_split(test_size=0.2)

Load the tokenizer, model, and data collator

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)

We prefix each question with "Please answer this question: "

prefix = "Please answer this question: "

Define our preprocessing function

def preprocess_function(examples):
    """Add prefix to the questions, tokenize the text, and set the labels"""
    # The "inputs" are the prefixed, tokenized questions:
    inputs = [prefix + doc for doc in examples["question"]]
    model_inputs = tokenizer(inputs, max_length=128, truncation=True)

    # The "labels" are the tokenized answers:
    labels = tokenizer(text_target=examples["answer"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

Map the preprocessing function across our dataset

tokenized_dataset = dataset.map(preprocess_function, batched=True)

Set up Rouge score for evaluation

nltk.download("punkt", quiet=True)
metric = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds

    # decode preds and labels
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # rougeLSum expects newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]

    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    return result
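The `np.where` call in `compute_metrics` exists because the data collator pads label sequences with -100 so that the loss ignores those positions, and -100 is not a valid token id for decoding. The sketch below shows the replacement on a toy array. The token ids are made up, and the pad id of 0 is hard-coded as an assumption (it is T5's usual pad token id) instead of being read from a tokenizer.

```python
import numpy as np

PAD_TOKEN_ID = 0  # assumed: stands in for tokenizer.pad_token_id (0 for T5)

# Toy label batch: -100 marks padded positions that the loss ignores.
labels = np.array([
    [37, 1782, 1, -100, -100],
    [363, 19, 8, 1, -100],
])

# Swap every -100 for the pad id so batch_decode only sees valid token ids.
cleaned = np.where(labels != -100, labels, PAD_TOKEN_ID)
```

Decoding with `skip_special_tokens=True` then drops the pad tokens, so the padding does not leak into the reference text used for ROUGE.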

Set up training arguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./flan-t5-base-openbsd-faq",
    evaluation_strategy="epoch",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=5,
    predict_with_generate=True,
    push_to_hub=False
)

Set up trainer

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

Train the model

trainer.train()

trainer.push_to_hub()

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5
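As a rough illustration of what `lr_scheduler_type: linear` means here, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup is listed among the hyperparameters) and 45 total steps, as reported in the training results; the function `linear_lr` is a hypothetical helper, not part of Transformers.

```python
def linear_lr(step: int, base_lr: float = 3e-4, total_steps: int = 45,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero (illustrative sketch)."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (9 optimizer steps per epoch).
per_epoch = [linear_lr(9 * epoch) for epoch in range(6)]
```

Under these assumptions the rate starts at 3e-4 and drops by one fifth of that value per epoch, reaching zero after step 45.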

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
No log	1.0	9	2.2184	0.3985	0.3308	0.3878	0.3902
No log	2.0	18	2.2060	0.4044	0.3231	0.3959	0.3937
No log	3.0	27	2.2271	0.4063	0.3315	0.4006	0.3971
No log	4.0	36	2.2251	0.4069	0.3366	0.4001	0.3937
No log	5.0	45	2.2385	0.3935	0.3383	0.3906	0.3844
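The step counts in the table allow a back-of-the-envelope estimate of the training set size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept, so that steps per epoch equals ceil(n / batch_size); `train_size_bounds` is a hypothetical helper name.

```python
import math


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# 9 steps per epoch at batch size 8, taken from the table and hyperparameters.
low, high = train_size_bounds(9, 8)

# Sanity check of the bounds against the ceiling formula.
assert math.ceil(low / 8) == 9 and math.ceil(high / 8) == 9
```

Under these assumptions the training split holds between 65 and 72 examples, which is small enough that the ROUGE figures should be read as indicative only. The "No log" entries are likely because the run's 45 total steps never reach the Trainer's default logging interval of 500 steps.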

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu118
Datasets 2.14.7
Tokenizers 0.15.0