---
library_name: transformers
datasets:
- web_questions
metrics:
- perplexity
---

# Model Card for Geerath/google-gemma-7b-it-finetuned-web-questions

This model card describes a question-answering model fine-tuned from the 7B instruction-tuned variant of Gemma (`google/gemma-7b-it`).



## Model Details
This is a general question-answering model fine-tuned on the web_questions dataset.

### Model Description

This is a general question-answering LLM created by fine-tuning Gemma on the web_questions dataset.
Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English,
with open weights, pre-trained variants, and instruction-tuned variants. Gemma
models are well-suited for a variety of text generation tasks, including
question answering, summarization, and reasoning. Their relatively small size
makes it possible to deploy them in environments with limited resources such as
a laptop, desktop or your own cloud infrastructure, democratizing access to
state of the art AI models and helping foster innovation for everyone.

- **Developed by:** Geerath Bhat
- **Model type:** Fine-tuned Instruct LLM.
- **Language(s) (NLP):** English
- **License:** Not specified
- **Finetuned from model:** [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it)

### Usage

Google has shared code snippets showing how to quickly get started with running Gemma models. First make sure to `pip install -U transformers` (plus `accelerate` and `bitsandbytes` for the quantized loading below), then adapt the snippet below to your use case.

    from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
    import torch

    hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

    # Get the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)

    # 4-bit quantization config (example settings; adjust to your hardware)
    bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                    bnb_4bit_quant_type="nf4",
                                    bnb_4bit_compute_dtype=torch.bfloat16)

    # Load the model
    model = AutoModelForCausalLM.from_pretrained(hf_model_repo,
                                                 quantization_config=bnb_config,
                                                 device_map="auto")

    prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]

    # Generate a response
    input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
    outputs = model.generate(input_ids=input_ids,
                             max_new_tokens=200,
                             do_sample=True,
                             temperature=0.2)

    result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

    # Keep only the text starting at "Question:"
    result = "Question:" + result.split("Question:")[1]

    # Print the result
    print(f"Generated response:\n{result}")

#### Fine-tuning the model

You can find fine-tuning scripts and a notebook under the [`examples/` directory](https://huggingface.co/google/gemma-7b/tree/main/examples) of the [`google/gemma-7b`](https://huggingface.co/google/gemma-7b) repository. To adapt them, simply change the model id to `google/gemma-7b-it` (the base model this card was fine-tuned from). That repository provides:

* A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA (a minimal sketch of this setup follows the list)
* A script to perform SFT using FSDP on TPU devices
* A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset
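
As a rough illustration of the QLoRA pattern those scripts follow (not the exact code from that repository), a typical setup with 4-bit quantization and `peft` LoRA adapters looks like the sketch below; the specific LoRA hyperparameters and target modules are assumptions.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    base_model_id = "google/gemma-7b-it"

    # Load the base model in 4-bit; QLoRA keeps these weights quantized and frozen
    bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                    bnb_4bit_quant_type="nf4",
                                    bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(base_model_id,
                                                 quantization_config=bnb_config,
                                                 device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)

    # Attach small trainable LoRA adapters; only these are updated during SFT
    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                             target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                             task_type="CAUSAL_LM")
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()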


## How to Get Started with the Model

Use the code in the Usage section above (adapted from the [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it) card) to get started with this fine-tuned model.

## Training Details


### Training Data

The model was fine-tuned on the [web_questions](https://huggingface.co/datasets/web_questions) dataset (WebQuestions), a collection of natural-language questions paired with answers drawn from Freebase.
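
The card does not spell out the exact prompt template used during fine-tuning; the sketch below loads the dataset and formats each example with the same `Question: ... Answer:` pattern as the inference prompt above, which is an assumption.

    from datasets import load_dataset

    # web_questions has "train" and "test" splits with "url", "question",
    # and "answers" (a list of strings) per example
    dataset = load_dataset("web_questions")

    def to_prompt(example):
        # Assumed training format, mirroring the inference prompt shown earlier
        answer = ", ".join(example["answers"])
        example["text"] = f"Question: {example['question']}\n\nAnswer:\n{answer}"
        return example

    dataset = dataset.map(to_prompt)
    print(dataset["train"][0]["text"])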

### Training Procedure 

Trained with `trl`'s SFTTrainer using the following `TrainingArguments`:

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="gemma-7b-it-web-questions",  # assumed; not documented in the card
        num_train_epochs=1,                 # adjust based on the data size
        per_device_train_batch_size=4,      # use 2 or 4 if you have less GPU RAM
        per_device_eval_batch_size=4,
        optim="paged_adamw_32bit",
        # gradient_accumulation_steps=2,
        save_strategy="epoch",
        evaluation_strategy="epoch",
        learning_rate=2e-4,
        logging_steps=1,
        fp16=True,
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=42,
    )
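
These arguments would typically be wired into `trl`'s `SFTTrainer` roughly as sketched below; the `model`, `tokenizer`, `dataset`, and `lora_config` objects are the ones from the earlier sketches, and argument names such as `dataset_text_field` and `tokenizer` follow older `trl` releases (newer releases move them into `SFTConfig`).

    from trl import SFTTrainer

    # `model`, `tokenizer`, `dataset`, and `lora_config` come from the sketches above
    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
        peft_config=lora_config,
        dataset_text_field="text",
        max_seq_length=512,   # assumed; not documented in the card
        tokenizer=tokenizer,
    )

    trainer.train()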

## Evaluation

Evaluated on the test set of the web_questions dataset.

#### Testing Data

Currently the model has been tested only on the test set of the web_questions dataset; results on other datasets will be added later.

#### Metrics

* Perplexity (a sketch of how it can be computed follows this list)
* Accuracy
* F1 Score
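
Perplexity here is presumably the exponential of the average per-token cross-entropy loss on the test examples; a minimal sketch, reusing `model` and `tokenizer` from the usage section and the formatted `dataset` from the training-data sketch, is:

    import math
    import torch

    def perplexity(model, tokenizer, texts):
        # Perplexity = exp(mean per-token cross-entropy loss)
        total_loss, total_tokens = 0.0, 0
        model.eval()
        with torch.no_grad():
            for text in texts:
                enc = tokenizer(text, return_tensors="pt", truncation=True).to(model.device)
                out = model(**enc, labels=enc["input_ids"])
                n = enc["input_ids"].size(1) - 1   # labels are shifted, so seq_len - 1 tokens count
                total_loss += out.loss.item() * n
                total_tokens += n
        return math.exp(total_loss / total_tokens)

    print(perplexity(model, tokenizer, dataset["test"]["text"]))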

### Results

After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.

Perplexity on the web_questions test set: 5.13