---
library_name: transformers
tags:
- meta-math
- code
- instruct
- Zephyr-7B-Alpha
datasets:
- meta-math/MetaMathQA
base_model: HuggingFaceH4/zephyr-7b-alpha
license: apache-2.0
---
### Finetuning Overview:
**Model Used:** HuggingFaceH4/zephyr-7b-alpha
**Dataset:** meta-math/MetaMathQA
#### Dataset Insights:
The MetaMathQA dataset is designed to strengthen the mathematical reasoning capabilities of large language models (LLMs). It is built by bootstrapping mathematical questions and rewriting them from multiple perspectives, which gives LLMs varied and challenging material for developing and refining their mathematical problem-solving skills.
#### Finetuning Details:
Using [MonsterAPI](https://monsterapi.ai)'s [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm), this finetuning:
- Was conducted with efficiency and cost-effectiveness in mind.
- Completed in 10.9 hours for 0.5 epochs on an A6000 48GB GPU.
- Cost `$22.01` for the entire finetuning process.
#### Hyperparameters & Additional Details:
- **Epochs:** 0.5
- **Total Finetuning Cost:** $22.01
- **Model Path:** HuggingFaceH4/zephyr-7b-alpha
- **Learning Rate:** 0.0001
- **Data Split:** 95% train, 5% validation
- **Gradient Accumulation Steps:** 4
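
To make the figures above easier to reason about, here is a minimal Python sketch that derives a few quantities from them. The cost, duration, epoch count, split, and gradient accumulation values come from this card; `dataset_rows` and `per_device_batch_size` are hypothetical placeholders, since neither is stated here, so the resulting step counts are illustrative only.

```python
def summarize_run(total_cost_usd, hours, epochs, train_fraction,
                  grad_accum_steps, dataset_rows, per_device_batch_size):
    """Derive simple run statistics from the reported hyperparameters."""
    # Rows assigned to each side of the train/validation split.
    train_rows = round(dataset_rows * train_fraction)
    val_rows = dataset_rows - train_rows
    # A fractional epoch only visits part of the training split.
    examples_seen = round(train_rows * epochs)
    # Gradients are accumulated over several micro-batches per optimizer step.
    effective_batch = per_device_batch_size * grad_accum_steps
    return {
        "cost_per_hour_usd": total_cost_usd / hours,
        "train_rows": train_rows,
        "val_rows": val_rows,
        "examples_seen": examples_seen,
        "effective_batch_size": effective_batch,
        "optimizer_steps": examples_seen // effective_batch,
    }


stats = summarize_run(
    total_cost_usd=22.01,      # from this card
    hours=10.9,                # from this card
    epochs=0.5,                # from this card
    train_fraction=0.95,       # from this card
    grad_accum_steps=4,        # from this card
    dataset_rows=100_000,      # hypothetical, not stated in this card
    per_device_batch_size=1,   # hypothetical, not stated in this card
)
for name, value in stats.items():
    print(f"{name}: {value}")
```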
---
### Prompt Structure:
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
###Instruction:[query]
###Response:[response]
```
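The template above can be filled in programmatically. The following is a minimal Python sketch, assuming the line breaks are exactly as shown in the template (the exact whitespace used during finetuning is not confirmed here). The helper names `build_prompt` and `extract_response` are illustrative and not part of any library.

```python
# Prompt text copied from the template above; the prompt ends at the
# "###Response:" marker so that the model generates the answer after it.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "###Instruction:{query}\n"
    "###Response:"
)

RESPONSE_MARKER = "###Response:"


def build_prompt(query: str) -> str:
    """Insert the user's question into the instruction slot."""
    return PROMPT_TEMPLATE.format(query=query.strip())


def extract_response(generated: str) -> str:
    """Return the text after the last response marker.

    If the marker is missing, the whole string is returned, stripped.
    """
    _, _, tail = generated.rpartition(RESPONSE_MARKER)
    return tail.strip()


prompt = build_prompt("What is 12 * 7?")
print(prompt)
```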
---
### Training loss:

---
### Benchmark Results:

GSM8K is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. Each problem takes between 2 and 8 steps to solve, and solutions primarily involve a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer. A bright middle school student should be able to solve every problem. It is a widely used benchmark for testing an LLM's multi-step mathematical reasoning.
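
Because every GSM8K problem resolves to a single final number, answers are commonly scored by exact match on that number. The sketch below illustrates the idea in Python, relying on the GSM8K convention that reference solutions end with `#### <answer>`. It is not necessarily the evaluation procedure used for this model; the helper names and the "last number in the output" heuristic are assumptions made for illustration.

```python
import re

# Integers or decimals, optionally negative, with optional thousands commas.
_NUMBER = re.compile(r"(?<!\d)-?\d[\d,]*(?:\.\d+)?")


def last_number(text):
    """Return the last number found in `text` as a float, or None."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def reference_answer(solution):
    """Read the final answer that follows the '####' marker, if present."""
    _, marker, tail = solution.rpartition("####")
    return last_number(tail) if marker else None


def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose last number equals the reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    correct = 0
    for predicted_text, reference_text in zip(predictions, references):
        predicted = last_number(predicted_text)
        expected = reference_answer(reference_text)
        if predicted is not None and expected is not None:
            # Small tolerance guards against float formatting differences.
            correct += abs(predicted - expected) < 1e-6
    return correct / len(predictions)


# Toy inputs written for this sketch, not real model outputs.
toy_predictions = ["16 - 3 - 4 = 9, so she sells 9 eggs.", "The total is $1,250."]
toy_references = ["She sells 16 - 3 - 4 = 9 eggs.\n#### 9", "...\n#### 1250"]
print(exact_match_accuracy(toy_predictions, toy_references))
```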