monsterapi/zephyr-7b-alpha_metamathqa

	---
	library_name: transformers
	tags:
	- meta-math
	- code
	- instruct
	- Zephyr-7B-Alpha
	datasets:
	- meta-math/MetaMathQA
	base_model: HuggingFaceH4/zephyr-7b-alpha
	license: apache-2.0
	---

	### Finetuning Overview:

	- Model Used: HuggingFaceH4/zephyr-7b-alpha
	- Dataset: meta-math/MetaMathQA

	#### Dataset Insights:

	MetaMathQA is a newly created dataset designed to improve the mathematical reasoning of large language models (LLMs). It is built by bootstrapping mathematical questions and rewriting them from multiple perspectives, which gives LLMs a varied and challenging set of problems on which to develop their problem-solving skills.

	#### Finetuning Details:

	Using [MonsterAPI](https://monsterapi.ai)'s [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm), this finetuning:

	- Was conducted with efficiency and cost-effectiveness in mind.
	- Completed in 10.9 hours for 0.5 epochs on a single A6000 48GB GPU.
	- Cost `$22.01` for the entire finetuning process.
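
The figures above imply an hourly rate and, under a loudly stated assumption of linear scaling, a rough estimate for a full epoch. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and the full-epoch numbers are extrapolations rather than measured results.

```python
# Figures reported in this card.
total_cost_usd = 22.01
duration_hours = 10.9
epochs_trained = 0.5

# Implied hourly rate for the run.
hourly_rate_usd = total_cost_usd / duration_hours

# ASSUMPTION: time and cost scale linearly with the number of epochs.
full_epoch_hours = duration_hours / epochs_trained
full_epoch_cost_usd = total_cost_usd / epochs_trained

print(f"Implied rate: ${hourly_rate_usd:.2f} per hour")
print(f"Extrapolated full epoch: {full_epoch_hours:.1f} hours, ${full_epoch_cost_usd:.2f}")
```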

	#### Hyperparameters & Additional Details:

	- Epochs: 0.5
	- Total Finetuning Cost: $22.01
	- Model Path: HuggingFaceH4/zephyr-7b-alpha
	- Learning Rate: 0.0001
	- Data Split: 95% train 5% validation
	- Gradient Accumulation Steps: 4
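
To make the settings above concrete, here is a minimal sketch that collects them in one place and shows how a 95/5 split and gradient accumulation might work out. `FinetuneConfig` and `split_indices` are hypothetical helpers, not part of MonsterAPI's finetuner, and the per-device batch size is an assumed placeholder because the card does not state it.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    # Values reported in this card.
    base_model: str = "HuggingFaceH4/zephyr-7b-alpha"
    dataset: str = "meta-math/MetaMathQA"
    epochs: float = 0.5
    learning_rate: float = 1e-4
    gradient_accumulation_steps: int = 4
    train_fraction: float = 0.95
    # ASSUMPTION: not stated in the card; placeholder value only.
    per_device_batch_size: int = 1

    @property
    def effective_batch_size(self) -> int:
        # Examples contributing to each optimizer step on a single GPU.
        return self.per_device_batch_size * self.gradient_accumulation_steps


def split_indices(n_examples: int, train_fraction: float, seed: int = 0):
    """Shuffle example indices and split them into train and validation lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    cut = round(n_examples * train_fraction)
    return indices[:cut], indices[cut:]


config = FinetuneConfig()
train_idx, val_idx = split_indices(1000, config.train_fraction)
print(config.effective_batch_size, len(train_idx), len(val_idx))
```

With gradient accumulation, the optimizer updates once every 4 forward/backward passes, so the effective batch size is 4 times whatever per-device batch size was actually used.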

	---
	### Prompt Structure:

	```
	Below is an instruction that describes a task. Write a response that appropriately completes the request.


	###Instruction:[query]


	###Response:[response]
	```
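
A prompt in this format can be assembled with a small helper. The sketch below is illustrative: `build_prompt` is a hypothetical function name, and the exact whitespace (two blank lines between sections, no space after the colons) is copied from the template as displayed above, on the assumption that it matches the training data.

```python
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n\n"
    "###Instruction:{query}\n\n\n"
    "###Response:"
)


def build_prompt(query: str) -> str:
    """Fill the template with a query, leaving the response for the model."""
    return PROMPT_TEMPLATE.format(query=query)


print(build_prompt("What is 12 * 7?"))
```

The returned string ends at `###Response:` so that the model's generated text continues from there.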

	---
	### Training loss:
	![training loss](zephyr-mmqa-1.png "Training loss")



	---
	### Benchmark Results:

	![GSM8K Accuracy ](benchmark.png "GSM8K Accuracy")

	GSM8K is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. These problems take between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer. A bright middle school student should be able to solve every problem. It is a widely used industry benchmark for testing an LLM's multi-step mathematical reasoning.
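
As an illustration of how GSM8K accuracy is commonly scored, the sketch below compares a final numeric answer extracted from each model response with the reference answer. GSM8K reference solutions end with `#### <answer>`; taking the last number in the model's response as its answer is an assumption made here, and this is not necessarily the procedure used to produce the chart above. The helper names and toy data are hypothetical.

```python
import re

# Matches integers and decimals, allowing thousands separators such as 1,200.
NUMBER_PATTERN = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def reference_answer(solution: str) -> str:
    """Return the text after '####' in a GSM8K solution, without commas."""
    return solution.split("####")[-1].strip().replace(",", "")


def predicted_answer(response: str):
    """ASSUMPTION: treat the last number in the response as the model's answer."""
    matches = NUMBER_PATTERN.findall(response)
    return matches[-1].replace(",", "") if matches else None


def accuracy(responses, solutions) -> float:
    """Fraction of responses whose extracted answer matches the reference."""
    correct = sum(
        predicted_answer(r) == reference_answer(s)
        for r, s in zip(responses, solutions)
    )
    return correct / len(solutions)


# Toy data for illustration only; these are not benchmark results.
solutions = ["3 + 4 = 7 apples. #### 7", "12 * 100 = 1200 #### 1,200"]
responses = ["She has 3 + 4 = 7 apples. The answer is 7", "The total is 1,100."]
print(accuracy(responses, solutions))
```

String comparison keeps the sketch short but treats `7` and `7.0` as different answers; a stricter scorer would normalize numeric formats before comparing.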


