|
---
language:
- en
- fr
- es
- pt
license: other
library_name: transformers
tags:
- mergekit
- merge
- falcon3
base_model:
- neopolita/jessi-v0.4-falcon3-7b-instruct
- tiiuae/Falcon3-7B-Instruct
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
model-index:
- name: Falcon3-Jessi-v0.4-7B-Slerp
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 76.76
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 37.29
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 34.59
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.28
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.49
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 34.0
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Falcon3-Jessi-v0.4-7B-Slerp
      name: Open LLM Leaderboard
---
|
# Merged Model |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
 |
|
|
|
At the time of writing, this model ranked #1 on the Open LLM Leaderboard among models with up to 8B parameters and #4 among models with up to 14B parameters.
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the SLERP (spherical linear interpolation) merge method.
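
SLERP interpolates along the great-circle arc between two weight vectors rather than along the straight line between them, which preserves the magnitude of the interpolated weights better than plain averaging. The sketch below is a minimal, self-contained illustration of the idea in Python; it is not mergekit's actual implementation, which additionally handles per-tensor interpolation schedules, dtypes, and edge cases.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns `a`, t=1 returns `b`; intermediate values follow the
    great-circle arc between the two flattened weight vectors.
    """
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the two direction vectors.
    dot = torch.clamp(
        (a_flat / (a_flat.norm() + eps)) @ (b_flat / (b_flat.norm() + eps)),
        -1.0, 1.0,
    )
    theta = torch.acos(dot)
    if theta < 1e-4:
        # Nearly parallel vectors: fall back to ordinary linear interpolation.
        mixed = (1.0 - t) * a_flat + t * b_flat
    else:
        sin_theta = torch.sin(theta)
        mixed = (torch.sin((1.0 - t) * theta) / sin_theta) * a_flat \
              + (torch.sin(t * theta) / sin_theta) * b_flat
    return mixed.reshape(a.shape).to(a.dtype)
```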
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [neopolita/jessi-v0.4-falcon3-7b-instruct](https://huggingface.co/neopolita/jessi-v0.4-falcon3-7b-instruct) |
|
* [tiiuae/Falcon3-7B-Instruct](https://huggingface.co/tiiuae/Falcon3-7B-Instruct) |
|
|
|
### Falcon3-7B-Instruct |
|
|
|
The **Falcon3** family of open foundation models is a set of pretrained and instruction-tuned LLMs ranging from 1B to 10B parameters.
|
|
|
**Falcon3-7B-Instruct**, one of the two models in this merge, achieved state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code, and mathematics tasks. It supports four languages (English, French, Spanish, and Portuguese) and a context length of up to 32K tokens.
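
For reference, here is a minimal `transformers` loading-and-generation sketch for the merged model; the prompt and generation settings are illustrative assumptions, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Falcon3-Jessi-v0.4-7B-Slerp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # follows the checkpoint dtype (bfloat16 per the merge config)
    device_map="auto",
)

# The Falcon3-Instruct base models ship a chat template, which the merge inherits.
messages = [{"role": "user", "content": "Summarize what a SLERP model merge does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```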
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
base_model: neopolita/jessi-v0.4-falcon3-7b-instruct
dtype: bfloat16
merge_method: slerp
parameters:
  t:
  - filter: self_attn
    value: [0.0, 0.5, 0.3, 0.7, 1.0]
  - filter: mlp
    value: [1.0, 0.5, 0.7, 0.3, 0.0]
  - value: 0.5
slices:
- sources:
  - layer_range: [0, 28]
    model: tiiuae/Falcon3-7B-Instruct
  - layer_range: [0, 28]
    model: neopolita/jessi-v0.4-falcon3-7b-instruct
```
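
Per the mergekit documentation, `t` is the interpolation factor: `t: 0` keeps the `base_model` weights (here jessi-v0.4) and `t: 1` keeps the other model's (Falcon3-7B-Instruct). A list of values is spread as a gradient across the 28 layers, so self-attention tensors shift from jessi-dominated in early layers toward Falcon3-dominated in late layers, the MLP tensors follow the mirrored schedule, and the bare `value: 0.5` is the default for all remaining tensors. A merge like this can be reproduced by saving the config and running mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output-model-directory`.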
|
|
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/suayptalha__Falcon3-Jessi-v0.4-7B-Slerp-details).
|
|
|
| Metric              | Value |
|---------------------|------:|
| Avg.                | 35.23 |
| IFEval (0-Shot)     | 76.76 |
| BBH (3-Shot)        | 37.29 |
| MATH Lvl 5 (4-Shot) | 34.59 |
| GPQA (0-shot)       |  8.28 |
| MuSR (0-shot)       | 20.49 |
| MMLU-PRO (5-shot)   | 34.00 |
|
|
|
<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a> |