---
base_model:
- Nitral-Archive/Virtuoso-Lite-chatmlified-10B_r16-ep1
- Nitral-Archive/NightWing3-10B-v0.1
library_name: transformers
tags:
- mergekit
- merge
license: other
language:
- en
---
# Using NightWing3 in the mix seems to have been a mistake.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/0QE2gG0eheTSto_iO-RY0.png)

## Base model: (Falcon3-10B-deepseekv3-distill) [Virtuoso_Lite](https://huggingface.co/arcee-ai/Virtuoso-Lite)

# Quants: [IQ4 GGUF Here](https://huggingface.co/Nitrals-Quants/NightWing3_Virtuoso-10B-v0.2-IQ4_NL-GGUF) [4bpw exl2 Here](https://huggingface.co/Nitrals-Quants/NightWing3_Virtuoso-10B-v0.2-4bpw-exl2)

# ST Presets [Updated] [Here](https://huggingface.co/Nitral-AI/NightWing3_Virtuoso-10B-v0.2/tree/main/ST)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/Y4ltNcBlgTZkOSPhvdRNr.png)

## Prompt format: ChatML
```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
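
If you are building the prompt by hand rather than through a chat template, the layout above can be sketched as a small helper. This is a minimal illustration only: `build_chatml_prompt` is a hypothetical name, not something shipped with this repo, and it assumes a single system message followed by one user turn.

```python
def build_chatml_prompt(system_prompt: str, prompt: str) -> str:
    """Assemble a single-turn ChatML prompt (hypothetical helper).

    The string ends with the opening assistant tag so the model
    continues from there with its reply.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("You are a helpful assistant.", "Hello!"))
```

Generation should normally stop on `<|im_end|>`, so set that as a stop string if your frontend does not do it for you.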

### Models Merged:
* [Nitral-Archive/Virtuoso-Lite-chatmlified-10B_r16-ep1](https://huggingface.co/Nitral-Archive/Virtuoso-Lite-chatmlified-10B_r16-ep1)
* [Nitral-Archive/NightWing3-10B-v0.1](https://huggingface.co/Nitral-Archive/NightWing3-10B-v0.1)

### The following YAML configuration was used to produce this model:
```yaml
slices:
  - sources:
      - model: Nitral-Archive/Virtuoso-Lite-chatmlified-10B_r16-ep1
        layer_range: [0, 40]
      - model: Nitral-Archive/NightWing3-10B-v0.1
        layer_range: [0, 40]
merge_method: slerp
base_model: Nitral-Archive/Virtuoso-Lite-chatmlified-10B_r16-ep1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.420
dtype: bfloat16

```
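
To make the `t` lists in the config easier to read, here is a rough sketch of what a slerp merge with a per-layer gradient does. It is not mergekit's code: `slerp` and `t_for_layer` are hypothetical helpers, and the even spacing of the anchor values across the 40 layers is my assumption about how the list is expanded. What mergekit documents is that `t = 0` returns the base model (Virtuoso-Lite-chatmlified here) and `t = 1` returns the other model (NightWing3).

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        # A zero vector has no direction; fall back to plain lerp.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two directions, clamped for numerical safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: slerp degenerates to lerp.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def t_for_layer(anchors, layer, num_layers):
    """Linearly interpolate a list of anchor values across the layers.

    Assumption: anchors are spaced evenly from the first to the last layer.
    """
    if num_layers == 1:
        return anchors[0]
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


if __name__ == "__main__":
    self_attn = [0, 0.5, 0.3, 0.7, 1]
    mlp = [1, 0.5, 0.7, 0.3, 0]
    for layer in (0, 10, 20, 30, 39):
        print(layer, t_for_layer(self_attn, layer, 40), t_for_layer(mlp, layer, 40))
```

Read this way, the attention weights lean toward Virtuoso-Lite in the early layers and toward NightWing3 in the late ones, the MLP weights do the reverse, and every other tensor uses a flat `t` of 0.420.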
# Notes:
The goal of this merge was to make use of two of my Falcon3-10B-based models: NightWing3, which I trained earlier on the Falcon3-10B base, and Virtuoso-Lite-chatmlified-10B_r16-ep1, my more recent training run over Arcee's distillation of DeepSeekV3, which also uses Falcon3-10B as a base. Initially, I wasn't entirely satisfied with the results of either model on its own. In limited testing, however, this merged version appears to have smoothed out some of the rough edges present in the originals. Further evaluation is needed to fully assess its performance.