niro

niro is an improvement over the excellent WizardLM-Evol-V2-Unfiltered model, which at the time of writing is the best 1.8-billion-parameter mistral model. Keep in mind that niro is an untrained merge; further improvements are yet to come.
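
a minimal usage sketch, assuming the standard transformers causal-LM API (the model id is this repo, appvoid/niro-preview-2409; loaded in fp16 to match the shipped weights):

```python
# minimal usage sketch; assumes the standard transformers causal-LM API
# and fp16 weights
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "appvoid/niro-preview-2409"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "The story begins with"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```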

benchmarks

zero-shot evaluations performed on current sota small models; mmlu is still the reason qwen models score better on average. Currently, niro is on par with the best language models below 2b parameters.

| Parameters | Model | MMLU | ARC | HellaSwag | PIQA | Winogrande | Average |
|------------|-------|------|-----|-----------|------|------------|---------|
| 0.5b | qwen 2.5 | 47.29 | 31.83 | 52.17 | 70.29 | 57.06 | 51.72 |
| 0.5b | arco | 26.17 | 37.29 | 62.88 | 74.37 | 62.27 | 52.60 |
| 0.5b | arco (exp) | 25.51 | 38.82 | 63.02 | 74.70 | 61.25 | 52.66 |
| 1.7b | smollm | 27.65 | 46.26 | 65.74 | 76.06 | 60.93 | 55.33 |
| 1.8b | niro-preview | 41.75 | 40.96 | 72.07 | 77.97 | 65.51 | 59.65 |
| 1.5b | qwen 2.5 | 58.68 | 44.71 | 67.62 | 75.73 | 62.67 | 61.88 |
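
a hedged reproduction sketch: the card does not say which evaluation tool produced these numbers, so this assumes EleutherAI's lm-evaluation-harness (`pip install lm-eval`, 0.4.x task names) and interprets ARC as ARC-Challenge.

```python
# reproduction sketch, not the authors' exact setup: assumes lm-evaluation-harness
# and interprets the ARC column as arc_challenge
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=appvoid/niro-preview-2409,dtype=float16",
    tasks=["mmlu", "arc_challenge", "hellaswag", "piqa", "winogrande"],
    num_fewshot=0,  # zero-shot, matching the table above
)

for task, metrics in results["results"].items():
    print(task, metrics)
```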