This cross-architecture distillation, with Phi?

#14
by sometimesanotion - opened

Even though this model is superseded by Virtuoso Small, it's an outstanding achievement. Can this be done with Phi-4, particularly if Phi is Llamafied? Can Phi-4 be given strong function-calling or code autocomplete capabilities through distillation? I would really like a complete alternative to Qwen in the 14B parameter space.

Arcee AI org

We need to create a Discord server to keep everyone updated on our progress. We have two open-source releases coming soon (both in that size range), distilled from much larger models. Also, the teacher’s name rhymes with “Keepseek.”

I will gladly join that Discord. Thank you for your team's efforts. It's amazing what mergekit and evolkit can do with home hardware!
