Llama-3.1-8B-AthenaSky-MegaMix / README.md

Adding Evaluation Results (#1)

7abe5e4 verified 14 days ago

8.64 kB

	---
	language:
	- en
	license: apache-2.0
	library_name: transformers
	tags:
	- merge
	- mergekit
	- lazymergekit
	- model_stock
	- ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
	base_model:
	- Pedro13543/mega_blend_model
	- Skywork/Skywork-o1-Open-Llama-3.1-8B
	- Undi95/Meta-Llama-3.1-8B-Claude
	- mergekit-community/good_mix_model_Stock
	- mergekit-community/L3.1-Athena-d-8B
	pipeline_tag: text-generation
	model-index:
	- name: Llama-3.1-8B-AthenaSky-MegaMix
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: IFEval (0-Shot)
	type: HuggingFaceH4/ifeval
	args:
	num_few_shot: 0
	metrics:
	- type: inst_level_strict_acc and prompt_level_strict_acc
	value: 63.01
	name: strict accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BBH (3-Shot)
	type: BBH
	args:
	num_few_shot: 3
	metrics:
	- type: acc_norm
	value: 31.39
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MATH Lvl 5 (4-Shot)
	type: hendrycks/competition_math
	args:
	num_few_shot: 4
	metrics:
	- type: exact_match
	value: 27.95
	name: exact match
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GPQA (0-shot)
	type: Idavidrein/gpqa
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 3.69
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MuSR (0-shot)
	type: TAUR-Lab/MuSR
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 6.9
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU-PRO (5-shot)
	type: TIGER-Lab/MMLU-Pro
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 27.82
	name: accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	name: Open LLM Leaderboard
	---
	# ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

	## Overview
	ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is a powerful AI model built through model stock merging using MergeKit. It brings together some of the best models available on Hugging Face, ensuring strong performance in a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction-following.

	![Model Fusion](https://huggingface.co/front/assets/huggingface_logo-noborder.svg)

	This model was created by merging high-quality foundational and fine-tuned models to create an optimized blended architecture that retains the strengths of each contributing model.

	## Merge Details
	- Merge Method: `model_stock`
	- Base Model: [`mergekit-community/L3.1-Athena-d-8B`](https://huggingface.co/mergekit-community/L3.1-Athena-d-8B)
	- Dtype: `bfloat16`
	- Tokenizer Source: `mergekit-community/L3.1-Athena-d-8B`

	## Models Merged
	The following models contributed to this fusion:

	- [`Pedro13543/mega_blend_model`](https://huggingface.co/Pedro13543/mega_blend_model) - A well-balanced blend of roleplay and instruction-tuned Llama-3.1 variants.
	- [`Skywork/Skywork-o1-Open-Llama-3.1-8B`](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B) - Optimized for reasoning and slow-thinking capabilities.
	- [`Undi95/Meta-Llama-3.1-8B-Claude`](https://huggingface.co/Undi95/Meta-Llama-3.1-8B-Claude) - Fine-tuned on Claude Opus/Sonnet data, improving response depth and conversational engagement.
	- [`mergekit-community/good_mix_model_Stock`](https://huggingface.co/mergekit-community/good_mix_model_Stock) - A diverse mixture including RP-focused and knowledge-heavy datasets.

	## Configuration
	```yaml
	name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
	base_model: mergekit-community/L3.1-Athena-d-8B
	dtype: bfloat16
	merge_method: model_stock
	models:
	- model: Pedro13543/mega_blend_model
	- model: Skywork/Skywork-o1-Open-Llama-3.1-8B
	- model: Undi95/Meta-Llama-3.1-8B-Claude
	- model: mergekit-community/good_mix_model_Stock
	tokenizer_source: mergekit-community/L3.1-Athena-d-8B
	```

	## Features & Improvements
	🔹 Advanced Reasoning & Thoughtfulness - Thanks to `Skywork-o1` integration, this model excels in logical thinking and problem-solving.

	🔹 Enhanced Conversational Depth - The inclusion of `Meta-Llama-3.1-8B-Claude` adds better response structuring, making it more engaging in dialogue.

	🔹 Versatile Roleplay & Creativity - Leveraging `mega_blend_model` and `good_mix_model_Stock`, the model supports immersive roleplaying and storytelling.

	🔹 Strong Instruction Following - Trained on various instruction datasets to provide clear, informative, and helpful responses.

	## Use Cases
	- Chat & Roleplay - Supports natural, engaging, and dynamic conversational flow.
	- Programming & Code Generation - Provides reliable code completions and debugging suggestions.
	- Creative Writing - Generates compelling stories, character dialogues, and immersive text.
	- Educational Assistance - Helps explain complex topics and answer academic questions.
	- Logic & Problem-Solving - Can handle reasoning-based and structured thought processes.


	## 🛠 How to Use

	### 🔥 Ollama (Quick Inference)

	You can run the model using Ollama for direct testing:

	```bash
	ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
	```

	### 🤗 Hugging Face Transformers (Python)

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
	import torch

	model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"

	# Load tokenizer & model
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	torch_dtype=torch.bfloat16,
	device_map="auto"
	)

	# Initialize text generation pipeline
	text_generator = pipeline(
	"text-generation",
	model=model,
	tokenizer=tokenizer,
	torch_dtype=torch.bfloat16,
	device_map="auto"
	)

	# Example prompt
	prompt = "Describe the significance of AI ethics in modern technology."

	# Generate output
	outputs = text_generator(
	prompt,
	max_new_tokens=200,
	do_sample=True,
	temperature=0.7,
	top_k=50,
	top_p=0.95
	)

	print(outputs[0]["generated_text"])
	```

	---

	## Model Alignment & Ethics
	⚠️ Uncensored Use: This model does not apply strict moderation. Users should implement appropriate safety filters before deployment.

	⚠️ Responsibility Notice: You are responsible for the outputs generated by this model. It is recommended to apply ethical safeguards and content moderation when integrating this model into applications.

	📜 License: Governed by the Meta Llama 3.1 Community License Agreement.

	## Feedback & Contributions
	We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.

	---
	ZeroXClem Team \| 2025 ![ZXC](https://cdn-avatars.huggingface.co/v1/production/uploads/64408cd43e0374802e19f454/nOnDGGBF0p-AwkCGw0IZh.png)
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/ZeroXClem__Llama-3.1-8B-AthenaSky-MegaMix-details)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|26.79\|
	\|IFEval (0-Shot) \|63.01\|
	\|BBH (3-Shot) \|31.39\|
	\|MATH Lvl 5 (4-Shot)\|27.95\|
	\|GPQA (0-shot) \| 3.69\|
	\|MuSR (0-shot) \| 6.90\|
	\|MMLU-PRO (5-shot) \|27.82\|