---
base_model:
- meta-llama/Llama-3.3-70B-Instruct
tags:
- state-of-the-art
- reasoning
- chain-of-thought
- text-generation
- transformers
- llama
- instruction-tuning
license: apache-2.0
language:
- en
datasets:
- Daemontatox/Deepthinking-COT
- gghfez/QwQ-LongCoT-130K-cleaned
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Llama3.3-70B-CogniLink
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: wis-k/instruction-following-eval
split: train
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 69.31
name: averaged accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: SaylorTwift/bbh
split: test
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 52.12
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: lighteval/MATH-Hard
split: test
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 39.58
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
split: train
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 26.06
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 21.4
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 46.37
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
name: Open LLM Leaderboard
---
![image](./image.webp)
# Model Card: CogniLink - Redefining Reasoning AI
## Overview
CogniLink is a **state-of-the-art (SOTA) reasoning model**, engineered to set new benchmarks in logical problem-solving and chain-of-thought capabilities. Leveraging the power of Llama 3.3 70B, CogniLink excels in multi-step reasoning, inference, and real-time decision-making across diverse domains. Whether tackling mathematical proofs, legal analyses, or dynamic real-world scenarios, CogniLink aims for clarity, precision, and scalability.
Designed for both **high-performance tasks** and **resource-efficient environments**, CogniLink represents the perfect fusion of innovation and practicality.
---
## Key Features
- **Base Model:** [unsloth/llama-3.3-70b-instruct-bnb-4bit](https://huggingface.co/unsloth/llama-3.3-70b-instruct-bnb-4bit)
- **Developed By:** Daemontatox
- **License:** Apache 2.0 (open and permissive)
- **Primary Language:** English
- **Specialization:** Multi-domain reasoning, step-by-step logic, and advanced inference.
**CogniLink is optimized for tasks requiring:**
- **Reasoning Depth:** Multi-step logic with exceptional accuracy.
- **Chain-of-Thought (CoT):** Built-in mechanisms to generate clear, stepwise reasoning paths (see the prompt sketch after this list).
- **Resource Efficiency:** Ideal for deployment on both high-performance servers and resource-constrained devices, including edge computing platforms.
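The exact prompt format used during fine-tuning is not documented in this card; as a rough sketch, the standard Llama 3.3 chat template with a system message requesting stepwise reasoning is a reasonable starting point. The system-message wording below is an assumption, not an official CogniLink template.

```python
from transformers import AutoTokenizer

# Hypothetical prompt construction: the system message wording is an assumption,
# not an officially documented CogniLink prompt format.
tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Llama3.3-70B-CogniLink")

messages = [
    {"role": "system", "content": "Reason step by step before giving a final answer."},
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]

# Render the conversation with the model's chat template into a single prompt string.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```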
---
## Training and Optimization
CogniLink’s fine-tuning was accelerated using **[Unsloth](https://github.com/unslothai/unsloth)**, enabling a **2x faster training pipeline**. The training process was powered by Hugging Face's **TRL library**, ensuring seamless instruction tuning and robust adaptability across reasoning-heavy applications.
With advanced techniques like **quantization-aware training** and parameter-efficient fine-tuning, CogniLink is lightweight without compromising on performance, making it a top choice for edge deployment and embedded systems.
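The full training recipe is not published here; the snippet below is a minimal sketch of what an Unsloth + TRL instruction-tuning run of this kind typically looks like. The LoRA rank, sequence length, batch settings, and the assumption of a single `text` column in the dataset are illustrative, not CogniLink's actual configuration.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Illustrative sketch only: hyperparameters are assumptions, not the actual CogniLink recipe.
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Llama-3.3-70B-Instruct-bnb-4bit",  # 4-bit base for QLoRA-style tuning
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# Assumes the dataset exposes a single "text" column; real preprocessing is omitted.
dataset = load_dataset("Daemontatox/Deepthinking-COT", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        output_dir="cognilink-sft",
    ),
)
trainer.train()
```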
Special thanks to **[Modal.com](https://modal.com)** for providing **H100 GPUs**, which enabled accelerated training and optimized performance for CogniLink. Their generous support significantly contributed to the model’s development and deployment readiness.
---
## Applications
CogniLink is versatile and excels in various industries:
### **1. Education and Training**
- Powers AI tutors for **step-by-step problem-solving** in STEM education.
- Supports interactive learning tools with detailed explanations.
### **2. Research and Academia**
- Assists researchers with **hypothesis testing**, complex analysis, and paper drafting.
- Enhances productivity in tasks requiring deep logical reasoning.
### **3. Business Decision Support**
- Real-time **scenario analysis** for strategic decision-making.
- Risk assessment tools for dynamic business environments.
### **4. Legal and Policy Analysis**
- Enables multi-step reasoning for **case law interpretations** and **regulatory reviews**.
- Assists legal professionals with clear and logical argument generation.
### **5. Healthcare AI**
- Supports diagnostics and medical workflows with robust reasoning models.
- Ensures accuracy in multi-step inferential tasks like patient case reviews.
---
## Technical Specifications
- **Quantization:** Fully compatible with 4-bit inference for efficient performance.
- **Latency:** Optimized for real-time responses in latency-sensitive applications.
- **Scalability:** Deployable on diverse hardware setups, from high-end GPUs to edge devices.
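For 4-bit inference, a minimal sketch with `transformers` and `bitsandbytes` is shown below; the NF4 settings are common defaults, not an official quantization recipe for this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the 70B model in 4-bit NF4 to reduce GPU memory requirements.
# These quantization settings are illustrative defaults, not an official recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Llama3.3-70B-CogniLink",
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across available GPUs
)
tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Llama3.3-70B-CogniLink")
```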
---
## Why Choose CogniLink?
CogniLink isn’t just a model; it’s a **reasoning companion**. Its fine-tuned chain-of-thought design ensures not just answers, but **rational, explainable processes**, giving users the confidence and insights they need to make critical decisions.
- **Transparent Reasoning:** Every decision is backed by a logical thought process.
- **Versatile Applications:** From academia to business, CogniLink adapts effortlessly.
- **Cutting-Edge Efficiency:** High performance meets cost-effectiveness.
---
## Get Started
CogniLink is available for download and deployment. Start integrating advanced reasoning into your applications today!
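The model can be used like any other `transformers` causal LM. The snippet below is a minimal generation example; the prompt and generation parameters are illustrative, not tuned values.

```python
from transformers import pipeline

# Minimal generation example; parameters are illustrative, not tuned values.
generator = pipeline(
    "text-generation",
    model="Daemontatox/Llama3.3-70B-CogniLink",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain, step by step, why the sum of two odd numbers is even."},
]
output = generator(messages, max_new_tokens=512)
print(output[0]["generated_text"][-1]["content"])  # assistant reply
```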
For inquiries, contributions, or support, visit **[Unsloth GitHub](https://github.com/unslothai/unsloth)**.
**CogniLink: Connecting Intelligence with Clarity.**
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__Llama3.3-70B-CogniLink-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FLlama3.3-70B-CogniLink&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
| Metric |Value (%)|
|-------------------|--------:|
|**Average** | 42.47|
|IFEval (0-Shot) | 69.31|
|BBH (3-Shot) | 52.12|
|MATH Lvl 5 (4-Shot)| 39.58|
|GPQA (0-shot) | 26.06|
|MuSR (0-shot) | 21.40|
|MMLU-PRO (5-shot) | 46.37|